Python and pip supply chain issues (1)
An attacker identifies requests, a highly popular Python library, as a target. They create and publish a malicious package to the public Python Package Index (PyPI) named reqeusts—a common misspelling.
This malicious reqeusts package has a setup.py file containing a post-install script. When a developer or a CI/CD system runs pip install reqeusts, this script executes automatically. The script scans the local environment for environment variables like AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and GITHUB_TOKEN. It also looks for SSH keys in the ~/.ssh/ directory.
The attacker’s goal is to trick a developer into making a typo while manually editing a requirements.txt file or to have the package pulled in by an unpinned dependency that drifts over time. Once the malicious package is installed, the post-install script exfiltrates any discovered credentials via an encoded HTTP POST request to an attacker-controlled server. The compromise is silent and happens in seconds, stealing secrets from either a developer’s machine or, more critically, from the production-adjacent environment of a CI/CD pipeline.
Reconnaissance
Explanation
This is the attacker’s planning phase. For this scenario, they aren’t targeting your company specifically; they are playing a numbers game against the entire Python ecosystem. They use automated tools to scan PyPI for popular packages and generate a list of plausible typos or name variations (e.g., reqeusts for requests, python-dateutils for python-dateutil). They then check if these names are available on PyPI. Their script is generic—it hunts for common credential formats and environment variables, maximizing the potential reward from any victim.
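To make this concrete, the sketch below generates the kind of name variants involved. It is a minimal illustration, not attacker tooling: defenders can run the same logic over their own dependency list and check whether any variant already exists on PyPI (the JSON endpoint https://pypi.org/pypi/&lt;name&gt;/json returns 404 for unregistered names). The function name and the specific mutation rules are illustrative choices, not a reference implementation.

```python
# typo_variants.py -- a rough sketch of typosquat-style name variants.
# Defensive use: generate variants of your own dependencies and monitor
# whether any of them show up on PyPI.
def typo_variants(name: str) -> set[str]:
    variants = set()
    for i in range(len(name) - 1):
        # transpose adjacent characters: requests -> reqeusts
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    for i in range(len(name)):
        # drop a character: requests -> requsts
        variants.add(name[:i] + name[i + 1:])
        # double a character: requests -> rrequests
        variants.add(name[:i] + name[i] + name[i:])
    variants.discard(name)
    return variants


if __name__ == "__main__":
    print(sorted(typo_variants("requests")))
```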
Insight
Attackers exploit the openness of public repositories. They know developers are under pressure and that typos are common. They also understand that many projects don’t perfectly manage their dependencies, creating opportunities for a malicious package to slip in during a routine pip install. The attacker’s cost is nearly zero, but the potential payoff from stealing cloud or source control credentials is huge.
Practical
Review your projects on public code-hosting sites like GitHub. If your requirements.txt files are public, you are providing attackers with a list of dependencies they can target with typosquatting. While keeping this private isn’t a complete defense, it avoids making your project an easy-to-find target.
Tools/Techniques
- PyPI: Attackers search for top packages to impersonate.
- GitHub Search: Attackers can search for filename:requirements.txt to find potential targets and see which libraries are most commonly used.
- pypistats.org: Used to analyze download trends and identify popular packages that are good candidates for typosquatting.
Metrics/Signal
- This stage is performed by the attacker, so there are no direct defensive metrics. However, an increase in typosquatted packages being reported in the security community can be a signal of heightened attacker activity in the ecosystem.
Evaluation
Explanation
This stage is about looking in the mirror. How vulnerable are your development practices to this attack? A developer can assess their environment by asking key questions:
- Do we manually type package names into requirements.txt?
- Are the versions in our requirements.txt files “pinned” to a specific version (e.g., requests==2.28.1), or are they open-ended (e.g., requests)?
- Do we use a lock file that includes cryptographic hashes for each package?
- Does our CI/CD pipeline pull dependencies directly from the public internet without any checks?
- Do developers on our team store long-lived, high-privilege credentials on their local machines?
Insight
The most significant vulnerability is often not a complex technical flaw but a weak process. The gap between a developer’s local dependency list and what the CI/CD system installs is a common source of “works on my machine” bugs and, in this case, a critical security risk. Inconsistency in how dependencies are added and updated across a team creates the exact opening this attack exploits.
Practical
Perform a quick audit. Grab a few of your team’s projects and examine the requirements.txt file. Is it pinned? Now check for a requirements.lock or poetry.lock file. If one doesn’t exist, your project is vulnerable to requirements.txt drift. Review your CI/CD pipeline configuration (.github/workflows/main.yml, .gitlab-ci.yml, etc.). Find the line that runs pip install and ask: what controls are wrapped around this command?
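As a starting point for that audit, here is a minimal sketch that walks a checkout and flags loosely pinned entries. It assumes simple name==version style files (no extras or environment markers) and is not a substitute for a proper scanner; the file-name pattern and regex are deliberate simplifications.

```python
# audit_requirements.py -- quick audit sketch: flag requirements*.txt lines
# that are not pinned to an exact version.
import pathlib
import re

# Matches simple "name==version" entries; anything else gets reported.
PINNED = re.compile(r"^\s*[A-Za-z0-9._-]+\s*==\s*[\w.!+*-]+")


def audit(root: str = ".") -> None:
    for path in pathlib.Path(root).rglob("requirements*.txt"):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            stripped = line.strip()
            # Skip blanks, comments, includes, and pip options such as --hash lines.
            if not stripped or stripped.startswith(("#", "-r", "-c", "--")):
                continue
            if not PINNED.match(stripped):
                print(f"{path}:{lineno}: unpinned or loosely pinned: {stripped}")


if __name__ == "__main__":
    audit()
```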
Tools/Techniques
- Manual code review: Simply reading your project’s dependency files is the first and best step.
- grep or similar search tools can be used to scan an entire codebase for unpinned dependencies in requirements.txt files.
Metrics/Signal
- Percentage of projects that use dependency lock files.
- The number of unpinned dependencies discovered across all project repositories.
- Time taken for a new developer to set up a project; a long and manual setup process often indicates inconsistent dependency management.
Fortify
Explanation
Fortification is about building defenses to prevent the malicious package from ever being installed. The most effective defense is to eliminate ambiguity in your dependencies. Instead of just specifying a package name, you should specify the exact version and a hash of the package contents. This ensures that even if a new malicious version is published, your builds will fail because the hash won’t match.
Insight
You should treat your dependencies with the same rigor as your own code. They are, in effect, part of your application. Never trust a package name alone. Trusting a name is like trusting a person’s identity just because they have the right name tag; you need to check their ID. The hash in a lock file is that ID.
Practical
- Stop manually editing requirements.txt. Treat it as a build artifact.
- Define your direct dependencies in a file like requirements.in.
- Use a tool to “compile” or “lock” that file into a fully-pinned requirements.txt that includes hashes (illustrated after this list).
- Commit both files to your repository. When installing, use the hash-checked requirements.txt file.
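An abbreviated illustration of the two files (a real generated requirements.txt also pins and hashes every transitive dependency; the hash values here are placeholders):

```
# requirements.in (hand-edited, direct dependencies only)
requests==2.28.1

# requirements.txt (generated, never edited by hand)
# produced by: pip-compile --generate-hashes requirements.in
requests==2.28.1 \
    --hash=sha256:<hash-of-the-wheel> \
    --hash=sha256:<hash-of-the-sdist>

# installed in CI with: pip install --require-hashes -r requirements.txt
```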
Tools/Techniques
- Dependency Locking:
  - pip-tools: A popular tool whose pip-compile command (with --generate-hashes) creates a fully-pinned requirements.txt with hashes from a requirements.in file. Install the pinned result with pip install -r requirements.txt --require-hashes.
  - Poetry: A modern Python packaging and dependency management tool that generates a poetry.lock file by default.
- Vulnerability Scanning:
  - pip-audit: Scans your installed dependencies for known vulnerabilities. Run this in your CI/CD pipeline.
  - Snyk: A commercial tool that can be integrated into your repository and CI/CD pipeline to detect vulnerable dependencies, including malicious packages.
- Private Repositories:
  - JFrog Artifactory or Sonatype Nexus: Act as a proxy between your developers/CI and PyPI. You can configure them to only allow approved, vetted packages.
Metrics/Signal
- CI/CD pipeline build fails if a dependency without a valid hash is introduced.
- A vulnerability scanner (like Snyk or pip-audit) blocks a build due to a newly discovered malicious package.
- 100% of active projects have a requirements.in and a hash-locked requirements.txt (or an equivalent like poetry.lock).
Limit
Explanation
Assume the attacker was successful and the malicious package was installed. The “limit” phase is about containing the blast radius. The damage the post-install script can do is directly proportional to the privileges of the environment it runs in. If pip install is run by a root user in a CI/CD runner with access to production secrets, the damage is catastrophic. If it’s run inside a minimal, short-lived container with no secrets and restricted network access, the damage is negligible.
Insight
The principle of least privilege is your most powerful tool for damage control. Your build environment should be considered untrusted. Code from the internet (your dependencies) is being executed there. It should have the absolute minimum access required to perform its task and nothing more.
Practical
- CI/CD: Use short-lived, narrowly-scoped credentials. Instead of storing a static AWS_ACCESS_KEY_ID, use OpenID Connect (OIDC) to issue temporary credentials to your CI/CD job that are only valid for its execution and restricted to specific resources. As a complementary step, scrub credentials from the environment that runs pip install (sketched after this list).
- Containers: Run your build steps inside ephemeral containers. Do not mount sensitive directories like ~/.ssh into the build container.
- Network: Configure egress firewall rules for your build environment. If your build doesn’t need to connect to anything other than PyPI and your own artifact repository, block all other outbound traffic on port 443. This would prevent the exfiltration script from phoning home.
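The sketch below shows that environment-scrubbing idea: the install step runs with a minimal set of variables, so a post-install script finds no cloud or VCS credentials in the environment. It is a sketch only: the allowlist of variables is an assumption to adapt, and it does nothing about filesystem access (~/.ssh) or network egress, which the container and firewall controls above address.

```python
# install_deps.py -- run 'pip install' with a scrubbed environment so that a
# malicious post-install script sees no AWS_*, GITHUB_TOKEN, or similar secrets.
import os
import subprocess
import sys

# Variables the install step actually needs; extend for your build (assumption).
SAFE_VARS = ("PATH", "HOME", "LANG", "TMPDIR", "PIP_INDEX_URL")


def install_locked_requirements(requirements: str = "requirements.txt") -> None:
    clean_env = {k: v for k, v in os.environ.items() if k in SAFE_VARS}
    subprocess.run(
        [sys.executable, "-m", "pip", "install",
         "--require-hashes", "-r", requirements],
        env=clean_env,
        check=True,
    )


if __name__ == "__main__":
    install_locked_requirements()
```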
Tools/Techniques
- Short-Lived Credentials:
- GitHub Actions OIDC integration for cloud providers.
- HashiCorp Vault for dynamically generating secrets.
- Network Controls:
- Kubernetes Network Policies to control egress from build pods.
- AWS VPC Security Groups with restrictive outbound rules for your EC2-based CI runners.
Metrics/Signal
- Number of long-lived static credentials used in CI/CD environments (aim for zero).
- Firewall logs showing blocked outbound network connections from build agents to non-allowlisted destinations.
- Security audit reports showing that CI/CD roles have the minimum necessary permissions.
Expose
Explanation
Expose is about detection. How would you know the attack happened? The malicious setup.py script makes a network connection to send the stolen data. This is a high-fidelity signal. A build process that is simply installing dependencies should not be making arbitrary outbound network requests to unknown domains. Monitoring process execution and network traffic from your build environment is key to detecting this anomalous activity.
Insight
The attack is loudest at the moment of execution. The installation of a dependency is a well-defined process. Any deviation, like spawning a shell or making a network call to a strange IP, is a strong indicator of compromise. You can’t detect what you don’t monitor.
Practical
- Log all DNS queries and outbound network connections from your CI/CD runners.
- Set up alerts for network connections from build processes (like pip) to any destination other than your approved package repositories (e.g., pypi.org, files.pythonhosted.org, or your private Artifactory/Nexus server); a minimal version of this check is sketched after this list.
- Use security tooling that specifically analyzes package behavior during installation.
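As a minimal version of that allowlist check, the sketch below scans an exported list of destination hostnames (assumed one per line, e.g. extracted from Zeek's dns.log or your resolver logs) and flags anything outside the approved package repositories. The allowlist and log format are assumptions to adapt; in practice the alerting would live in your monitoring stack rather than a standalone script.

```python
# check_egress_log.py -- flag build-environment destinations that are not
# approved package repositories. Usage: python check_egress_log.py hosts.txt
import sys

# Approved destinations for a dependency install; add your private index here.
ALLOWED_SUFFIXES = ("pypi.org", "files.pythonhosted.org")


def is_allowed(hostname: str) -> bool:
    hostname = hostname.strip().lower().rstrip(".")
    return any(hostname == s or hostname.endswith("." + s) for s in ALLOWED_SUFFIXES)


def main(path: str) -> int:
    suspicious = set()
    with open(path) as fh:
        for line in fh:
            host = line.strip()
            if host and not is_allowed(host):
                suspicious.add(host)
    for host in sorted(suspicious):
        print(f"ALERT: build environment contacted non-allowlisted host: {host}")
    return 1 if suspicious else 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1]))
```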
Tools/Techniques
- Network Monitoring:
- Zeek: An open-source network security monitor that can log all DNS, HTTP, and other traffic for later analysis.
- AWS VPC Flow Logs: Log all IP traffic from your CI runners and feed it into a monitoring tool for alerting.
- Behavioral Analysis:
- Socket.dev: A commercial service that proactively analyzes package dependencies and can detect suspicious behavior like network usage or shell execution in post-install scripts.
- Endpoint Detection and Response (EDR): Tools like CrowdStrike can be installed on build runners to monitor for malicious process behavior.
Metrics/Signal
- An alert firing for an outbound connection from a pip install command to an IP address not on your allowlist.
- An unexpected DNS query for a domain that is not a known package index.
- A log from a tool like Socket.dev indicating that a newly added dependency uses the network during installation.
eXercise
Explanation
To build muscle memory and test your defenses, you need to practice. A security exercise involves simulating the attack in a safe and controlled manner to see if your Fortify, Limit, and Expose controls work as expected. This isn’t a “pass/fail” test; it’s a learning opportunity to find the gaps in your tooling, processes, and team knowledge before a real attacker does.
Insight
Security policies and tools are useless if they don’t work in practice or if the team doesn’t know how to use them. Running a drill turns abstract security concepts into concrete, memorable experiences for developers and helps justify investments in better tooling and training.
Practical
- Create a Safe Malicious Package: Create a simple Python package. In its setup.py, add a script that does something benign but detectable, like making a DNS query for my-test-exfil.your-company-domain.com or an HTTP request to an internal server you control (a sketch follows this list).
- Host it Privately: Upload this package to a private repository that your CI/CD system can access.
- Simulate the Typo: In a test branch of a sample application, add the “malicious” package to the requirements.txt file.
- Run the Pipeline: Push the branch and let your CI/CD pipeline run its normal build process.
- Observe:
  - Fortify Test: Did your hash-checking pip install command fail because the new package wasn’t in the lock file? (It should.)
  - Limit/Expose Test: If you bypass the lock file check for the test, does your network monitoring system fire an alert for the DNS query or HTTP call? Do your firewall rules block it?
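A sketch of such a package's setup.py is shown below. The package name is a placeholder and the test domain is the one from the step above; publish it only to your private index, and publish it as an sdist so that pip has to execute setup.py during installation.

```python
# setup.py for the harmless "canary" package used in the drill.
# A sketch: the package name is a placeholder and the drill domain should be
# one you control. Never publish this to the public PyPI.
import socket

from setuptools import setup

try:
    # Benign but detectable: a single DNS lookup your monitoring should flag.
    # Any code at module level of setup.py runs when pip builds the sdist
    # during 'pip install' -- the same hook a malicious package abuses.
    socket.getaddrinfo("my-test-exfil.your-company-domain.com", 443)
except OSError:
    pass  # a failed lookup is fine; the DNS query itself is the signal

setup(
    name="companyname-canary-package",  # placeholder name for the exercise
    version="0.0.1",
    description="Internal supply-chain drill package. Does nothing useful.",
    py_modules=[],
)
```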
Tools/Techniques
- Private Repository: Use a tool like pypiserver for a simple, temporary private index for the exercise.
- Request Sink: Use a service like requestbin.com or an internal web server with logging to act as the “attacker’s server” to confirm the exfiltration request was made; a tiny example server is sketched after this list.
- Post-Mortem Meeting: After the exercise, gather the team to discuss what happened. Did the alerts go to the right people? Was the information clear? What could be improved?
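If you prefer to keep everything internal, the request sink can be as small as the sketch below (the port and log format are arbitrary choices). Run it somewhere your CI runners can reach and point the drill package's HTTP call at it.

```python
# request_sink.py -- a tiny "attacker server" stand-in for the drill.
# Logs every request it receives so you can confirm the simulated
# exfiltration call arrived.
from http.server import BaseHTTPRequestHandler, HTTPServer


class SinkHandler(BaseHTTPRequestHandler):
    def _log_and_ack(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else b""
        print(f"{self.client_address[0]} {self.command} {self.path} body={body[:200]!r}")
        self.send_response(200)
        self.end_headers()

    do_GET = _log_and_ack
    do_POST = _log_and_ack


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), SinkHandler).serve_forever()
```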
Metrics/Signal
- Time-To-Detect (TTD): How long did it take from the “malicious” code execution to an alert being generated?
- Successful block of the simulated exfiltration call by network policies.
- An “action items” list generated from the post-mortem, with owners and deadlines for improving processes or tools.