
The Digital Warehouse: Security in Public Binary Repositories

By Steve Poole

1. What Makes a Good Public Repository?

If you’re building software, you already rely on public binary repositories, whether you admit it or not. They’re the warehouses your build system raids every day to pull down jars, wheels, tarballs, or containers. So before we rank and compare them, let’s set some ground rules for what “good” looks like.

1.1 The Non-Negotiables (Required)

1.2 Nice-to-Haves (Optional but Good)

1.3 The Red Flags (Bad Practices)


2. The Repo Landscape

Binary repos aren’t all equal. Some are decades-old infrastructure, some are convenience wrappers, some are corporate tie-ins. Here’s a map of the major players:

See Appendix A for the full directory.


3. Comparing Against the Criteria

Let’s rate the big ones against our “non-negotiables” and “red flags.”


4. Security Features and the Reality Check

4.1 Features That Work

4.2 Features Still in Dispute

4. Attack Scenarios: When Repo Weaknesses Become Exploits

So why do all these “bad practices” matter? Because they map directly to attack playbooks we’ve seen in the wild. Here are the big ones:

4.1 Install-Time Execution

Some repos allow packages to run arbitrary code at install time, for example through npm lifecycle scripts or a Python setup.py.

The takeaway: if the ecosystem allows “run on install,” then a single compromised package means code execution inside every downstream dev and CI/CD environment.

4.2 Typosquatting

First-come naming systems (npm, PyPI, Docker Hub) are magnets for typo tricks.

Without namespace protection, one mistyped package name in an install command is enough to pull malware.

4.3 Dependency Confusion

If your private repo and a public repo share package names, you’re fair game.

This attack exploits default resolution rules, which tend to pick the highest version number wherever it is hosted, and mutable version handling makes it worse.

4.4 Mutable Tags

Containers suffer especially here, because a tag such as latest can be re-pointed to a different image after you have reviewed it.

Mutable pointers are basically a supply chain backdoor.

5. The Checklist: What a Good Repo Should Do

Required

Optional but Valuable

Avoid

4.1 Install-Time Execution

Some repos allow packages to run arbitrary code at install time, for example through npm lifecycle scripts or a Python setup.py.

Case study: npm “event-stream” (2018). A trusted maintainer handed over control of the event-stream package (roughly 2M downloads/week). The new maintainer slipped in a dependency, flatmap-stream, whose hidden payload stole Bitcoin wallets from apps using the package. Strictly speaking, that payload ran when the module was loaded rather than from a lifecycle script, but it passed through because code that runs automatically is “normal” in npm land. Lesson: if install-time execution is allowed, one compromised package means remote code execution in every downstream dev and CI/CD environment.
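To make the install-time risk concrete, here is a minimal sketch that lists the lifecycle scripts npm would run automatically during an install. The hook names (preinstall, install, postinstall) are npm's own; the sample manifest, its package name, and the collect.js command are invented for illustration.

```python
import json

# Script names that npm runs automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_hooks(package_json_text: str) -> dict[str, str]:
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}


# Hypothetical manifest, invented for illustration only.
SAMPLE_MANIFEST = """
{
  "name": "example-package",
  "version": "1.0.0",
  "scripts": {
    "test": "node test.js",
    "postinstall": "node collect.js"
  }
}
"""

if __name__ == "__main__":
    for hook, command in find_install_hooks(SAMPLE_MANIFEST).items():
        print(f"{hook}: {command}")
```

A check like this only tells you a hook exists, not whether it is malicious. On the consumer side, running npm install with --ignore-scripts (or setting ignore-scripts=true in .npmrc) switches the hooks off, at the cost of breaking packages that genuinely need a build step.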

4.2 Typosquatting

First-come naming systems (npm, PyPI, Docker Hub) are magnets for typo tricks:

Case study: PyPI typosquat flood (2024). Attackers automated uploads of hundreds of typo-packages, each with a malicious setup.py. The scale was so bad that PyPI temporarily suspended new user registrations and new project creation to cope. Lesson: without namespace protection, the repo becomes a typo-driven malware distribution system.
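Automated typosquat detection can start from something as simple as edit distance against a list of names you want to protect. The sketch below is one possible approach, not how PyPI or npm actually do it; the threshold of 2 and the normalization rule are assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row of the DP table at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalize(name: str) -> str:
    """Lowercase the name and treat '-', '_' and '.' as the same separator."""
    return name.lower().replace("_", "-").replace(".", "-")


def likely_typosquats(candidate: str, protected: list[str],
                      max_distance: int = 2) -> list[str]:
    """Return the protected names that the candidate is suspiciously close to."""
    cand = normalize(candidate)
    hits = []
    for name in protected:
        target = normalize(name)
        # An exact match is the real package, not a squat.
        if cand != target and edit_distance(cand, target) <= max_distance:
            hits.append(name)
    return hits


if __name__ == "__main__":
    print(likely_typosquats("reqeusts", ["requests", "numpy"]))
```

A fixed distance threshold produces false positives on short names, so a real registry would need extra signals (download counts, account age, upload bursts) before blocking anything.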

4.3 Dependency Confusion

When your private and public repos share names, attackers can step in.

Case study: Alex Birsan’s dependency confusion experiment (2021) — By uploading benign packages with the same names as internal ones used by Apple, Microsoft, and Tesla, Birsan got code execution inside their networks. He exfiltrated machine info via DNS as proof. The vector? Package managers happily chose his higher version number. Lesson: unless build tools are locked to internal sources, dependency confusion is almost trivial.
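The resolution rule behind this attack is easy to simulate. The sketch below contrasts a naive resolver that takes the highest version from any index with one that resolves known internal names from the private index only. The package name acme-billing, the version numbers, and both index dictionaries are hypothetical, and real package managers have more elaborate rules than this.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse a simple dotted numeric version such as '1.4.2'."""
    return tuple(int(part) for part in version.split("."))


def naive_resolve(name, private_index, public_index):
    """Pick the highest version found on either index, whatever its source."""
    candidates = [(v, "private") for v in private_index.get(name, [])]
    candidates += [(v, "public") for v in public_index.get(name, [])]
    if not candidates:
        raise LookupError(f"no versions found for {name}")
    return max(candidates, key=lambda item: parse_version(item[0]))


def pinned_resolve(name, private_index, public_index, internal_names):
    """Resolve internal names from the private index only."""
    if name in internal_names:
        versions, source = private_index.get(name, []), "private"
    else:
        versions, source = public_index.get(name, []), "public"
    if not versions:
        raise LookupError(f"no versions found for {name} on the {source} index")
    return max(versions, key=parse_version), source


# Hypothetical indexes: the attacker has uploaded a high version publicly.
PRIVATE_INDEX = {"acme-billing": ["1.2.0", "1.3.1"]}
PUBLIC_INDEX = {"acme-billing": ["99.0.0"], "requests": ["2.31.0"]}
INTERNAL_NAMES = {"acme-billing"}

if __name__ == "__main__":
    print(naive_resolve("acme-billing", PRIVATE_INDEX, PUBLIC_INDEX))
    print(pinned_resolve("acme-billing", PRIVATE_INDEX, PUBLIC_INDEX, INTERNAL_NAMES))
```

The naive resolver hands back the attacker's 99.0.0 from the public index, while the pinned one stays on 1.3.1 from the private index. In practice the equivalent of INTERNAL_NAMES is build tool configuration, such as a single private index that proxies the public one, or a reserved scope or namespace.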

4.4 Mutable Tags

Containers suffer especially here, but npm dist-tags such as latest behave the same way:

Case study: npm “Shai-Hulud” campaign (2025) — Started with one maintainer’s compromised account. The malware harvested GitHub tokens and npm credentials, then automatically republished infected versions across the maintainer’s other projects. Tags like latest spread the infection downstream instantly. Lesson: mutable pointers (tags or dist-tags) are effectively backdoors. Digests, lockfiles, and provenance checks are the only defense.
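Digest pinning is simple enough to sketch in a few lines. The example below records a SHA-256 digest when an artifact is reviewed and refuses anything that no longer matches it. The in-memory registry_tags dictionary is a stand-in for a real registry and the byte strings are placeholder content; both are assumptions for illustration.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return a content digest in the familiar 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_pinned(data: bytes, pinned_digest: str) -> bool:
    """True only if the artifact's content still matches the pinned digest."""
    return sha256_digest(data) == pinned_digest


if __name__ == "__main__":
    # Toy registry: a mutable tag pointing at some artifact bytes.
    registry_tags = {"latest": b"reviewed build"}

    # Pin the digest at review time, for example in a lockfile.
    pinned = sha256_digest(registry_tags["latest"])
    print(verify_pinned(registry_tags["latest"], pinned))  # content unchanged

    # Later, someone re-points the tag at different content.
    registry_tags["latest"] = b"tampered build"
    print(verify_pinned(registry_tags["latest"], pinned))  # content changed
```

The tag name never changed, but the second check fails because the content did. This is the same idea behind pulling container images by digest instead of by tag, and behind the integrity hashes recorded in lockfiles.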

4.5 Notable Incidents Cheat Sheet

| Incident | Repo / Ecosystem | Vector | What Happened | Impact |
| --- | --- | --- | --- | --- |
| event-stream compromise (2018) | npm | Maintainer handover + install-time execution | New maintainer added a malicious dependency that stole Bitcoin wallets. | Millions of weekly downloads affected. |
| ctx hijack (2022) | PyPI | Account takeover (domain resurrection) | Attacker re-registered an expired maintainer email domain, reset password, and uploaded malware. | Stole AWS creds & env vars from victims. |
| PyPI typosquat flood (2024) | PyPI | Typosquatting + malicious setup.py | >500 typo-packages uploaded in days; each ran infostealer on install. | Forced PyPI to freeze new registrations. |
| torchtriton confusion (2022) | PyPI (PyTorch nightly) | Dependency confusion | Malicious package with same name as private one was fetched. | Exfiltrated /etc/passwd & SSH keys. |
| qix maintainer phish (2025) | npm | Maintainer account takeover (phishing) | Popular maintainer tricked into giving creds; 18 packages trojanized. | Malicious updates hijacked crypto wallets. |
| Shai-Hulud worm (2025) | npm | Account takeover + self-propagation | Malware harvested creds, exfiltrated via GitHub repos, re-published infected versions. | >180 npm packages compromised; worm-like spread across ecosystem. |

4.6 Incident Mitigations Cheat Sheet

| Incident | Repo / Ecosystem | Vector | Mitigation(s) That Would Help |
| --- | --- | --- | --- |
| event-stream compromise (2018) | npm | Maintainer handover + install-time execution | Disable install-time scripts by default; stronger maintainer vetting; provenance attestation for new maintainers. |
| ctx hijack (2022) | PyPI | Account takeover (domain resurrection) | Automated monitoring of maintainer email domains (PyPI now does this); mandatory 2FA; OIDC trusted publishing. |
| PyPI typosquat flood (2024) | PyPI | Typosquatting + malicious setup.py | Namespace protection (like Maven Central’s domain model or npm scopes); automated typosquat detection; disallow code execution at install. |
| torchtriton confusion (2022) | PyPI (PyTorch nightly) | Dependency confusion | Build tool config to prioritize private repos; reserved namespaces; private repo firewall blocking unexpected public lookups. |
| qix maintainer phish (2025) | npm | Maintainer account takeover (phishing) | Mandatory FIDO2/WebAuthn keys; OIDC publishing (no long-lived tokens); stronger phishing education & repo-origin validation. |
| Shai-Hulud worm (2025) | npm | Account takeover + self-propagation | Same as above (strong auth + OIDC); automated anomaly detection (sudden mass package updates); lockfiles/digest pinning in consumers. |
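The "sudden mass package updates" signal from the Shai-Hulud row can be approximated with a sliding time window over publish events. This is a rough sketch of one possible heuristic, not any registry's actual detector; the event tuple shape, the one-hour window, and the threshold of five packages are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_publish_bursts(events, window=timedelta(hours=1), threshold=5):
    """Flag maintainers who publish to many distinct packages in a short window.

    `events` is an iterable of (maintainer, package, published_at) tuples.
    """
    by_maintainer = defaultdict(list)
    for maintainer, package, published_at in events:
        by_maintainer[maintainer].append((published_at, package))

    flagged = set()
    for maintainer, items in by_maintainer.items():
        items.sort()  # oldest publish first
        for i, (start, _) in enumerate(items):
            # Distinct packages published within `window` of this event.
            packages = {pkg for when, pkg in items[i:] if when - start <= window}
            if len(packages) >= threshold:
                flagged.add(maintainer)
                break
    return flagged


if __name__ == "__main__":
    base = datetime(2025, 1, 1, 12, 0)
    # Hypothetical events: alice pushes six packages in six minutes.
    events = [("alice", f"pkg-{n}", base + timedelta(minutes=n)) for n in range(6)]
    events += [
        ("bob", "lib-a", base),
        ("bob", "lib-b", base + timedelta(hours=20)),
    ]
    print(flag_publish_bursts(events))
```

A burst is not proof of compromise, since maintainers of monorepos legitimately publish many packages at once, so a flag like this is better used to trigger a hold or a second-factor prompt than an automatic takedown.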