packj: detect malicious/risky open-source software packages
Packj flags malicious/risky open-source packages
Packj (pronounced package) is a command-line (CLI) tool that vets open-source software packages for “risky” attributes that make them vulnerable to supply chain attacks. It is the tool behind Packj.dev, our large-scale security analysis platform, which continuously vets packages and provides free reports.
How it works
- Packj first downloads the package metadata from the registry through the registry's API and analyzes it for “risky” attributes.
- For API analysis, the package itself is downloaded from the registry into a temporary directory. Packj then performs static code analysis to detect API usage. The API analysis is based on MalOSS, a research project from our group at Georgia Tech.
- Vulnerabilities (CVEs) are checked against the OSV database.
- Download counts for Python (PyPI) and NPM packages are fetched from pypistats and npmjs.
- All detected risks are aggregated and reported.
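The data sources in the steps above can be sketched in a few lines of Python. This is an illustration only, not Packj's actual code: the helper names (`pypi_metadata_url`, `osv_query`, `download_stats_url`, `fetch_json`, `aggregate_risks`) are hypothetical, and the endpoints shown are the public PyPI, OSV, pypistats, and npm APIs, which are assumed here rather than confirmed to be the ones Packj calls.

```python
import json
import urllib.request
from urllib.parse import quote


def pypi_metadata_url(name):
    """URL of the public PyPI JSON metadata endpoint for a package."""
    return f"https://pypi.org/pypi/{quote(name)}/json"


def osv_query(name, version, ecosystem="PyPI"):
    """Request body for a POST to https://api.osv.dev/v1/query."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def download_stats_url(name, ecosystem):
    """URL of a public download-count endpoint (assumed, not Packj's own)."""
    if ecosystem == "PyPI":
        return f"https://pypistats.org/api/packages/{quote(name)}/recent"
    if ecosystem == "npm":
        return f"https://api.npmjs.org/downloads/point/last-week/{quote(name)}"
    raise ValueError(f"unsupported ecosystem: {ecosystem}")


def fetch_json(url, payload=None):
    """GET the URL, or POST the payload as JSON when one is given."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def aggregate_risks(findings_by_source):
    """Flatten per-analysis findings into a single report list."""
    report = []
    for source, findings in findings_by_source.items():
        for finding in findings:
            report.append(f"[{source}] {finding}")
    return report
```

With network access, `fetch_json(pypi_metadata_url("requests"))` would return the registry metadata as a dictionary, and `fetch_json("https://api.osv.dev/v1/query", osv_query("requests", "2.0.0"))` would return any known vulnerabilities for that version.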
Risky attributes
The design of Packj is guided by our study of 651 malware samples of documented open-source software supply chain attacks. Specifically, we have empirically identified a number of risky code and metadata attributes that make a package vulnerable to supply chain attacks.
For instance, we flag inactive or unmaintained packages that no longer receive security fixes. Inspired by Android app runtime permissions, Packj uses a permission-based security model to offer control and code transparency to developers. Packages that invoke sensitive operating system functionality, such as file access and remote network communication, are flagged as risky because this functionality could leak sensitive data.
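To make the permission-style flagging concrete, here is a minimal sketch of static API detection using Python's standard `ast` module. It is not the MalOSS-based analysis that Packj uses. The `SENSITIVE_CALLS` table, its permission labels, and the function names are hypothetical, and the matching is purely by name, so aliased imports such as `from os import system` are not resolved.

```python
import ast

# Hypothetical mapping from call names to permission-style categories.
SENSITIVE_CALLS = {
    "eval": "code generation",
    "exec": "code generation",
    "open": "file access",
    "os.system": "process execution",
    "subprocess.run": "process execution",
    "socket.socket": "network",
    "urllib.request.urlopen": "network",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_sensitive_calls(source):
    """Return sorted (line, call name, category) tuples for flagged calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SENSITIVE_CALLS:
                findings.append((node.lineno, name, SENSITIVE_CALLS[name]))
    return sorted(findings)
```

For example, calling `find_sensitive_calls` on a file that contains `os.system('id')` on line 2 yields `(2, 'os.system', 'process execution')`, which a developer can then review much like a requested app permission.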
Some of the attributes we vet for include:
| Attribute | Type | Description | Reason |
|---|---|---|---|
| Release date | Metadata | Version release date to flag old or abandoned packages | Old or unmaintained packages do not receive security fixes |
| OS or lang APIs | Code | Use of sensitive APIs, such as exec and eval | Malware uses APIs from the operating system or language runtime to perform sensitive operations (e.g., read SSH keys) |
| Contributors’ email | Metadata | Email addresses of the contributors | Incorrect or invalid email addresses suggest a lack of 2FA |
| Source repo | Metadata | Presence and validity of public source repo | The absence of a public repo means no easy way to audit or review the source code publicly |
A full list of the attributes we track can be viewed at threats.csv.
These attributes have been identified as risky by several other researchers [1, 2, 3] as well.
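The metadata rows of the table can be illustrated with a small checker. This is a sketch under stated assumptions, not Packj's implementation: the `meta` dictionary keys (`release_date`, `repo_url`, `contributor_emails`), the one-year staleness threshold, and the simple email pattern are all hypothetical choices made for the example.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse

# Deliberately loose pattern: something@domain.tld with no spaces.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def check_metadata(meta, now=None, max_age_days=365):
    """Return a list of risk messages for one package's metadata."""
    now = now or datetime.now(timezone.utc)
    risks = []

    # Release date: flag old or abandoned packages.
    released = meta.get("release_date")
    if released is None:
        risks.append("release date: missing")
    elif now - released > timedelta(days=max_age_days):
        risks.append(f"release date: older than {max_age_days} days")

    # Source repo: flag a missing or malformed public repo URL.
    parsed = urlparse(meta.get("repo_url") or "")
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        risks.append("source repo: missing or invalid")

    # Contributors' email: flag syntactically invalid addresses.
    for email in meta.get("contributor_emails", []):
        if not EMAIL_RE.fullmatch(email):
            risks.append(f"contributor email: invalid ({email})")

    return risks
```

A syntactic check like this only catches malformed values. Confirming that a repository or email domain actually exists would require additional network lookups.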
How to customize
Packj has been developed with the goal of assisting developers in identifying and reviewing potential supply chain risks in packages.
However, since the degree of perceived security risk from an untrusted package depends on the specific security requirements, Packj can be customized according to your threat model. For instance, a package with no 2FA may be perceived to pose greater security risks to some developers, compared to others who may be more willing to use such packages for the functionality offered. Given the volatile nature of the problem, providing customized and granular risk measurement is one of our goals.
Packj can be customized to minimize noise and reduce alert fatigue by simply commenting out unwanted attributes in threats.csv.
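As a rough illustration of how commented-out attributes could be skipped when such a file is read, consider the sketch below. The real layout of threats.csv is not shown here, so the column names (`attribute`, `type`), the `#` comment marker, and the `load_enabled_attributes` helper are assumptions made for the example.

```python
import csv

# Hypothetical file contents; the real threats.csv columns may differ.
SAMPLE = """attribute,type
release_date,metadata
# contributor_email,metadata
sensitive_api,code
"""


def load_enabled_attributes(text):
    """Parse CSV text, ignoring blank lines and lines starting with '#'."""
    lines = [
        line
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return list(csv.DictReader(lines))


enabled = [row["attribute"] for row in load_enabled_attributes(SAMPLE)]
```

In this sample, `contributor_email` is commented out, so only `release_date` and `sensitive_api` remain enabled and would be reported.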