packj: detect malicious/risky open-source software packages

Packj flags malicious/risky open-source packages

Packj (pronounced package) is a command-line (CLI) tool to vet open-source software packages for “risky” attributes that make them vulnerable to supply chain attacks. This is the tool behind our large-scale security analysis platform Packj.dev which continuously vets packages and provides free reports.

How it works

  • It first downloads the metadata from the registry using their APIs and analyzes it for “risky” attributes.
  • To perform API analysis, the package is downloaded from the registry using their APIs into a temp dir. Then, packj performs a static code analysis to detect API usage. API analysis is based on MalOSS, a research project from our group at Georgia Tech.
  • Vulnerabilities (CVEs) are checked by pulling info from the OSV database at OSV
  • Python PyPI and NPM package downloads are fetched from pypistats and npmjs
  • All risks detected are aggregated and reported

Risky attributes

The design of Packj is guided by our study of 651 malware samples of documented open-source software supply chain attacks. Specifically, we have empirically identified a number of risky code and metadata attributes that make a package vulnerable to supply chain attacks.

For instance, we flag inactive or unmaintained packages that no longer receive security fixes. Inspired by Android app runtime permissions, Packj uses a permission-based security model to offer control and code transparency to developers. Packages that invoke sensitive operating system functionality such as file accesses and remote network communication are flagged as risky as this functionality could leak sensitive data.

Some of the attributes we vet for include

Attribute Type Description Reason
Release date Metadata Version release date to flag old or abandoned packages Old or unmaintained packages do not receive security fixes
OS or lang APIs Code Use of sensitive APIs, such as exec and eval Malware uses APIs from the operating system or language runtime to perform sensitive operations (e.g., read SSH keys)
Contributors’ email Metadata Email addresses of the contributors Incorrect or invalid email addresses suggest a lack of 2FA
Source repo Metadata Presence and validity of public source repo The absence of a public repo means no easy way to audit or review the source code publicly

A full list of the attributes we track can be viewed at threats.csv

These attributes have been identified as risky by several other researchers [123] as well.

How to customize

Packj has been developed with a goal to assist developers in identifying and reviewing potential supply chain risks in packages.

However, since the degree of perceived security risk from an untrusted package depends on the specific security requirements, Packj can be customized according to your threat model. For instance, a package with no 2FA may be perceived to pose greater security risks to some developers, compared to others who may be more willing to use such packages for the functionality offered. Given the volatile nature of the problem, providing customized and granular risk measurement is one of our goals.

Packj can be customized to minimize noise and reduce alert fatigue by simply commenting out unwanted attributes in threats.csv

Install & Use