Cross Fork Object Reference (CFOR): GitHub’s New Security Vulnerability
Experts at Truffle Security have discovered that data from deleted forks, deleted repositories, and even private repositories on GitHub may remain accessible indefinitely. This behavior is not only known to GitHub but is also part of the platform’s architecture.
The potential vulnerability, named Cross Fork Object Reference (CFOR), arises when one repository fork can access confidential data from another fork, including data from deleted and private forks. Similar to the well-known Insecure Direct Object Reference (IDOR) vulnerability, CFOR allows users to use commit hashes to directly access data that would otherwise be unavailable.
Researchers describe the vulnerability using a typical GitHub workflow. A user forks a public repository, makes changes to it, and then deletes the fork. It seems logical that data from the deleted fork should be inaccessible, but in practice, it remains accessible indefinitely, resulting in a loss of control over the information.
The research showed that data from deleted forks can often be recovered. In several popular repositories belonging to a major artificial intelligence company, the researchers found dozens of valid API keys that had been hard-coded in example files and remained accessible after the forks were deleted.
However, the issue extends beyond the accessibility of data from deleted forks. When a user creates a public repository, someone forks it, and the owner later deletes the original, the data committed to the original after the fork’s creation remains accessible through that fork. This means all commits from the “upstream” repository continue to exist and are accessible through any fork.
Another dangerous scenario involves private repositories. Suppose a private repository is created and forked privately to develop additional features, and the original is later made public while the fork stays private. Data from the private fork may then become accessible to the public. This is because changing the visibility of the “upstream” repository splits the repository network into private and public versions, and any data added to the private fork before this change remains accessible through the public version.
To access such data, one only needs to know the commit hash. Destructive actions in the GitHub repository network remove references to commit data from the standard interface and from normal git operations, but the data itself remains accessible if the commit hash is known. Commit hashes can also be discovered using the GitHub API, which further increases the exposure.
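As a rough illustration of what direct access by hash means, the sketch below builds the web address at which GitHub displays a single commit. The owner, repository name, and hash are hypothetical placeholders, and the helper name commit_url is an assumption made for this example. The code only constructs the URL and performs no network request.

```python
import re

# On GitHub, commit hashes are SHA-1 digests: 40 hexadecimal characters,
# which git allows to be abbreviated (4 characters is git's minimum).
_HASH_RE = re.compile(r"[0-9a-f]{4,40}")


def commit_url(owner: str, repo: str, commit_hash: str) -> str:
    """Build the web URL at which GitHub displays a single commit."""
    normalized = commit_hash.strip().lower()
    if not _HASH_RE.fullmatch(normalized):
        raise ValueError(f"not a plausible commit hash: {commit_hash!r}")
    return f"https://github.com/{owner}/{repo}/commit/{normalized}"


# Hypothetical owner, repository, and hash, used only for illustration.
print(commit_url("example-org", "example-repo", "07f01e8"))
# prints https://github.com/example-org/example-repo/commit/07f01e8
```

The point of the sketch is that nothing in the address identifies the fork where the commit was originally made: the hash is looked up within the repository network as a whole, which is what makes the comparison with IDOR apt.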
GitHub does not conceal its architectural decisions and documents them for users. However, many developers, especially novices, may not realize the extent of the problem.
The conclusions from the research are quite alarming. First, once a secret has been committed, deleting the fork or repository is not enough: the only reliable way to close the leak is to rotate the exposed access keys. Second, GitHub, like other code hosting platforms, has architectural features that can lead to unintended disclosure of confidential information. It is crucial to raise user awareness about such vulnerabilities and take measures to protect data.
The research indicated that the issue of retaining data from deleted and private repositories is not confined to GitHub alone. Similar vulnerabilities may exist in other version control systems. Developers and companies should be vigilant in protecting their data and regularly check their projects for such vulnerabilities.
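One simple form of such a check is to scan files for hard-coded credentials before they are committed. The following is a minimal sketch, not a substitute for a dedicated secret scanner. The regular expression is an illustrative assumption (a variable whose name contains "key", "token", or "secret" assigned a long quoted value) and does not reflect any particular provider's key format. The function name find_hardcoded_secrets is likewise hypothetical.

```python
import re

# Illustrative pattern only (an assumption, not a provider's real format):
# a name containing "key", "token", or "secret", followed by ":" or "=",
# followed by a quoted value of at least 20 key-like characters.
SECRET_RE = re.compile(
    r"""\b([a-z0-9_]*(?:key|token|secret)[a-z0-9_]*)\s*[:=]\s*["']([A-Za-z0-9_\-]{20,})["']""",
    re.IGNORECASE,
)


def find_hardcoded_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, variable_name) pairs for suspicious assignments."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in SECRET_RE.finditer(line):
            hits.append((number, match.group(1)))
    return hits


# Made-up sample content standing in for an example file.
sample = 'model = "demo"\nAPI_KEY = "abcd1234abcd1234abcd1234"\n'
print(find_hardcoded_secrets(sample))  # prints [(2, 'API_KEY')]
```

A match found after the fact should be treated as already leaked: given the behavior described above, removing the commit or the fork does not make the key private again, so the key still needs to be rotated.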