
Malicious Hugging Face model discovered by ReversingLabs researchers
New research from ReversingLabs has uncovered a novel technique for distributing malware on the Hugging Face platform by exploiting weaknesses in Pickle file serialization.
Hugging Face, a popular platform for sharing and collaborating on machine learning (ML) models, has become a target for threat actors seeking to distribute malicious software. ReversingLabs researchers discovered two models containing malicious code that were not flagged as unsafe by Hugging Face’s security scanning mechanisms.
This technique, dubbed “nullifAI,” abuses Pickle, a Python module and serialization format widely used for storing and loading ML model data. While Pickle is easy to use, it is considered unsafe because it allows arbitrary Python code to be executed during deserialization.
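To make the deserialization risk concrete, here is a minimal, harmless sketch of how Pickle can be made to call a function while loading. It is not the code found in the malicious models. The class name Demo and the choice of pow are illustrative assumptions; an attacker would substitute a callable that runs commands.

```python
import pickle


class Demo:
    """Harmless stand-in for a booby-trapped object (hypothetical name)."""

    def __reduce__(self):
        # pickle stores "call pow(2, 10)" instead of the object's state.
        # The callable and its arguments are chosen by whoever wrote the file.
        return (pow, (2, 10))


blob = pickle.dumps(Demo())

# Loading does not rebuild a Demo instance; it invokes the stored callable.
result = pickle.loads(blob)
print(result)
```

The value returned by pickle.loads() is whatever the stored callable returns, which is why loading an untrusted Pickle file is equivalent to running untrusted code.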
“During RL research efforts, the team came upon two Hugging Face models containing malicious code that were not flagged as ‘unsafe’ by Hugging Face’s security scanning mechanisms,” the report states.
The malicious models discovered by ReversingLabs were stored in PyTorch format, which is essentially a compressed Pickle file. The attackers bypassed Hugging Face’s security tool, Picklescan, by compressing the files using the 7z format, which prevented them from being scanned properly.
“The two models RL detected are stored in PyTorch format, which is basically a compressed Pickle file,” the report explains. “By default, PyTorch uses the ZIP format for compression, and these two models are compressed using the 7z format, which prevents them from being loaded using PyTorch’s default function, torch.load().”
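As a rough illustration of the container mismatch described in the report, the sketch below identifies a model file's archive format from its leading bytes. The function name sniff_container is hypothetical, and the check relies only on the well-known ZIP and 7z file signatures; it is not how Hugging Face or Picklescan inspect uploads.

```python
# Well-known file signatures ("magic bytes") for the two archive formats.
SEVEN_Z_MAGIC = b"7z\xbc\xaf\x27\x1c"
ZIP_MAGICS = (b"PK\x03\x04", b"PK\x05\x06")  # regular and empty ZIP archives


def sniff_container(path):
    """Return 'zip', '7z', or 'unknown' based on the first bytes of the file."""
    with open(path, "rb") as fh:
        head = fh.read(6)
    if head.startswith(SEVEN_Z_MAGIC):
        return "7z"
    if head.startswith(ZIP_MAGICS):
        return "zip"
    return "unknown"
```

Calling sniff_container() on a downloaded model file before loading it would flag a PyTorch file that is not in the expected ZIP container, which is one inexpensive signal that the file deserves a closer look.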
Upon further analysis, ReversingLabs found that the malicious Pickle files contained a reverse shell payload, allowing remote access to the infected machine.
“The malicious payload was a typical platform-aware reverse shell that connects to a hardcoded IP address,” the report explains.
More concerning still, Pickle security scanning tools failed to detect the malicious functions because of the way they interpret serialized data.
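Scanners of this kind generally inspect a Pickle stream statically, without loading it, and look for references to dangerous callables. The sketch below shows that general idea using Python's standard pickletools module. It is an assumption-laden illustration, not Picklescan's implementation: the function name referenced_globals is hypothetical, the handling of STACK_GLOBAL is a simple heuristic, and keeping partial results when the stream is malformed is a design choice of this sketch.

```python
import pickletools


def referenced_globals(data):
    """List (module, name) pairs a pickle stream imports, without loading it."""
    found = []
    strings = []  # recently seen string arguments, used for STACK_GLOBAL
    try:
        for opcode, arg, _pos in pickletools.genops(data):
            if opcode.name == "GLOBAL":
                # genops reports the argument as "module name"
                module, name = arg.split(" ", 1)
                found.append((module, name))
            elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
                # Heuristic: module and name are the last two strings pushed.
                found.append((strings[-2], strings[-1]))
            elif isinstance(arg, str):
                strings.append(arg)
    except Exception:
        # Malformed stream: keep whatever was seen before the parser failed.
        pass
    return found


# Hand-written protocol 0 stream that references os.system. It is only
# parsed here, never unpickled, so nothing is executed.
sample = b"cos\nsystem\n(S'echo hi'\ntR."
print(referenced_globals(sample))
```

A real scanner would compare the returned pairs against an allowlist or blocklist; this sketch only reports them.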
ReversingLabs has reported its findings to Hugging Face, and the platform is taking steps to address the issue. Developers and organizations must remain vigilant and adopt robust security practices to mitigate these risks.