A critical vulnerability has been discovered in the unstructured library, a widely used tool that developers rely on to prepare data for Large Language Models (LLMs). With over 4 million monthly downloads, the library is an important link in the AI supply chain. The flaw, tracked as CVE-2025-64712, carries a CVSS score of 9.8 and allows attackers to write arbitrary files to the host filesystem by supplying a malicious .msg file for processing.
The vulnerability turns a routine data processing task into a potential system takeover. For the thousands of organizations using this open-source library to ingest emails and documents, the risk is immediate and severe.
The vulnerability resides in the partition_msg function, which is responsible for breaking down Outlook email files (.msg) into usable data. The flaw is a classic Path Traversal issue.
According to the advisory, “An attacker can craft a malicious .msg file with attachment filenames containing path traversal sequences (e.g., ../../../etc/cron.d/malicious)”.
When the library processes this file with process_attachments=True, it trusts the attachment filenames embedded in the message without validating them. This allows the attacker to break out of the intended directory and write a file to any location the processing application has permission to write to.
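To make the mechanism concrete, here is a minimal sketch of how an unvalidated filename escapes a base directory. It is not the library's actual code: the function name naive_target and the directory /tmp/attachments are assumptions made up for illustration, and posixpath is used so the result is the same on any platform.

```python
import posixpath


def naive_target(base_dir: str, attachment_name: str) -> str:
    """Hypothetical helper: join an untrusted filename onto a base directory
    with no validation, then normalize the result the way the OS would."""
    return posixpath.normpath(posixpath.join(base_dir, attachment_name))


# A benign attachment name stays inside the base directory.
print(naive_target("/tmp/attachments", "report.pdf"))

# The traversal sequence from the advisory climbs out of it.
print(naive_target("/tmp/attachments", "../../../etc/cron.d/malicious"))
```

The second call resolves to /etc/cron.d/malicious, a path entirely outside /tmp/attachments, which is why the filename alone decides where the file lands.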
The impact of this vulnerability extends well beyond corrupting a few files. By controlling where files are written, an attacker can escalate to Remote Code Execution (RCE).
The advisory highlights several potential attack vectors:
- Overwriting Configuration Files: An attacker could replace critical system configs to gain access or disable security.
- Hijacking Cron Jobs: By writing a malicious script to a directory like /etc/cron.d/, the attacker can force the system to execute their code automatically.
- Python Package Poisoning: Overwriting Python libraries or files could allow the attacker to execute code the next time the application runs.
The vulnerability affects all versions of the unstructured library up to and including 0.18.17. The maintainers have released a patch in version 0.18.18, which sanitizes attachment filenames to prevent traversal attacks.
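The article does not show the patch itself, so the following is only a sketch of one common way to sanitize an untrusted filename, not the maintainers' implementation. The function name safe_attachment_path and the example directory are hypothetical; the approach is to keep only the final path component and then confirm the result still sits under the base directory.

```python
import posixpath


def safe_attachment_path(base_dir: str, attachment_name: str) -> str:
    """Hypothetical helper: reduce an untrusted filename to its final
    component and verify the resulting path stays under base_dir."""
    # Treat Windows-style separators as separators too.
    name = posixpath.basename(attachment_name.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError(f"unusable attachment name: {attachment_name!r}")

    base = posixpath.normpath(base_dir)
    target = posixpath.normpath(posixpath.join(base, name))

    # Defense in depth: the normalized target must remain inside base.
    if posixpath.commonpath([base, target]) != base:
        raise ValueError(f"attachment escapes base directory: {attachment_name!r}")
    return target


print(safe_attachment_path("/tmp/attachments", "../../../etc/cron.d/malicious"))
```

With this check in place, the traversal name from the advisory is reduced to a plain file called malicious inside /tmp/attachments, and names that reduce to nothing usable are rejected.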
For teams that cannot upgrade immediately, the report suggests a temporary workaround: “Set process_attachments=False when processing untrusted MSG files”. This effectively disables the vulnerable feature, closing the attack vector until a proper patch can be applied.
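As a rough sketch of the workaround, the snippet below calls partition_msg with process_attachments=False and warns if the installed version is older than 0.18.18. The wrapper partition_untrusted_msg and the helpers parse_version and is_patched are hypothetical names introduced here, the version parsing is deliberately simplified (it reads only the first three numeric components), and the partition_msg import path and keyword arguments should be checked against the documentation for the installed release.

```python
import re
import warnings
from importlib import metadata

PATCHED_VERSION = (0, 18, 18)  # first fixed release, per the article


def parse_version(version: str):
    """Return (major, minor, patch) as ints, or None if unparseable.
    Simplified: any suffix after the third component is ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> bool:
    """Treat unparseable versions as unpatched, to stay on the safe side."""
    parsed = parse_version(version)
    return parsed is not None and parsed >= PATCHED_VERSION


def partition_untrusted_msg(path: str):
    """Hypothetical wrapper: partition a .msg file without touching attachments."""
    # Imported lazily so the helpers above work without unstructured installed.
    from unstructured.partition.msg import partition_msg

    if not is_patched(metadata.version("unstructured")):
        warnings.warn("unstructured is older than 0.18.18; upgrade when possible")
    # The advisory's workaround: never process attachments from untrusted files.
    return partition_msg(filename=path, process_attachments=False)
```

Keeping attachment processing off for untrusted input is the conservative choice in this sketch even on patched versions; teams that need attachment content can re-enable it after upgrading.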