CVE-2024-52338: Critical Security Flaw in Apache Arrow R Package Allows Arbitrary Code Execution
The Apache Software Foundation has addressed a critical security vulnerability (CVE-2024-52338) in the Apache Arrow R package. This vulnerability, impacting versions 4.0.0 through 16.1.0, could allow attackers to execute arbitrary code on systems processing maliciously crafted data files.
Apache Arrow is a universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics. It contains a set of technologies that enable data systems to efficiently store, process, and move data. The R arrow package provides access to many of the features of the Apache Arrow C++ library for R users.
Vulnerability Details:
The flaw stems from deserialization of untrusted data in the IPC and Parquet readers of the affected versions of the R package. Applications that read Arrow IPC, Feather, or Parquet files from untrusted sources, such as user-supplied input files, are vulnerable.
Impact and Scope:
Successful exploitation of CVE-2024-52338 lets an attacker run arbitrary code in the R process that reads the malicious file, which can lead to full compromise of the system and unauthorized access to sensitive data. The vulnerability is specific to the Apache Arrow R package and does not affect other Apache Arrow implementations or bindings on their own. However, those bindings remain at risk when they are used through an affected version of the R package, for example in an R application that embeds a Python interpreter and uses PyArrow to read files from untrusted sources.
Mitigation:
The Apache Software Foundation urges users to upgrade to version 17.0.0 or later of the Apache Arrow R package immediately. Downstream libraries that depend on the package should likewise raise their dependency requirement to arrow 17.0.0 or later.
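As a rough illustration of the upgrade check, the sketch below tests whether the installed arrow version falls inside the affected range given above (4.0.0 through 16.1.0). The helper name is_affected_arrow() is hypothetical and not part of the arrow package; only base R functions are used, and the upgrade command is left commented out because the right repository depends on your setup.

```r
# Hypothetical helper (not part of arrow): TRUE if `version` lies in the
# affected range stated in the advisory, 4.0.0 through 16.1.0 inclusive.
is_affected_arrow <- function(version) {
  v <- as.package_version(version)
  v >= package_version("4.0.0") && v <= package_version("16.1.0")
}

if (requireNamespace("arrow", quietly = TRUE)) {
  installed <- utils::packageVersion("arrow")
  if (is_affected_arrow(installed)) {
    message("arrow ", as.character(installed),
            " is in the affected range; upgrade to 17.0.0 or later.")
    # install.packages("arrow")  # assumption: your configured repository offers >= 17.0.0
  } else {
    message("arrow ", as.character(installed), " is outside the affected range.")
  }
} else {
  message("The arrow package is not installed.")
}
```

Development builds or vendor-patched versions may not fit this simple range comparison, so treat the result as a first check rather than a definitive verdict.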
Workaround for Affected Versions:
For users who cannot upgrade immediately, a temporary workaround is to read untrusted data into a Table and then convert it with the Table's internal to_data_frame() method instead of letting the reader build the data frame directly. For example: read_parquet(…, as_data_frame = FALSE)$to_data_frame().
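The workaround can be wrapped in a small function so that every read of untrusted input goes through the same path. This is a minimal sketch, not an official API: the name read_untrusted() and its reader parameter are assumptions introduced for illustration, while read_parquet(), read_feather(), the as_data_frame argument, and to_data_frame() come from the arrow package.

```r
# Hypothetical wrapper (not part of arrow): read a file as an Arrow Table,
# then convert it with the Table's to_data_frame() method, as the
# workaround describes. `reader` is any arrow reader that accepts
# as_data_frame, such as arrow::read_parquet or arrow::read_feather.
read_untrusted <- function(path, reader = arrow::read_parquet) {
  tbl <- reader(path, as_data_frame = FALSE)  # keep the result as a Table
  tbl$to_data_frame()                         # convert via the internal method
}

# Usage, assuming arrow is installed and the paths exist (hypothetical paths):
# df <- read_untrusted("upload.parquet")
# df <- read_untrusted("upload.feather", reader = arrow::read_feather)
```

Routing reads through one function makes it easier to remove the workaround once the package has been upgraded to 17.0.0 or later.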