Beware! Hackers Sneak Malicious AI Models onto HuggingFace with 'Corrupted' Pickle Tricks!
Hackers uploaded malicious AI models to Hugging Face, using corrupted pickle files to slip past the platform's security scans and trap users.

In a recent development, cybersecurity researchers uncovered two malicious machine learning models that had been quietly uploaded to the popular machine learning platform Hugging Face. Both models relied on a novel technique, "corrupted" pickle files, to bypass security detection, raising fresh concerns about the safety of shared model repositories.
Malicious Code Hidden in Pickle Files
Karlo Zanki, a researcher at ReversingLabs, explained that the pickle files extracted from these PyTorch-format archives carry malicious Python code at the very beginning of the stream. The payload is essentially a reverse shell that connects to a hard-coded IP address, giving attackers remote control over the victim's machine. The technique has been dubbed "nullifAI" because it is built to slip past ("nullify") existing security protections, leaving users exposed to attack.
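The underlying mechanism is a standard property of Python's pickle protocol: a serialized object can instruct the unpickler to call an arbitrary function when it is loaded, typically via `__reduce__`. A minimal, harmless sketch of that behavior (the real payload reportedly invoked a reverse shell to a hard-coded IP address; the `print` call here is a deliberately benign stand-in):

```python
import pickle

class Payload:
    """Illustrative only: __reduce__ tells the unpickler which callable to invoke on load."""
    def __reduce__(self):
        # A real attacker would return something like (os.system, ("<shell command>",)).
        # A benign print makes the mechanism visible without any harmful side effects.
        return (print, ("arbitrary code executed during unpickling",))

blob = pickle.dumps(Payload())
pickle.loads(blob)  # prints the message: code runs as a side effect of deserialization
```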
Specifically, the two malicious models found on Hugging Face are glockr1/ballr7 and who-r-u0000/0000000000000000000000000000000000000.
The models appear to be proof-of-concept uploads rather than evidence of an active supply chain attack. The pickle format, while widely used for distributing machine learning models, carries an inherent security risk: it allows arbitrary code to run during loading and deserialization.
The researchers also found that both models were PyTorch archives whose pickle contents were compressed with 7z rather than PyTorch's default ZIP format, and that deviation was enough to keep them from being flagged by Hugging Face's Picklescan tool. Zanki further noted that although deserialization of these files breaks down because of the inserted payload, the payload sits early enough in the stream that it is executed before the failure occurs.
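That sequential behavior is easy to demonstrate: the pickle machinery executes opcodes as it reads them, so a payload placed at the start of the stream runs even if the stream is broken further on. A small sketch that truncates a pickle so that loading it ultimately fails (protocol 2 is assumed here to keep the byte stream free of framing opcodes and easy to reason about):

```python
import pickle

class Payload:
    def __reduce__(self):
        # Benign stand-in for the reverse-shell payload described above.
        return (print, ("payload ran before the stream broke",))

# A well-formed pickle ends with a STOP opcode (b'.').  Dropping it leaves a
# "broken" stream, loosely mimicking the corrupted files found on Hugging Face.
blob = pickle.dumps(Payload(), protocol=2)[:-1]

try:
    pickle.loads(blob)
except Exception as exc:
    print(f"deserialization failed: {exc!r}")
# The first print has already fired by this point: the payload executed even
# though the overall deserialization was reported as a failure.
```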
More Problems
Things get more complicated from there: the malicious code sits at the very beginning of the pickle stream, yet Hugging Face's security scanning tools failed to flag the models as risky. The incident has sparked wider concern about the security of machine learning model distribution. The researchers have since responded by developing fixes, and the Picklescan tool has been updated to catch this kind of corrupted file and prevent similar incidents.
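Beyond platform-side scanners, users who load third-party checkpoints can limit their own exposure. One option (a general precaution, not one of the fixes described in the report; the file name below is a placeholder) is to ask recent versions of PyTorch to restrict unpickling to plain tensor data:

```python
import torch

# weights_only=True limits deserialization to tensors and a small allow-list of
# safe types, so a __reduce__-style payload embedded in the checkpoint cannot
# call arbitrary functions during loading.
state_dict = torch.load("model.bin", map_location="cpu", weights_only=True)
```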
The incident is a reminder not to downplay cybersecurity, especially against the backdrop of rapid advances in AI and machine learning, where protecting users and platforms is becoming increasingly critical.