Advanced AI Models Being Exploited for Malware Development and Model Theft

Cybersecurity researchers have found that large language models (LLMs) can be leveraged to generate variants of malicious JavaScript code at scale, producing rewrites that are significantly better at slipping past detection mechanisms.

“While LLMs are not inherently designed to generate malware, they can be used to modify or obfuscate existing malicious code, making it more difficult to identify,” explained analysts from Palo Alto Networks’ Unit 42 in their recent findings. “These models enable criminals to create more natural-looking transformations, complicating detection efforts.”

This approach allows repeated transformations of malicious code, which can ultimately degrade the accuracy of malware classification systems. Over time, such changes can trick these systems into misclassifying harmful code as harmless.

Despite LLM providers implementing strict security measures to prevent misuse, cybercriminals have promoted tools like WormGPT, which automate the creation of phishing emails tailored to specific victims and assist in generating unique malware.

In October 2024, OpenAI reported that it had blocked more than 20 campaigns and deceptive networks attempting to misuse its platform for tasks like reconnaissance, vulnerability exploitation, scripting, and debugging.

Unit 42 demonstrated the technique by using LLMs to iteratively rework malware samples until they evaded detection by machine learning (ML) classifiers such as Innocent Until Proven Guilty (IUPG) and PhishingJS. The method produced 10,000 distinct JavaScript variants while preserving the original functionality of the code.
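At a high level, this kind of workflow can be thought of as a rewrite-and-check cycle: an LLM transforms the script, a sandbox confirms the behavior is unchanged, and a local classifier measures whether the new variant scores lower. The sketch below is a simplified reconstruction rather than Unit 42's actual tooling; `llm_rewrite`, `behaves_identically`, and `classify` are hypothetical placeholders.

```python
# Conceptual sketch of an iterative rewrite-and-check loop.
# llm_rewrite(), behaves_identically(), and classify() are hypothetical
# placeholders, not real APIs.

def generate_evasive_variants(seed_js: str, rounds: int = 10) -> list[str]:
    """Repeatedly rewrite a script, keeping only variants that preserve
    behavior and score lower on a local maliciousness classifier."""
    variants = []
    current = seed_js
    for _ in range(rounds):
        candidate = llm_rewrite(
            current,
            instructions="Rename variables, split strings, add dead code, "
                         "and restructure control flow without changing behavior.",
        )
        if not behaves_identically(seed_js, candidate):  # dynamic sandbox check
            continue
        if classify(candidate) < classify(current):      # lower score = less suspicious
            current = candidate
            variants.append(candidate)
    return variants
```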

Obfuscation Techniques in Play

The adversarial ML strategy relies on techniques such as renaming variables, splitting strings, adding junk code, removing unnecessary whitespace, and re-implementing code. Each time the malware is processed, it results in a new, less-detectable version of the original script.
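To make these transformations concrete, the toy Python sketch below applies a few of them (identifier renaming, string splitting, junk insertion, whitespace stripping) to JavaScript source treated as plain text. It is an illustrative simplification, not the tooling described in the research; realistic rewrites would operate on an abstract syntax tree rather than regular expressions.

```python
import random
import re
import string

def rename_variables(js: str, names: list[str]) -> str:
    """Replace known identifiers with random names (toy whole-word regex rename)."""
    for name in names:
        new = "".join(random.choices(string.ascii_lowercase, k=8))
        js = re.sub(rf"\b{re.escape(name)}\b", new, js)
    return js

def split_string_literal(literal: str) -> str:
    """Turn a string like 'payload' into a '"pay" + "load"' concatenation."""
    if len(literal) < 4:
        return f'"{literal}"'
    cut = random.randint(1, len(literal) - 1)
    return f'"{literal[:cut]}" + "{literal[cut:]}"'

def insert_junk(js: str) -> str:
    """Append dead code that never affects the script's behavior."""
    return js + f"\nvar _{random.randint(0, 9999)} = Math.random() * 0;"

def strip_whitespace(js: str) -> str:
    """Collapse runs of whitespace (a crude minifier stand-in)."""
    return re.sub(r"\s+", " ", js).strip()

# Example: chain a few transformations over a tiny (harmless) snippet.
sample = 'var payload = "data"; function run() { console.log(payload); }'
obfuscated = strip_whitespace(insert_junk(rename_variables(sample, ["payload", "run"])))
```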

“The final output is a modified JavaScript file that retains the behavior of the original malware but often scores much lower on maliciousness scales,” Unit 42 noted. They reported that their method could alter the verdict of their malware classifier from malicious to benign 88% of the time.

Even more concerning, these rewritten JavaScript variants were found to evade detection by malware scanning platforms like VirusTotal. Moreover, LLM-based obfuscation produces transformations that appear more natural compared to those generated by traditional tools like obfuscator.io, which are easier to identify due to their distinct patterns of code alteration.

“Generative AI has the potential to significantly expand the scale of malicious code variants,” the researchers added. “That said, we can also use the same AI-driven techniques to improve ML models by generating diverse training data.”
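On the defensive side, the same generated variants can be folded back into a classifier's training set so the model learns features that survive obfuscation. A minimal sketch of that idea, assuming scikit-learn and illustrative dataset names:

```python
# Sketch of data augmentation for a JavaScript malware classifier. The dataset
# names (benign, malicious, llm_variants) and feature choices are illustrative
# assumptions, not the researchers' actual pipeline.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def retrain_with_variants(benign: list[str], malicious: list[str],
                          llm_variants: list[str]):
    """Fit a classifier on character n-grams of original and rewritten samples."""
    scripts = benign + malicious + llm_variants
    labels = [0] * len(benign) + [1] * (len(malicious) + len(llm_variants))

    vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(3, 5),
                                 max_features=20000)
    features = vectorizer.fit_transform(scripts)

    model = LogisticRegression(max_iter=1000)
    model.fit(features, labels)
    return vectorizer, model
```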

Model Theft via Side-Channel Attacks

Meanwhile, researchers from North Carolina State University have unveiled a side-channel attack, TPUXtract, capable of extracting critical details about machine learning models running on Google Edge Tensor Processing Units (TPUs) with a success rate of 99.91%. This method could be exploited to steal intellectual property or conduct further cyberattacks.

The study demonstrated a hyperparameter-stealing attack that reveals intricate details of a neural network, including layer types, node counts, kernel sizes, and activation functions. The researchers emphasized that this is the first comprehensive attack of its kind to extract previously unknown model architectures.

The attack leverages electromagnetic signals emitted by TPUs during neural network inferences to infer model hyperparameters. However, this technique requires the attacker to have physical access to the target device and access to expensive equipment for signal capture and analysis.
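Conceptually, the recovery step resembles template matching: the attacker first profiles EM signatures for candidate layer configurations on hardware they control, then matches each segment of the victim trace against those templates. The toy sketch below only conveys that general idea and greatly simplifies the published technique; the trace arrays and template library are assumed inputs from the attacker's own profiling setup.

```python
# Toy illustration of matching an observed EM trace segment against signatures
# recorded for candidate layer configurations. This drastically simplifies
# TPUXtract; traces and templates are assumed NumPy arrays.

import numpy as np

def best_matching_layer(trace_segment: np.ndarray,
                        templates: dict[str, np.ndarray]) -> str:
    """Return the candidate hyperparameter set whose template correlates most
    strongly with the observed segment."""
    best_name, best_score = None, -np.inf
    for name, template in templates.items():
        n = min(len(trace_segment), len(template))
        score = np.corrcoef(trace_segment[:n], template[:n])[0, 1]
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Example candidate library: keys encode hypothesized layer hyperparameters,
# e.g. {"conv2d_k3_f32_relu": np.load("conv_k3_f32.npy"), ...}
```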

“By reconstructing the architecture and layer configurations, we managed to replicate the AI’s functionality or produce a highly accurate surrogate,” explained Aydin Aysu, a co-author of the study.

As LLMs and advanced AI systems evolve, their dual-use nature continues to pose a challenge, necessitating robust measures to ensure their secure and ethical use.