Programming assistants based on artificial intelligence have become one of the biggest allies of the developer community. Tools like GitHub’s Copilot, or the controversial ChatGPT created by OpenAI, can help many professionals achieve cleaner code.
But just as it can be an invaluable ally, they can also be AI assistants. worst enemy From a trusted programmer.
Microsoft and researchers from the US universities of California and Virginia have designed a type of computer attack May Intoxication database Due to which these AI models work. As a result, these wizards suggest lines of code to developers that can be harmful, leave doors open or sabotage work.
The implications are clear. If AI models that recommend text or code to programmers include malicious code in their suggestions and developers don’t notice them, the result could be a myriad of tools, programs and applications with vulnerabilities that cybercriminals can exploit. can do.
this type of attack poison The AI systems database has been dubbed the Trojan Enigma, and although its design responds to an experiment in a controlled environment, it shows how these types of assistants are exposed to a threat that could do more harm.
the investigation is taken up by the media bleeping computer In this article, in which it is recalled that there were already previous analyzes that revealed the risks of having lines of malicious code in the databases of these AI assistants. Many of those wizards build their databases with public code repositories on platforms like GitHub. danger exists.
However, these studies also reported that static analysis tools predicted lines of malicious code to be detected, making this risk easily avoidable. However, research from Microsoft and the Universities of California and Virginia shows how static analysis can be circumvented.
The Trojan Enigma, the name given to this technique, hides payload Which can trigger damage in the code itself. You only need to specify a phrase or a word in the programming language to activate them. In this way, these malicious codes escape the ‘radar’ provided by static analysis tools.
Of course, this method does not make infallible the possibility of malicious code being included in the database of AIs that assist programmers, but it does increase the risks. To test their experiment, the researchers used about 6gb python code Compiled from around 18,310 repositories, which will be the database of its machine learning algorithms.
that was the database High on intoxication 160 malicious files for every 80,000 code files. After many tests and under very specific conditions, Trojan Enigma managed to suggest unsafe or malicious code 21% of the time.
While it is true that Trojan puzzles have been developed considering how to avoid most standardized detection systems, and how often AI code assistants with their databases High on intoxication The suggested malicious code was mitigated and, under very specific circumstances, the experiment proves that the vulnerability exists.
This is a remote risk because cybercriminal groups have had to greatly refine their practices and develop tools capable of mass poisoning databases of massive and often proprietary AI models, and cybercriminal mafias typically but more interested. fast money using simple techniques.
However, if one group is capable of such a feat, the results can be disastrous: programming assistants are an increasingly pervasive tool and can begin to open doors and vulnerabilities in a multitude of devices and services, affecting the entire supply chain. But large scale attacks can happen. of a business.