"Kaspersky Lab experts studied open data and internal sources to find out how and why people use indirect prompt injection — a cyber risk that many systems based on large language models (LLM) are exposed to. We are talking about text descriptions of tasks that chatbots must perform. People can place special phrases — injections — on their websites and in documents published online so that neural networks give other users a response that takes into account the goals of the interested parties," the company said.
LLM-based solutions are used not only in chatbots, but also in search engines, whom AI helps to summarize the results for a user's request.
As Kaspersky Lab experts found out, there are several areas in which users use such tricks. For example, "injections" are used to promote a resume among other profiles when searching for a job — the applicant writes instructions to the AI with a request to respond as positively as possible to the candidate, upgrade the resume to the next stage, or give it a higher priority. The instructions are invisible to the recruiter, because they usually merge with the page's background. However, neural networks that analyze resumes read these phrases.
Similar injections are used for advertising purposes: they are posted on websites of various goods and services. The instructions are aimed at search chat bots — they are asked to give a more positive assessment of a specific product in responses to queries. Some users post instructions for neural networks to protest the widespread use of AI. For example, one Brazilian artist asked neural networks not to read, use, store, process, adapt, or replicate certain content published on his website.
"Today, the most important thing is to assess the potential risks of such cyberattacks. The creators of basic models (for example, GPT-4) use a variety of techniques to significantly increase the complexity of injections — from special training (as in the case of the latest model from OpenAI) to the creation of special models that can detect such attacks in advance (for example, from Google)," the head of Kaspersky Lab's machine learning technology R&D group, Vladislav Tushkanov, said.
He also noted that the cases of using "injections" detected by Kaspersky did not have malicious intent. At the moment, cyberthreats such as phishing or data theft using "injections" are theoretical.
"However, cyberattackers are also showing an active interest in neural networks. To protect existing and future solutions based on large language models, it is necessary to assess the risks and study all possible methods for bypassing restrictions," Tushkanov added.