The Dangers of AI: OWASP Releases Top 10 Vulnerabilities for LLM Applications

2023-06-23 Penta Security Blog

What are LLM applications?

LLM stands for large language model, a computerized language model that contains an artificial neural network with many parameters, enabling it to study large quantities of unlabeled text and recognize patterns at the word level, all in the form of self-supervised or semi-supervised learning. After learning from millions of lines of text, an LLM gradually gains the capability of understanding natural language and how it is used in context, and the capability of generating language in a creative manner by itself.

Since its emergence in 2018, LLMs have become the key to the development of natural language processing, which is crucial for human-AI interactions. Since then, the conventional approach of training fully supervised models for specific tasks has been replaced by running self-supervised LLMs for general purposes. LLMs are now the primary engine that powers the latest chatbots and generative AI services like ChatGPT and Google’s Bard.

Security risks of LLM applications

Compared to AI models that are based on supervised learning, LLMs have a much more autonomous learning process, resulting in less controllable behaviours. This has raised a lot of concerns with regard to security and safety.

Based on these concerns, the OWASP Foundation has recently released the first version of its OWASP Top 10 vulnerabilities for LLM applications. Similar to the OWASP Top 10 for web application vulnerabilities, this list details potential vulnerabilities in LLMs that are most likely to be exploited, aimed at educating developers, system architects, and organizations on the potential risks of deploying these models.

Here is a list of the top ten vulnerabilities:

LLM01:2023 – Prompt Injections

Although LLMs are built with a set of restrictions and filters to prevent users from manipulating their behaviours, a threat actor could exploit coding flaws and use highly targeted malicious prompts to bypass these filters and manipulate the LLM’s actions. They could also make the model ignore previous instructions, leading to the spread of misleading information to users. To minimize the consequences of prompt injection attacks, users should always fact-check AI-generated results before informing others or making important decisions.

LLM02:2023 – Data Leakage

Data leakage is another major risk of LLM. Although LLMs are designed to filter out sensitive data and confidential information in their responses, incomplete filtering could result in the accidental exposure of sensitive information, confidential details, or proprietary materials in their responses to the user. Several related incidents have occurred over the past year. For instance, in April this year, Samsung Electronics found out that multiple employees used ChatGPT to fix bugs in the company’s proprietary source code, only to be discovered when the code ended up in ChatGPT’s response to another user. This forced Samsung to temporarily ban the usage of generative AI services on all company devices until a complete set of security measures is in place. Besides the leakage of confidential data, plagiarism and copyright infringement are other problems that may arise. To protect sensitive data from unwanted exposure, users of generative AI services should never give out their personal information or confidential data to the application.

LLM03:2023 – Inadequate Sandboxing

LLMs are given a tremendous amount of data for training. However, considering the autonomous nature of LLMs, measures must be taken to limit the LLMs from utilizing the given data to explore their path into confidential and external resources. When these isolation measures are insufficient, a threat actor could use malicious prompts to have the LLM interact with external systems and processes and trick it into revealing sensitive information.

LLM04:2023 – Unauthorized Code Execution

Whereas inadequate sandboxing refers to a lack of isolation between the LLM and confidential resources, unauthorized code execution is when there is a lack of restriction on malicious user inputs. A threat actor could send malicious commands through natural language prompts, triggering unwanted interactions with confidential systems and resources.

LLM05:2023 – SSRF Vulnerabilities

Server-side request forgery (SSRF) refers to vulnerabilities that could enable a threat actor to interact with or gain unauthorized access to the LLM’s internal resources such as APIs and databases. SSRF vulnerabilities result in attacks directly against the internal servers of the LLM, and are usually caused by a lack of resource restrictions or network misconfigurations. Consequently, attackers could interact with the internal servers to initiate unauthorized prompts or extract internal resources.

LLM06:2023 – Overreliance on LLM-Generated Content

As many are already aware, excessive reliance on LLM-generated content can be dangerous to humanity as a whole. Some signs of overreliance include accepting LLM-generated results as facts without verifying their credibility, solely relying on LLM for important decisions, or assuming that LLMs are free of bias. Short-term consequences may include a faster spread of false information, while in the long term, humans could lose their critical thinking capabilities and allow themselves to be freely influenced by AI.

LLM07:2023 – Inadequate AI Alignment

Again, since LLMs are capable of self-supervised learning, it is important to strictly align the goals and objectives of a model to ensure that it is trained for the purpose it is intended to serve. Without adequate alignment, the LLM could deviate from its purpose and result in unintended behaviours. For example, a poorly aligned LLM that is intended to defend a system may end up using the trained “knowledge” to harm the system.

LLM08:2023 – Insufficient Access Controls

Role-based access control is crucial for corporate systems so that users are only granted access to the resources they need based on their roles. The same is true for LLM applications. An LLM should only generate content that is appropriate for the user’s role. Failure to implement role-based access control could result in the user gaining access to confidential or inappropriate resources.

LLM09:2023 – Improper Error Handling

As is the case with any application, error messages and debugging information must be kept secure from unauthorized users. When such information is exposed, threat actors could exploit the error messages to potentially gain sensitive information on the servers and databases to explore new attack vectors.

LLM10:2023 – Training Data Poisoning

This vulnerability begins at the initial stage of LLM training, where a threat actor poisons the training data to undermine the LLM’s effectiveness, security, or in the worst case, ethicality. More advanced attackers could also manipulate the training data to introduce backdoors and vulnerabilities to be exploited at a later stage.

Deploy and manage LLMs responsibility

Overall, organizations deploying LLMs must be constantly aware of these potential risks and vulnerabilities and apply adequate security measures and contingency plans to prevent their LLMs from being exploited or behaving out of control.

Contact Penta Security for more information on web application and API protection.

For more information on security implementation, check out Penta Security’s product lines:

Web Application Firewall: WAPPLES

Database Encryption: D’Amo

Identity and Access Management: iSIGN+

Automotive, Energy, Industrial, and Urban Solutions: Penta IoT Security

Tags:AI cyberattacks