What is a Red Team, in the context of AI?

Today, Joe Biden released his Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. It is a necessary step in preparing a better future.

The text of the EO includes the recommendation that the results of “Red Team” exercises be shared with the public.

For the purpose of this, a definition of “AI red-teaming” is provided.

(d) The term “AI red-teaming” means a structured testing effort to find flaws and vulnerabilities in an AI system, often in a controlled environment and in collaboration with developers of AI. Artificial Intelligence red-teaming is most often performed by dedicated “red teams” that adopt adversarial methods to identify flaws and vulnerabilities, such as harmful or discriminatory outputs from an AI system, unforeseen or undesirable system behaviors, limitations, or potential risks associated with the misuse of the system.

It is important to consider that an AI system is much more than a trained model. Right now, you can use a website, to chat with an AI, through an algorithm that makes sure it doesn’t help you commit a crime. The AI will also correct people on their wrongthink, and it is not obvious where that comes from.

If we take for granted that an AI is much too powerful to be provided as is, to a generic user, then, the biggest challenge is the user’s ability to jailbreak the AI and make it provide assistance in “being gay and doing crimes”.

So there are three (3) facets to safe AI, and it still includes Web App Security!

Don’t let Web App Security become a blindspot of “Secure and Trustworthy AI”.

Published by Marie-Lynn