Adversarial Artificial Intelligence

The Basics of Adversarial AI

Adversarial AI uses algorithmic and mathematical approaches to degrade, deny, deceive, and/or manipulate AI systems. As governments continue to operationalize AI across mission sets, often to automate processes and decision making, they must implement defenses that impede adversarial attacks.

In broad terms, adversaries use three primary types of attacks to corrupt AI systems' training, manipulate their predictions, and exfiltrate their private information:


Poisoning

Adversaries pollute training data such that the model learns a decision rule that furthers the attackers' goals. Altering only a very small fraction of the training data is often enough, which makes poisoning a growing threat given the increasing popularity of foundation models pre-trained on data scraped from the web.

Poisoning occurs before model training.
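To make the mechanics concrete, the sketch below shows a minimal label-flipping poisoning attack against a simple classifier. The dataset, model, and 2 percent poisoning budget are illustrative assumptions, not a description of any fielded system.

    # Minimal sketch of a label-flipping poisoning attack (illustrative data and model).
    # An attacker who controls a small slice of the training set flips the labels of the
    # points the clean model is least sure about, which shifts the learned decision rule.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    clean_model = LogisticRegression(max_iter=1000).fit(X, y)

    # Poison only 2% of the labels, targeting the lowest-confidence points.
    poison_budget = int(0.02 * len(y))
    confidence = np.abs(clean_model.decision_function(X))
    targets = np.argsort(confidence)[:poison_budget]

    y_poisoned = y.copy()
    y_poisoned[targets] = 1 - y_poisoned[targets]
    poisoned_model = LogisticRegression(max_iter=1000).fit(X, y_poisoned)

    print("clean accuracy:   ", clean_model.score(X, y))
    print("poisoned accuracy:", poisoned_model.score(X, y))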


Evasion

Adversaries engineer inputs with manipulations that cause the model to make unintended predictions. If not caught, these errors can lead to dangerous behavior in downstream systems. Evasion attacks are often cheap to mount; for example, inexpensive adversarial stickers or patches can fool a state-of-the-art computer vision model.

Evasion occurs during model inference.
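One common evasion technique is the fast gradient sign method (FGSM), sketched below. The untrained classifier, random input, and perturbation budget are placeholders rather than any fielded system.

    # Minimal sketch of an evasion attack using the fast gradient sign method (FGSM).
    # The stand-in classifier and random "image" are placeholders; a real attack would
    # target a trained model and tune the perturbation budget epsilon.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
    model.eval()

    x = torch.rand(1, 1, 28, 28)   # stand-in input image with pixels in [0, 1]
    y = torch.tensor([3])          # true label the attacker wants to escape
    epsilon = 0.05                 # maximum per-pixel perturbation

    x.requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()

    # Nudge each pixel in the direction that increases the loss, then clamp to a valid image.
    x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

    print("clean prediction:      ", model(x).argmax(dim=1).item())
    print("adversarial prediction:", model(x_adv).argmax(dim=1).item())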


Inversion

Adversaries exfiltrate private or revealing information concerning the AI model and its training data. This can be part of a reconnaissance effort for an adversary planning a future attack or a direct attempt to seize sensitive information.

Inversion occurs after model inference.
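A simple form of inversion is a membership inference attack, which uses a model's confidence to guess whether a record was part of its training set. The sketch below is a generic illustration; the dataset, model, and fixed confidence threshold are assumptions, and a real attack would calibrate the threshold, for example with shadow models.

    # Minimal sketch of a membership inference attack, one simple form of inversion.
    # Models are often more confident on examples they were trained on, so thresholding
    # the top predicted-class probability gives a crude guess at training-set membership.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=4000, n_features=30, random_state=0)
    X_train, X_out, y_train, y_out = train_test_split(X, y, test_size=0.5, random_state=0)
    model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

    def top_confidence(samples):
        # The attacker's signal: the model's highest predicted-class probability.
        return model.predict_proba(samples).max(axis=1)

    threshold = 0.9  # illustrative; a real attack would calibrate this
    tpr = (top_confidence(X_train) > threshold).mean()  # members correctly flagged
    fpr = (top_confidence(X_out) > threshold).mean()    # non-members wrongly flagged
    print(f"attack true positive rate: {tpr:.2f}, false positive rate: {fpr:.2f}")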

As agencies seek to employ methods that limit or eliminate attacks, it's important to recognize that adversarial AI threats are highly asymmetric:

  • An adversary can reap rewards with a single successful attack, while defenders must implement controls that are resilient to all attacks
  • Defenders often need roughly 100 times the compute power of an attacker

Adversarial AI Services from Booz Allen

Booz Allen supports federal organizations with a set of service offerings that mitigate the risks of operationalizing AI systems.


As the single largest provider of AI services to the federal government, Booz Allen works closely with implementers, researchers, and leaders across the government to build, deploy, and field machine learning algorithms that deliver mission advantage.

Case Studies

Static Malware Detection & Subterfuge: Quantifying the Robustness of ML and Current Anti-Virus Systems

Challenge: Understand the vulnerability of commercial and open-source machine learning malware detection models to targeted byte injections by an adversary with only black-box query access.

Solution: An efficient binary search that identified 2048-byte windows whose alteration reliably changes the detection model's output label from “malicious” to “benign” (sketched below).

Result: A strong black-box attack and analysis method capable of probing vulnerabilities of malware detection systems, resulting in important insights toward robust feature selection for defenders and model developers.
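The search idea can be sketched as follows. The detector callable, the use of random bytes as the alteration, and the stopping size are simplified assumptions for illustration, not the implementation used in the study.

    # Simplified sketch of a black-box binary search for a byte window whose alteration
    # flips a malware detector's label from "malicious" (True) to "benign" (False).
    import os

    WINDOW = 2048  # stop refining once a window is this small

    def altered(data: bytes, start: int, end: int) -> bytes:
        # Replace data[start:end] with random bytes of the same length.
        return data[:start] + os.urandom(end - start) + data[end:]

    def find_window(data: bytes, detector, start: int = 0, end: int = None):
        # Return (start, end) of a window whose alteration makes the detector report
        # benign, or None if altering this span does not flip the label.
        end = len(data) if end is None else end
        if detector(altered(data, start, end)):
            return None
        if end - start <= WINDOW:
            return (start, end)
        mid = (start + end) // 2
        # Prefer whichever half flips the label on its own; fall back to the full span.
        return (find_window(data, detector, start, mid)
                or find_window(data, detector, mid, end)
                or (start, end))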

A General Framework for Auditing Differentially Private Machine Learning

Challenge: More accurately audit the privacy of machine learning systems while significantly reducing computational burden.

Solution: Novel attacks to efficiently reveal maximum possible information leakage and estimate privacy with higher statistical power and smaller sample sizes than previous state-of-the-art Monte Carlo sampling methods.

Result: A set of tools for creating dataset perturbations and performing hypothesis tests that allow developers of general machine learning systems to efficiently audit the privacy guarantees of their system. 
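Audits of this kind typically convert an attack's measured error rates into an empirical lower bound on the privacy parameter epsilon. The sketch below shows that conversion with one-sided Clopper-Pearson confidence bounds; the attack counts are made up, and this is a generic illustration rather than the framework from the paper.

    # Generic sketch: turn a distinguishing attack's false positive and false negative
    # counts into an empirical lower bound on the differential privacy parameter epsilon.
    # Any (epsilon, delta)-DP mechanism forces alpha + exp(epsilon) * beta >= 1 - delta,
    # where alpha and beta are the attack's false positive and false negative rates.
    import math
    from scipy.stats import beta as beta_dist

    def clopper_pearson_upper(failures, trials, confidence=0.95):
        # One-sided upper confidence bound on a binomial rate.
        if failures >= trials:
            return 1.0
        return beta_dist.ppf(confidence, failures + 1, trials - failures)

    def empirical_epsilon_lower_bound(fp, fn, trials, delta=1e-5, confidence=0.95):
        alpha = clopper_pearson_upper(fp, trials, confidence)
        beta = clopper_pearson_upper(fn, trials, confidence)
        bounds = [math.log((1 - delta - a) / b)
                  for a, b in ((alpha, beta), (beta, alpha))
                  if b > 0 and 1 - delta - a > 0]
        return max(bounds, default=0.0)

    # Hypothetical audit: 1,000 trials per world, 120 false positives, 150 false negatives.
    print(empirical_epsilon_lower_bound(fp=120, fn=150, trials=1000))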

Adversarial Transfer Attacks With Unknown Data and Class Overlap

Challenge: Quantify the risk associated with adversarial evasion attacks under a highly realistic threat model, which assumes adversaries have access to varying fractions of the model training data.

Solution: A comprehensive set of model training and testing experiments (e.g., more than 400 experiments on Mini-ImageNet data) under differing mixtures of “private” and “public” data, as well as a novel attack that accounts for data class disparities by randomly dropping classes and averaging adversarial perturbations.

Result: Important and counterintuitive evidence that adversarial training can increase total risk under threat models in which adversaries have gray-box access to training data.
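The perturbation-averaging idea can be sketched as follows. Each surrogate stands in for a model trained on a different, randomly class-dropped slice of the attacker's data; the models, input, and budget are placeholders, not the paper's experimental setup.

    # Simplified sketch of averaging adversarial perturbations across surrogate models
    # so the attack does not overfit to any single surrogate's decision boundary.
    import torch
    import torch.nn as nn

    def fgsm_perturbation(model, x, y, epsilon):
        # Signed-gradient perturbation of size epsilon computed on one surrogate.
        x = x.clone().requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        return epsilon * x.grad.sign()

    # Stand-ins for surrogates trained on different class-dropped subsets of public data.
    surrogates = [nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10)) for _ in range(5)]
    x = torch.rand(1, 3, 32, 32)   # stand-in image with pixels in [0, 1]
    y = torch.tensor([2])          # label shared by the surrogates
    epsilon = 8 / 255

    # Average the per-surrogate perturbations, then transfer the result to the victim model.
    avg = torch.stack([fgsm_perturbation(m, x, y, epsilon) for m in surrogates]).mean(dim=0)
    x_adv = (x + avg).clamp(0, 1)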

Adversarial AI Research

Since 2018, Booz Allen has been a leader in advancing the state of the art in machine learning methodologies that safeguard systems against adversarial AI. Methods range from robustness to adversarial image perturbations for computer vision models, to differentially private training for tabular data, to behavior-preserving transformations of Microsoft Windows malware.

Research by Year


Partnerships


NVIDIA

NVIDIA is the premier provider of processors optimized for AI and deep learning tasks. Booz Allen teams with NVIDIA to support high-performance compute needs, such as those used in our research developing techniques that defend against adversarial samples.

Meet Our Experts

Contact Us

Contact Booz Allen to learn more about advanced strategies to safeguard trusted information from adversarial AI attacks.