The Pentagon Is Bolstering Its AI Systems—by Hacking Itself

A new “red team” will try to anticipate and thwart attacks on machine learning programs.

THE PENTAGON SEES artificial intelligence as a way to outfox, outmaneuver, and dominate future adversaries. But the brittle nature of AI means that without due care, the technology could hand enemies a new way to attack.

The Joint Artificial Intelligence Center, created by the Pentagon to help the US military make use of AI, recently formed a unit to collect, vet, and distribute open source and industry machine learning models to groups across the Department of Defense. Part of that effort points to a key challenge with using AI for military ends. A machine learning “red team,” known as the Test and Evaluation Group, will probe pretrained models for weaknesses. Another cybersecurity team examines AI code and data for hidden vulnerabilities.

Machine learning, the technique behind modern AI, represents a fundamentally different, often more powerful, way to write computer code. Instead of writing rules for a machine to follow, machine learning generates its own rules by learning from data. The trouble is, this learning process, along with artifacts or errors in the training data, can cause AI models to behave in strange or unpredictable ways.
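To make the contrast concrete, here is a minimal sketch in Python. The vehicle lengths, labels, and function names are invented for illustration and do not come from any Pentagon system. The first function is a rule a programmer wrote by hand; the second derives its rule from labeled examples, so whatever is in the data determines the rule.

```python
# Hand-written rule: a programmer chooses the cutoff (6.0 m is an arbitrary guess).
def handwritten_rule(length_m):
    return "truck" if length_m > 6.0 else "car"

# Learned rule: try each observed length as a cutoff and keep the one
# that makes the fewest mistakes on the labeled examples.
def learn_threshold(examples):
    best_cutoff, best_errors = None, len(examples) + 1
    for cutoff in sorted(length for length, _ in examples):
        errors = sum(
            (length > cutoff) != (label == "truck") for length, label in examples
        )
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff

# Hypothetical training data: (length in meters, label).
examples = [
    (4.2, "car"), (4.5, "car"), (4.8, "car"),
    (7.5, "truck"), (8.0, "truck"), (9.1, "truck"),
]

cutoff = learn_threshold(examples)  # 4.8 for this data

def learned_rule(length_m):
    return "truck" if length_m > cutoff else "car"
```

Nobody typed the number 4.8 into the program; it fell out of the examples. A different or flawed set of examples would produce a different rule, which is the property the rest of this article is concerned with.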

“For some applications, machine learning software is just a bajillion times better than traditional software,” says Gregory Allen, director of strategy and policy at the JAIC. But, he adds, machine learning “also breaks in different ways than traditional software.”

A machine learning algorithm trained to recognize certain vehicles in satellite images, for example, might also learn to associate the vehicle with a certain color of the surrounding scenery. An adversary could potentially fool the AI by changing the scenery around its vehicles. With access to the training data, the adversary also might be able to plant images, such as a particular symbol, that would confuse the algorithm.
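A toy version of the scenery problem might look like the following sketch. The feature names and the six training rows are made up, and the "model" is deliberately the simplest possible learner: it picks whichever single feature best matches the labels. Real image classifiers are far more complex, but they can latch onto a background cue in a similar way.

```python
def learn_best_feature(rows):
    """Return the one feature that agrees with the label most often
    on the training rows (a one-feature decision rule)."""
    feature_names = [name for name in rows[0] if name != "is_vehicle"]

    def accuracy(name):
        return sum(row[name] == row["is_vehicle"] for row in rows) / len(rows)

    return max(feature_names, key=accuracy)

# Hypothetical training set: every vehicle happened to be photographed on sand.
training = [
    {"boxy_shape": 1, "sandy_background": 1, "is_vehicle": 1},
    {"boxy_shape": 1, "sandy_background": 1, "is_vehicle": 1},
    {"boxy_shape": 0, "sandy_background": 1, "is_vehicle": 1},  # vehicle partly hidden
    {"boxy_shape": 0, "sandy_background": 0, "is_vehicle": 0},
    {"boxy_shape": 1, "sandy_background": 0, "is_vehicle": 0},  # a boxy building
    {"boxy_shape": 0, "sandy_background": 0, "is_vehicle": 0},
]

chosen = learn_best_feature(training)  # "sandy_background"

# The adversary parks the same vehicle on grass instead of sand.
moved_vehicle = {"boxy_shape": 1, "sandy_background": 0}
prediction = moved_vehicle[chosen]  # 0, so the vehicle is missed
```

The background feature matches all six training labels while the shape feature matches only four, so the learner keys on the scenery. Changing the scenery is then enough to hide the vehicle.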

Allen says the Pentagon follows strict rules concerning the reliability and security of the software it uses. He says the approach can be extended to AI and machine learning, and notes that the JAIC is working to update the DoD’s software standards to cover issues specific to machine learning.

“We don’t know how to make systems that are perfectly resistant to adversarial attacks.”
Tom Goldstein, associate professor of computer science, University of Maryland

AI is transforming the way some businesses operate because it can be an efficient and powerful way to automate tasks and processes. Instead of writing an algorithm to predict which products a customer will buy, for instance, a company can have an AI algorithm look at thousands or millions of previous sales and devise its own model for predicting who will buy what.

The US and other militaries see similar advantages, and are rushing to use AI to improve logistics, intelligence gathering, mission planning, and weapons technology. China’s growing technological capability has stoked a sense of urgency within the Pentagon about adopting AI. Allen says the DoD is moving “in a responsible way that prioritizes safety and reliability.”

Researchers are developing ever-more creative ways to hack, subvert, or break AI systems in the wild. In October 2020, researchers in Israel showed how carefully tweaked images can confuse the AI algorithms that let a Tesla interpret the road ahead. This kind of “adversarial attack” involves tweaking the input to a machine learning algorithm to find small changes that cause big errors.
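The idea of "small changes that cause big errors" can be sketched with a toy linear classifier in Python. The weights, the input, and the step size below are invented for illustration. This is not the method used in the Tesla research, only a simplified instance of the general approach of nudging each input value in the direction that most hurts the model.

```python
# Toy linear classifier: score = w . x + b, predict class 1 if score > 0.
# These weights are made up; they do not come from any real system.
WEIGHTS = [0.9, -0.6, 0.4, -0.8]
BIAS = 0.05

def score(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS

def predict(x):
    return 1 if score(x) > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def adversarial_example(x, epsilon):
    """Shift every input value by at most epsilon in the direction that
    pushes the score toward the opposite class. For a linear model the
    gradient of the score with respect to the input is just WEIGHTS."""
    direction = -1 if predict(x) == 1 else 1
    return [xi + direction * epsilon * sign(w) for xi, w in zip(x, WEIGHTS)]

original = [0.5, 0.4, 0.3, 0.3]
perturbed = adversarial_example(original, epsilon=0.1)

print(predict(original))   # 1
print(predict(perturbed))  # 0
```

No single input value moves by more than 0.1, yet the prediction flips, because the small shifts all push the score the same way. The attacker here needs to know the weights, which echoes the point that access to a model can make an attack easier to devise.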

Dawn Song, a professor at UC Berkeley who has conducted similar experiments on Tesla’s sensors and other AI systems, says attacks on machine learning algorithms are already an issue in areas such as fraud detection. Some companies offer tools to test the AI systems used in finance. “Naturally there is an attacker who wants to evade the system,” she says. “I think we’ll see more of these types of issues.”

A simple example of a machine learning attack involved Tay, Microsoft’s scandalous chatbot-gone-wrong, which debuted in 2016. The bot used an algorithm that learned how to respond to new queries by examining previous conversations; Redditors quickly realized they could exploit this to get Tay to spew hateful messages.

Tom Goldstein, an associate professor at the University of Maryland who studies the brittleness of machine learning algorithms, says there are many ways to attack AI systems, including modifying the data an algorithm is fed in order to make it behave in a particular way. He says machine learning models differ from conventional software because gaining access to a model can allow an adversary to devise an attack, such as a misleading input, that cannot be defended against.

Via Wired.com