US Army researchers develop human-aided training algorithm for robots

Robert Scammell 5 February 2018 (Last Updated February 5th, 2018 12:45)

Researchers at the US Army Research Laboratory (ARL) and the University of Texas at Austin (UT) have developed new techniques for robots or computer programs to learn tasks by interacting with a human instructor.

Good dog – A human trainer provides the robot agent with a critique, much in the same way a dog might be taught a trick. Credit: Kyle Olson

The study, which will be presented and published at the Association for the Advancement of Artificial Intelligence Conference, sees a human providing real-time feedback in the form of a critique to an agent—a robot or computer.

The concept was first introduced as Training an Agent Manually via Evaluative Reinforcement (TAMER) by Dr. Peter Stone, a professor at UT, along with his former doctoral student Brad Knox.

The ARL and UT researchers drew upon these foundations to develop a new algorithm called Deep TAMER, which uses deep learning (a class of machine learning algorithms loosely inspired by the brain) to learn tasks by viewing videos with a human trainer.

The human trainer then provides the robot agent with a critique, such as ‘good job’ or ‘bad job’, much in the same way a dog might be taught a trick.
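
The feedback loop described above can be illustrated with a toy sketch. This is not the researchers' code: it assumes a hypothetical five-position corridor, a scripted stand-in for the human trainer, and a simple lookup table where Deep TAMER would use a deep neural network. All names in it are illustrative. The agent keeps an estimate of the critique each action would earn and always picks the action it expects the trainer to rate highest.

```python
# Toy sketch of learning from "good job" / "bad job" critiques.
# Assumptions (not from the article): a 5-position corridor, a scripted
# trainer, and a lookup table in place of Deep TAMER's neural network.

N_STATES = 5        # hypothetical corridor: positions 0..4, goal at 4
ACTIONS = (-1, +1)  # step left or step right


def simulated_trainer(state, action):
    """Stand-in for the human: +1.0 ('good job') for a step toward the goal, -1.0 otherwise."""
    return 1.0 if action == +1 else -1.0


def best_action_index(estimates):
    """Index of the action with the highest predicted critique (first one wins ties)."""
    return max(range(len(ACTIONS)), key=lambda i: estimates[i])


def train(episodes=3, lr=0.5, max_steps=50):
    # h_hat[state][i] is the agent's running estimate of the trainer's
    # critique for taking ACTIONS[i] in that state.
    h_hat = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if state == N_STATES - 1:  # goal reached, episode over
                break
            i = best_action_index(h_hat[state])
            feedback = simulated_trainer(state, ACTIONS[i])
            # Nudge the estimate toward the critique just received.
            h_hat[state][i] += lr * (feedback - h_hat[state][i])
            # Move, staying inside the corridor.
            state = min(max(state + ACTIONS[i], 0), N_STATES - 1)
    return h_hat


def greedy_policy(h_hat):
    """Action the agent would choose in each state after training."""
    return [ACTIONS[best_action_index(row)] for row in h_hat]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch a single ‘bad job’ is enough to steer the agent away from stepping left in a given position, which mirrors the article's point that human critiques can cut down on trial and error. A real trainer's feedback would be delayed and less consistent than the scripted one assumed here.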

Currently, many artificially intelligent robots are required to interact with their environment for extended periods of time to learn how to optimally perform a task. Mistakes made during this process can prove costly, such as an agent falling off a cliff.

Feedback from humans can help avoid these potential errors as well as speed up the learning process, according to Army researcher Dr. Garrett Warnell.

“The army of the future will consist of soldiers and autonomous teammates working side-by-side,” said Warnell. “While both humans and autonomous agents can be trained in advance, the team will inevitably be asked to perform tasks, for example, search and rescue or surveillance, in new environments they have not seen before.

“In these situations, humans are remarkably good at generalising their training, but current artificially-intelligent agents are not.”

Researchers demonstrated Deep TAMER’s success in the Atari game Bowling. With 15 minutes of human-provided feedback, an agent was able to perform better than its human trainer, a task that has proven difficult for even state-of-the-art methods in artificial intelligence.

The researchers envision Deep TAMER as the first step in a line of research that will see more successful human-autonomy teams in the army, with the ultimate goal of autonomous agents that can quickly and safely learn from their human teammates in a wide range of environments.