Researchers at the US Army Research Laboratory (ARL) and the University of Texas at Austin (UT) have developed new techniques for robots or computer programs to learn tasks by interacting with a human instructor.

The study, which will be presented and published at the Association for the Advancement of Artificial Intelligence (AAAI) Conference, sees a human providing real-time feedback in the form of a critique to an agent, either a robot or a computer program.

The concept was first introduced as Training an Agent Manually via Evaluative Reinforcement (TAMER) by Dr. Peter Stone, a professor at UT, along with his former doctoral student Brad Knox.

The ARL and UT researchers drew upon these foundations to develop a new algorithm called Deep TAMER, which uses deep learning, a class of machine learning algorithms loosely inspired by the brain, to learn tasks by viewing video streams alongside a human trainer.

The human trainer then provides the robot agent with a critique, such as ‘good job’ or ‘bad job’, in much the same way a dog might be taught a trick.
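The interaction loop behind this idea can be sketched in a few lines. The following Python snippet is a minimal, illustrative sketch of a TAMER-style feedback loop, not the researchers' actual implementation: Deep TAMER trains a deep neural network on raw video input, whereas this sketch substitutes a simple linear model, and the environment and human-feedback interfaces shown are hypothetical placeholders.

```python
# Minimal sketch of a TAMER-style feedback loop (illustrative only).
# Assumptions not taken from the article: a linear model stands in for
# Deep TAMER's deep network, and the state/feedback sources below are
# hypothetical placeholders rather than a real environment or trainer.
import numpy as np

N_FEATURES = 8      # size of the (hypothetical) state representation
N_ACTIONS = 4       # number of discrete actions available to the agent
LEARNING_RATE = 0.05

# One weight vector per action: predicts the critique ("reward") the
# human trainer would give for taking that action in the current state.
weights = np.zeros((N_ACTIONS, N_FEATURES))

def predict_feedback(state):
    """Predicted human feedback for every action in this state."""
    return weights @ state

def choose_action(state):
    """Act greedily with respect to predicted human feedback."""
    return int(np.argmax(predict_feedback(state)))

def update(state, action, human_feedback):
    """Nudge the prediction for (state, action) toward the trainer's
    critique, e.g. +1 for 'good job', -1 for 'bad job'."""
    error = human_feedback - predict_feedback(state)[action]
    weights[action] += LEARNING_RATE * error * state

# Illustrative loop with random stand-ins for the environment and the
# human trainer (both hypothetical).
rng = np.random.default_rng(0)
for step in range(100):
    state = rng.normal(size=N_FEATURES)        # placeholder observation
    action = choose_action(state)
    feedback = rng.choice([-1.0, 0.0, 1.0])    # placeholder critique
    if feedback != 0.0:                        # trainer may stay silent
        update(state, action, feedback)
```

The key design choice is that the agent learns to predict the trainer's critique directly, rather than a delayed environmental reward, which is what allows it to improve after only minutes of human interaction.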

Currently, many artificially intelligent robots are required to interact with their environment for extended periods of time to learn how to optimally perform a task. Mistakes made during this process can prove costly, such as an agent falling off a cliff.

Feedback from humans can help avoid these potential errors as well as speed up the learning process, according to Army researcher Dr. Garrett Warnell.

“The army of the future will consist of soldiers and autonomous teammates working side-by-side,” said Warnell. “While both humans and autonomous agents can be trained in advance, the team will inevitably be asked to perform tasks, for example, search and rescue or surveillance, in new environments they have not seen before.

“In these situations, humans are remarkably good at generalising their training, but current artificially-intelligent agents are not.”

Researchers demonstrated Deep TAMER’s success in the Atari game Bowling. With 15 minutes of human-provided feedback, an agent was able to perform better than its human trainer, a feat that has proven difficult even for state-of-the-art methods in artificial intelligence.

The researchers envision Deep TAMER as the first step in a line of research that will see more successful human-autonomy teams in the army, with the ultimate goal of autonomous agents that can quickly and safely learn from their human teammates in a wide range of environments.