Can Reinforcement Learning help Robots become Intelligent?
Understanding how Robots learn by themselves to become cognitively capable
We know that robots today can accomplish a multitude of tasks, such as assembling parts, picking farm produce, scanning their surroundings, and greeting people at malls. But can they learn by themselves, as primates do? Scientists argue that, as robotics matures, it would be hugely beneficial and exciting if robots could learn on their own from interactions with their physical and social environment. While AI and machine learning are doing their part in augmenting robotics, implementation is not simple, as most robots have a limited learning capacity. Though reinforcement learning (RL) is purported to be the simplest way to train robots, much work remains to be done.
Reinforcement learning is a machine learning technique that enables an agent to learn in an interactive environment by trial and error, using feedback from its own actions and experiences. In supervised learning, the feedback given to the agent is the correct set of actions for performing a task. In contrast, RL uses rewards and punishments as signals for positive and negative behavior. In robotics, RL can be vital in enabling a robot to build an efficient adaptive control system for itself, one in which it learns from its own experience and behavior.
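To make the trial-and-error idea concrete, here is a minimal sketch of tabular Q-learning, one common RL algorithm, on a toy problem. The corridor environment, the function names, and every parameter value below are assumptions chosen for illustration; none of them come from the research discussed in this article. The agent starts at the left end of a short corridor and is rewarded only when it reaches the right end.

```python
import random


def train_corridor_agent(n_states=5, episodes=200, alpha=0.5,
                         gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts in state 0 and earns a reward of 1 only when it
    reaches the last state. Actions: 0 = step left, 1 = step right.
    """
    rng = random.Random(seed)
    # q[state][action] estimates the long-term reward of each choice.
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Trial and error: explore at random some of the time (and
            # whenever both actions look equally good), else act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # Stepping left from state 0 leaves the agent where it is.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Nudge the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the preferred action in every non-goal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_corridor_agent()))
```

Nobody tells the agent that "step right" is the correct action. It discovers this only because moves that lead toward the goal end up with higher estimated reward, which is the difference from supervised learning described above.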
Currently, roboticists are developing automated robots that can learn new tasks solely by observing humans and animals, an approach known as imitation learning. It is closely related to reinforcement learning, the challenge of getting a robot to act in the world so as to maximize its rewards, but the robot learns from demonstrations rather than from reward signals alone. Imitation learning has become essential in robotics, where the demands of mobility in settings such as construction, agriculture, search and rescue, and the military make it challenging to program robotic solutions manually.
A couple of years ago, Tomoaki Nakamura of UEC, Tokyo, and colleagues proposed an algorithm that enables robots to learn motions by observing human motion. Robots obtained multimodal information from objects and linguistic information by communicating with others.
Recently, Andrew Hundt, a Ph.D. candidate and roboticist in the Computational Interaction and Robotics Lab at Johns Hopkins University, wrote a paper that explores the potential of learning through positive reinforcement. While the approach may not be purely a machine learning algorithm, it draws inspiration from Hundt’s pet dog, who was trained using positive conditioning. He taught a robot named Spot (SPOT, for Schedule for Positive Task) how to teach itself several new tricks, including stacking blocks, in two days, a process that typically takes a month. He and his team dramatically improved the robot’s skills using this method and did it quickly enough to make training robots for real-world work a more feasible enterprise. The findings are newly published in a paper called “Good Robot!” The other team members were Johns Hopkins graduate students Benjamin Killeen, Nicholas Greene, Heeyeon Kwon, and Hongtao Wu; former graduate student Chris Paxton; and Gregory D. Hager, a professor of computer science.
To stack blocks, Spot the robot needed to learn how to focus on constructive actions. As the robot explored the blocks, it quickly learned that correct stacking behaviors earned points, while incorrect ones earned nothing. Reach out, but don’t grasp a block? No points. Successfully grasp a block? Score up. Place it on another block? One more point. Knock over a stack? Definitely no points. Spot earned the most by placing the last block on top of a four-block stack.
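The scoring scheme described above can be sketched as a small reward function. This is only an illustration: the article does not give the actual point values, so the numbers, the StepOutcome fields, and the function name are all hypothetical and are not taken from the “Good Robot!” paper. The sketch also assumes that "a four-block stack" means the stack is four blocks high once the last block is placed.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Hypothetical summary of what one robot action achieved."""
    grasped: bool = False    # a block ended up securely in the gripper
    placed: bool = False     # the block was set on top of another block
    stack_height: int = 1    # height of the tallest stack after the action
    toppled: bool = False    # the action knocked a stack over


def stacking_reward(outcome: StepOutcome) -> float:
    """Illustrative point values: only constructive progress is rewarded."""
    if outcome.toppled:
        return 0.0  # knocking over a stack earns nothing
    if outcome.placed:
        # Placing beats grasping, and taller stacks earn more, so
        # finishing a four-block stack is the best-paid action.
        return 1.0 + outcome.stack_height
    if outcome.grasped:
        return 1.0  # a successful grasp scores
    return 0.0      # reaching without grasping earns nothing


if __name__ == "__main__":
    print(stacking_reward(StepOutcome(placed=True, stack_height=4)))
```

Because unproductive and destructive actions score zero rather than being penalized, the robot is steered toward constructive behavior by positive feedback alone, in the same spirit as the positive conditioning that inspired the method.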
The training tactic not only worked; it took just days to teach the robot what used to take weeks. The team was able to reduce the practice time by first training a simulated robot, which is a lot like playing a video game, and then running tests with Spot. The team believes these findings could help train household robots to do laundry and wash dishes, tasks that could be popular on the open market and help seniors live independently. The findings could also help in designing improved self-driving cars.