Google's Innovative Table Tennis Robots: A Leap Towards Advanced AI Collaboration

Google DeepMind has set up a pair of robotic arms to play endless table tennis matches against each other. Launched in 2022, the project's objective is to have the two robots continuously learn from each other through competition.

There is no final score at which a match ends, so the robots can compete indefinitely, striving to improve with every hit. The robotic arms cannot yet keep up with skilled human players, but they already beat beginners; against intermediate players, their chances are roughly even.

The initiative aims to develop an advanced general-purpose AI model that could serve as the "brain" for humanoid robots capable of interacting with humans in real-world environments such as factories and homes. Researchers at DeepMind and other institutions hope this learning approach, if scaled, could be a transformative moment for robotics, much as the advent of ChatGPT was for AI. "We are cautiously optimistic and believe that continued research in this direction will foster the creation of more adaptable machines capable of acquiring diverse skills essential for effective and safe operation within our unstructured world," write DeepMind senior engineer Pannag Sanketi and Professor Heni Ben Amor of Arizona State University.

Interestingly, table tennis proves to be an effective way to immerse robots in an unpredictable environment. The sport has been used as a benchmark for robotics research since the 1980s because of its blend of speed, reflexes, and strategy. Players must master a variety of skills, including fine motor control and perceptual abilities. At the same time, the robots must make strategic choices about how to outmaneuver opponents and when to take calculated risks. DeepMind researchers characterize the game as "bounded yet highly dynamic."

The project began by using reinforcement learning to teach the robotic arms the fundamentals of the sport. Initially, both arms were trained to sustain cooperative rallies, with no incentive to score points. After some refinement by the engineers, the team had two autonomous robot agents capable of keeping long rallies going. Once this foundation was in place, the researchers changed the objective and instructed the robots to play to win. This phase quickly overwhelmed the still-novice robots: during rallies the arms picked up new information and developed tactics, but they sometimes forgot previously learned moves, resulting in a stream of brief exchanges that often ended with one robot landing a winning slam.
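DeepMind has not published the training code, but the two-stage setup described above (first rewarding cooperative rallies, then rewarding winning the point) can be illustrated with a minimal reward-shaping sketch. Everything below, including the RallyState fields and the specific reward values, is an illustrative assumption rather than the project's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class RallyState:
    """Hypothetical summary of one exchange (not DeepMind's actual state representation)."""
    ball_returned: bool   # did the agent return the ball onto the opponent's side?
    rally_length: int     # number of consecutive hits so far
    point_won: bool       # did the exchange end with the agent winning the point?

def cooperative_reward(state: RallyState) -> float:
    """Stage 1: reward keeping the ball in play, with no incentive to score."""
    return 1.0 if state.ball_returned else -1.0

def competitive_reward(state: RallyState) -> float:
    """Stage 2: change the objective so the agent is rewarded for winning the point."""
    if state.point_won:
        return 2.0
    return 0.5 if state.ball_returned else -1.0

# Switching stages amounts to swapping the reward function the
# reinforcement-learning loop calls; the policy and environment stay the same.
reward_fn = cooperative_reward
print(reward_fn(RallyState(ball_returned=True, rally_length=7, point_won=False)))  # 1.0
reward_fn = competitive_reward
print(reward_fn(RallyState(ball_returned=False, rally_length=3, point_won=True)))  # 2.0
```

Swapping the reward function mid-training, as sketched here, is one simple way to stage the curriculum from cooperative play to playing for the win.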

The robots showed a significant performance boost once they were set against human opponents. At first, humans of varying skill levels kept rallies going better than the robots could, and playing against that range of opponents proved to be a crucial factor in improving the robots' performance.

Over time, both robots evolved, enhancing not only their stability but also their ability to execute more complex plays by balancing defensive and offensive strategies with greater unpredictability. Overall, the robots won 45% of their 29 matches against humans, including a victory rate of 55% against intermediate players.

Researchers emphasize that the robots keep improving. Part of this progress is attributed to a novel approach to AI training: DeepMind uses Google Gemini's visual perception model to review footage of the robots' games and provide feedback on how to score more effectively. Videos showcasing "Coach Gemini" show the robotic manipulator adjusting its gameplay in real time based on AI instructions such as "hit the ball as far right as possible" (a rough sketch of such a coaching loop is given below).

DeepMind and other companies believe that competitive interactions between agents could improve general-purpose AI software in ways that resemble how humans learn to navigate their surroundings. While AI can easily outpace most humans at tasks like basic programming or chess, even the most advanced AI-equipped robots struggle to keep their balance as well as a child can. Tasks that seem simple for humans, such as tying shoelaces or typing, remain significant challenges for robots. This dilemma, known in robotics circles as Moravec's paradox, remains one of the major hurdles to building robots that are genuinely helpful in domestic settings.
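The "Coach Gemini" loop described above (a vision model reviews footage and returns a short instruction, which the controller turns into an adjusted aim for the next shots) might look roughly like the following. This is a guessed-at structure under stated assumptions: query_vision_coach and apply_coaching_tip are hypothetical placeholders, not real Gemini API calls.

```python
# Illustrative sketch of a "vision model as coach" loop
# (assumed structure, not DeepMind's code and not the actual Gemini API).

def query_vision_coach(clip_path: str) -> str:
    """Placeholder: send recent game footage to a vision-language model and
    get back a short natural-language instruction."""
    # A real system would call a multimodal model here; this stub just returns a fixed tip.
    return "hit the ball as far right as possible"

def apply_coaching_tip(tip: str, target_x: float) -> float:
    """Map a coarse language instruction onto a continuous aiming parameter.
    target_x is the lateral aim point in metres, positive = opponent's right."""
    if "right" in tip:
        return min(target_x + 0.1, 0.7)   # shift aim right, clamped near the table edge
    if "left" in tip:
        return max(target_x - 0.1, -0.7)  # shift aim left
    return target_x

target_x = 0.0
for rally in range(3):
    tip = query_vision_coach(f"rally_{rally}.mp4")  # review footage of the last rally
    target_x = apply_coaching_tip(tip, target_x)    # adjust play for the next one
    print(rally, tip, round(target_x, 2))
```

In this sketch, the language tip only nudges a single aiming parameter; the low-level control that actually strikes the ball is left untouched.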

There are some early indicators that these obstacles may be surmountable. Last year, DeepMind successfully taught a robot to tie shoelaces. This year, Boston Dynamics released a video showing the autonomous Atlas robot correcting mistakes made while loading materials in a simulated manufacturing environment.

Earlier, DeepMind began developing a system that would provide AI agents with an "inner voice" to assist them in learning tasks more effectively, ultimately making them "smarter."