In 2023, residents of 22 homes in New York City agreed to take part in an unusual experiment. For several hours, they performed a variety of common household tasks with an iPhone-outfitted reacher grabber, a tool commonly used to pick up litter and, by people who use wheelchairs, to grasp out-of-reach objects (Fig. 1) [1], [2]. As each volunteer conducted simple household activities such as opening and closing doors, picking up bags of trash, turning on toasters, flipping on light switches, and straightening couch cushions, the phones captured video of the grabber’s movements and measured its position and orientation [1].
The point of the exercise, part of a study performed by Lerrel Pinto, assistant professor of computer science at New York University (NYU) in New York City, NY, USA, and colleagues, was to gather data that would allow a robot to learn the same tasks. The researchers fed the data into a neural network artificial intelligence (AI) system they developed, which, from just a handful of examples, deduced the movements necessary for each of the 109 tasks studied and generated directions so that a robot could replicate them. To test how well their system learned, the scientists took a robot named Stretch, built by the company Hello Robot of Martinez, CA, USA, to ten additional homes. Stretch trundles around on wheels and features an extendable arm with finger-like grippers (Fig. 2). Across the ten homes, its average success rate was 81% for the 109 tasks, with the success rate for individual tasks inversely correlated with their complexity [1].
Pinto and his team are just one of the many groups attempting to harness AI to improve the performance of robots. The field is booming. “There is a lot of fast progress in robot learning. Every few months something new and exciting is happening,” Pinto said. AI has enabled robots to learn how to cook [3], play soccer [4], have conversations [5], and carry out other everyday activities. Along with academic researchers, a slew of startups and established tech companies, such as Google (Mountain View, CA, USA) and electric car maker Tesla (Austin, TX, USA), are working to develop smarter, more adaptable robots that boast useful talents [6], [7]. If they are successful, a new generation of learning robots could appear not just in factories and other workplaces, but also in homes, where they could fold laundry, straighten up messy rooms, and take on other chores [8]. However, none of these robots has gone beyond the demonstration stage, and it could be many years before they become the ubiquitous devices boosters envision.
Researchers are turning to AI because today’s robots are, well, too robotic. Whether they are delivering meals in a restaurant, stacking boxes in a warehouse, or even conducting a landscape survey [9], they largely do only what they are told, following programmed routines that delineate their actions in particular circumstances [10]. As a result, robots are adept at specialized jobs in constrained environments, but they struggle with novelty and uncertainty. Faced with unfamiliar situations, “they freak out,” said Shikhar Bahl, who recently completed a doctorate in robotics at Carnegie Mellon University (CMU) in Pittsburgh, PA, USA. The hope is that endowing robots with the capacity for learning will allow them to “operate in unstructured settings and deal with things they were not explicitly programmed to do,” said Bahl, now an AI researcher at CMU.
The learning abilities of some forms of AI have improved rapidly. Large language models (LLMs), which power chatbots like ChatGPT from OpenAI (San Francisco, CA, USA) and Copilot from Microsoft (Redmond, WA, USA), have become much faster and more powerful in just the last several years [11], [12]. Visual AI models have also gotten much more proficient at detecting and identifying objects in photos and videos [13].
Robotics has not kept up, and one reason is the scarcity of data, said Pinto. “AI models are data hungry.” That is no limitation for LLMs, which train on enormous amounts of information from sources such as Wikipedia, news articles, books, and even transcripts of YouTube videos, or for visual models, which can learn from massive image and video libraries on the internet [14]. But comparably large sources of data to train AI models for robotics are not readily available. Efforts are underway to address the problem. Google’s AI subsidiary DeepMind (London, UK) and 33 groups of academic researchers, for instance, have launched Open X-Embodiment, a freely available repository of robotic data [15]. The data delineate the movements of 22 kinds of robots, including Stretch, a single manipulating arm, and a dog-like machine, as they perform various actions.
Still, the collection includes only a tiny fraction of the data available to train LLMs, and researchers often must collect their own, for example by asking people to wander around their homes with iPhone-equipped reacher grabbers. The NYU team’s approach has some advantages. The iPhones on the reacher grabbers provide six-dimensional data on the tool, including its movement and rotation [1], a rich source of information for a robot to learn from. And the data are relatively cheap to collect, with the grabbers costing only 25 USD each. But gathering data in this way is somewhat cumbersome.
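To make "six-dimensional data" concrete, here is a minimal sketch of how one such sample might be represented: three numbers for position and three for orientation. The `Pose6D` class, its field names, the units, and the use of roll, pitch, and yaw angles are all illustrative assumptions, not the format the NYU team actually recorded. Subtracting angles componentwise, as done here, is a simplification that is only reasonable for small changes between consecutive samples.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose6D:
    """One hypothetical six-dimensional sample of the tool's pose."""
    x: float      # position in meters (assumed unit)
    y: float
    z: float
    roll: float   # orientation in radians (assumed convention)
    pitch: float
    yaw: float


def wrap_angle(angle: float) -> float:
    """Map an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def pose_delta(start: Pose6D, end: Pose6D) -> Pose6D:
    """Approximate change in pose between two consecutive samples."""
    return Pose6D(
        end.x - start.x,
        end.y - start.y,
        end.z - start.z,
        wrap_angle(end.roll - start.roll),
        wrap_angle(end.pitch - start.pitch),
        wrap_angle(end.yaw - start.yaw),
    )


# Two made-up samples: the tool moves 10 cm forward and its yaw
# crosses the +/-pi boundary, so the angle difference must be wrapped.
before = Pose6D(0.0, 0.0, 1.0, 0.0, 0.0, 3.0)
after = Pose6D(0.1, 0.0, 1.0, 0.0, 0.0, -3.0)
step = pose_delta(before, after)
```

A sequence of such steps, paired with the matching video frames, is the kind of record a learning system could use to relate what the camera sees to how the tool moved.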
Another group of researchers turned to a set of around 100 cooking demonstration videos to train their robotic AI. In 2023, the team of Bahl, Russell Mendonca, a CMU doctoral student in robotics, and their adviser, Deepak Pathak, an assistant professor of computer science, debuted a robotic AI system that learns kitchen skills from online videos of people doing common activities like cutting, pouring, kneading, and grating [16]. Schooling robots with human videos is not straightforward, however, because a person and a robot will not use exactly the same movements to complete a specific task. The group’s AI system learns how to interact with objects by observing human hand motions in the videos and then predicting the consequences of those motions, translating its analysis into directions for a robot to achieve a particular goal. For instance, the AI model deduces from the videos that picking up a knife requires grasping a certain part of the handle, and it can spell out the right movements to allow a robot to grab the handle instead of the blade. Using a test kitchen that contained plastic implements and food, the team showed that training with the human cooking videos greatly improved how well robots performed a set of tasks. The trained robots were able to open a drawer about 90% of the time, for example, versus about 10% of the time for untrained robots [16].
Researchers are also developing robots that can independently acquire skills through reinforcement learning, much like how people train their dogs [17]. A dog might receive a treat if it correctly performs a trick. A robot might likewise receive a reward if it does a job well, such as a positive score from a human reviewer or an automated evaluator. The value of reinforcement learning is that it can help robots master skills without being taught [18].
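The reward idea can be illustrated with a toy example. The sketch below is not any group's actual method; it is a simple epsilon-greedy "bandit" learner, one of the most basic forms of reinforcement learning, in which an agent repeatedly picks one of several actions, receives a reward of 1 when the attempt succeeds, and gradually favors the action that pays off most often. The function name, the three success probabilities, and the parameter values are all made up for illustration.

```python
import random


def train_bandit(success_probs, episodes=5000, epsilon=0.1, seed=0):
    """Learn by trial and error which action earns the most reward."""
    rng = random.Random(seed)
    n_actions = len(success_probs)
    values = [0.0] * n_actions  # running estimate of each action's average reward
    counts = [0] * n_actions    # how many times each action has been tried

    for _ in range(episodes):
        # Occasionally explore a random action; otherwise exploit the
        # action with the highest estimated reward so far.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])

        # The "treat": reward 1 if the attempt succeeds, 0 otherwise.
        reward = 1.0 if rng.random() < success_probs[action] else 0.0

        # Update the running average reward for the chosen action.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


# Hypothetical success probabilities for three made-up grasp strategies.
values, counts = train_bandit([0.2, 0.5, 0.9])
best_action = max(range(len(values)), key=lambda a: values[a])
```

No one tells the learner which strategy is best; it discovers that from rewards alone. Real robot learning involves far richer states, actions, and reward signals, but the feedback loop has the same shape.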
The killer app for AI-endowed robots could be helping around the house [8], [19]. Many researchers in the field perceive a demand for household robots that can take over burdensome chores. Pinto, for one, is betting that this market exists and has launched a company, Fauna Robotics, based in New York City, NY, USA, to develop such robots. People have already bought tens of millions of Roombas, the circular robotic vacuums from iRobot (Bedford, MA, USA) that navigate around the home, albeit with increasing help from AI in the more recent models [20]. But another reason to develop household robots is that the home environment is a good test of a robot’s abilities, Mendonca said. “If you can get a robot to navigate across the cluttered floor of a child’s room and pick up something, it could readily do things in a more structured setting.”
So far, robots that can learn have shown off an impressive variety of skills, but the demonstrations also illustrate their drawbacks. In the videos not posted to the Internet, they drop objects they are carrying, misidentify everyday items, lose their way, and commit other blunders. They are clumsy, slow, and expensive. Their fine motor skills are limited, making it difficult for them to use tools, Bahl said. “We are getting better and better, but we still have a long way to go.” Still, he added, “I am amazed by the progress we have made compared to three-to-four years ago.”
Will that pace of improvement allow smart robots to reach the market soon? Some people think so. Elon Musk, chief executive officer of Tesla, has said the company’s human-like Optimus robot could be available to consumers by late 2025 (Fig. 3) [7]. Exactly what Optimus will be able to do is unclear, however, and critics have panned demonstrations of its performance [21], [22].
Whether Tesla will meet its goal remains to be seen. But researchers who are working on making smarter, more functional robots are impatient to get them out of the lab and into homes and businesses. “I want to see them being a part of our lives now,” said Pinto.