If we’re ever going to effectively communicate with robots, they’ve got to get better at lip-syncing.
Intricate mouth movements are vital for human connection, especially in loud environments, where we gaze at a speaker's mouth up to half the time.
That makes them a key feature for robots we can comfortably chat with, but researchers have long struggled to build robots whose lips can skillfully synchronize with audio. Bots have mechanical constraints that limit the range of motion and speed of lip movements, for example, and their movements tend to lag behind commands.
To overcome this hurdle, researchers from Columbia University in New York harnessed artificial intelligence models inspired by the human brain, known as neural networks, enabling a humanoid robot to make smooth mouth motions that stay in sync with spoken words.
“The capability to form complex lip shapes … enhances overall more detailed speech synchronization, providing more lifelike interactions that mitigate some of the risks of the uncanny valley effect,” according to a new Science Robotics paper.
The team designed a human-like robot face with “skin” made of soft silicone. It has magnetic connectors that allow for 10 degrees of freedom, making all sorts of lip movements possible.
To train the models powering this bot, the team fed them recordings of their robot making various lip movements, like those associated with rounded vowels. Then, they incorporated AI-generated videos of “ideal” lip movements for certain sentences into their models.
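The article doesn't reproduce the paper's code, but the recipe it describes, learning from the robot's own recordings and then steering toward "ideal" video targets, can be sketched in broad strokes. Below is a minimal, hypothetical Python illustration (not the authors' implementation): a small neural network learns to predict lip-landmark positions from motor commands, and is then inverted to find commands that match a target frame from an AI-generated video. The network size, landmark count, and function names are all assumptions.

```python
# A minimal sketch (not the authors' code) of the general idea described
# above: learn a "forward model" that predicts lip-landmark positions from
# motor commands using the robot's own recordings, then invert it by
# gradient descent to find commands that reproduce target landmarks taken
# from the AI-generated "ideal" videos. Shapes and names are illustrative.
import torch
import torch.nn as nn

N_MOTORS = 10         # the face's 10 degrees of freedom
N_LANDMARKS = 2 * 20  # hypothetical: 20 lip landmarks, (x, y) each

# Forward model: motor commands -> predicted lip-landmark coordinates.
forward_model = nn.Sequential(
    nn.Linear(N_MOTORS, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, N_LANDMARKS),
)

def train_forward_model(commands, landmarks, epochs=100):
    """Fit the model on (command, observed-landmark) pairs recorded
    while the robot works through assorted lip movements."""
    opt = torch.optim.Adam(forward_model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(forward_model(commands), landmarks)
        loss.backward()
        opt.step()

def commands_for(target_landmarks, steps=200):
    """Invert the trained forward model: search for motor commands whose
    predicted landmarks match a target frame from the 'ideal' video."""
    cmd = torch.zeros(N_MOTORS, requires_grad=True)
    opt = torch.optim.Adam([cmd], lr=0.05)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((forward_model(cmd) - target_landmarks) ** 2).mean()
        loss.backward()
        opt.step()
    return cmd.detach().clamp(-1.0, 1.0)  # stay within actuator limits
```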
The system allows a robot’s lips to form the shapes associated with 24 consonants and 16 vowels, the researchers reported in the paper.
Using these “ideal” AI videos as a baseline, they compared their new system with existing techniques for shaping robot lip movements. Of all the methods, theirs showed the smallest mismatch with the mouth movements in the AI videos. The bot could also convincingly mouth speech in 10 different languages with varying phonetic structures, including Korean, French, and Arabic, and it even did a bit of karaoke.
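How might that "mismatch" be scored? The article doesn't give the paper's exact metric, but a common proxy for lip-sync error is the mean distance between corresponding lip landmarks in the robot's video and the reference video. A minimal sketch, assuming both videos have already been reduced to landmark trajectories:

```python
# Hypothetical mismatch score, assuming the robot video and the
# AI-generated reference have each been reduced to lip-landmark
# trajectories of shape (frames, landmarks, 2). This is a plausible
# proxy, not the paper's published metric.
import numpy as np

def lip_mismatch(robot_traj: np.ndarray, ideal_traj: np.ndarray) -> float:
    """Mean Euclidean distance between corresponding lip landmarks,
    averaged over every frame. Lower scores mean the robot's mouth
    tracked the "ideal" video more closely."""
    assert robot_traj.shape == ideal_traj.shape
    per_landmark = np.linalg.norm(robot_traj - ideal_traj, axis=-1)
    return float(per_landmark.mean())
```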
There’s still plenty of room for improvement, the researchers acknowledged, including incorporating more training data and adding more physical degrees of freedom. In the future, they think that their tool could be used in education and in caring for older adults experiencing cognitive decline, as it could help us connect with robots “on a human level.”
But they also caution that heightened emotional connection with robots could “be exploited to gain trust from unsuspecting users, especially children and the elderly,” and that designers should implement safeguards against these risks.
“The ability to create physical machines that are capable of connecting with humans at an emotional level is maturing rapidly,” the authors wrote. “The robots presented here are still far from natural, yet one step closer to crossing the uncanny valley.”
Lead image: Yuhang Hu
