WHAT if you could make President Trump say whatever you wanted? How about listening to the vaguely robot-like voice of yourself, programmed into an app based on a sample of your speech?

The technology will be ready “soon”, according to a team of researchers from the University of Montreal’s institute for computer-based learning algorithms. Now they’re seeking investors for their product, Lyrebird, and hope to join Google in the fast-expanding business of mimicking human voices.

Virtual assistants such as Alexa and Siri have driven voice technology into the mainstream, letting us control our phones, cars and even refrigerators through verbal commands. And now we face a future where the perfect vocal replication of the president of the United States, or you, or anyone, could be just a few years away, some experts say. How does that future sound?

Whoever wins the development race, experts in technology and ethical fields are gearing up for products that will do to voice what Photoshop did to photos — make reality very difficult to tell from a simulation.

Lyrebird is aware of the downsides. The technology is exciting — with potentially “dangerous consequences such as misleading diplomats, fraud and . . . stealing the identity of someone else”, according to an ethical disclaimer on Lyrebird’s website. The developers did not immediately respond to an interview request.

Nevertheless, the inventors plan to begin selling what they call the first technology “to allow copying voices in a matter of minutes” — with fine tuning for emotional control.

Scientific American notes that Lyrebird and a competing Alphabet-owned project called WaveNet use neural network technology — code patterned after neurons in the human brain — to simulate human speech on the fly. In contrast, existing voice assistants such as Siri and Alexa “work by cobbling together words and phrases from prerecorded files of one particular voice”.

Lyrebird says its technology, once released, will be able to mimic any voice based on as little as a minute of audio recording — though one of the developers told TechCrunch that longer samples would reduce the “distinctly metallic rasp” that the outlet noted in clips released so far.

While Lyrebird’s developers have not announced a release date for their product, they claim it will simulate audio much faster than Google’s WaveNet. When the tech giant’s artificial intelligence unit demonstrated WaveNet last year, listeners rated it as the closest simulation yet of human speech, according to the Verge.

Timo Baumann, a speech processing researcher at Carnegie Mellon University, told Scientific American that Lyrebird’s audio sounded a tad robotic but that convincing human simulations — voice assistants that people might treat like friends — were a few years away.

Five major tech giants (Apple, Google, Microsoft, Facebook and Amazon.com) are pursuing what The Washington Post’s Elizabeth Dwoskin called “an arms race” to create the next generation of virtual assistants, which would make our personal devices converse like humans, if not also sound like them.

“It’s about taking the way that humans have naturally interacted with each other for thousands of years and applying that to the way they interact with services,” Dag Kittlaus, a co-founder of the Siri app now in every iPhone, told Dwoskin.

The prospect of computer-simulated voice concerned a security technologist from Harvard University, who told Scientific American that a “new reality” of fake audio clips was on the horizon.

“A refined version of this system could replicate a person’s voice with incredible accuracy, making it virtually impossible for a human listener to discern the original from the emulation,” Gizmodo warned. “The day is coming when vocal speech, like an image processed in Photoshop, can be manipulated without our knowing.”

When Adobe demonstrated yet another form of voice-faking software last year — one that rearranges words in pre-recorded audio clips — a technology researcher at the University of Stirling expressed horror to the BBC. “It seems that Adobe’s programmers were swept along with the excitement of creating something as innovative as a voice manipulator,” Eddy Borges Rey told the outlet, “and ignored the ethical dilemmas brought up by its potential misuse”.

The creators of Lyrebird said they want their technology to be used for good: “Giving back the voice to people who lost it to sickness, being able to record yourself at different stages in your life and hearing your voice later on,” one of Lyrebird’s developers told Gizmodo.

—By arrangement with The Washington Post

Published in Dawn, May 12th, 2017
