NVIDIA’s new text-to-speech model to make AI voices more realistic
Category: #technology  By Pranali Mehta  Date: 2021-09-02

NVIDIA Corporation, the American multinational technology company, has introduced a new tool that captures natural speech qualities by letting users train the AI system on their own voice.

According to sources, the company’s text-to-speech research team has built a new model called ‘RAD-TTS’ that lets an individual train a text-to-speech system on their own voice, including details such as pacing, timbre, and tonality.

A key feature of RAD-TTS is voice conversion, which lets a user deliver one speaker’s words in another person’s voice. The interface also gives greater control over the energy, duration, and pitch of the synthesized voice.

Using this technology, researchers at NVIDIA have developed more conversational-sounding narration for the company’s ‘I AM AI’ video series, which uses synthesized rather than human voices.

According to a statement by NVIDIA, the interface lets the firm’s video producers record themselves reading the video script and then use the AI system to convert a male speaker’s speech into a female narrator’s voice.

Starting from this baseline narration, the producer can direct the AI much as they would a voice actor, adjusting the synthesized voice to emphasize certain words and changing the pace of the narration to better express the video’s tone.

According to NVIDIA, most of the models are trained on tens of thousands of hours of audio data using its DGX systems. Additionally, developers can adapt any model to their needs and accelerate training using mixed-precision computing on the company’s Tensor Core GPUs.

According to sources, NVIDIA is making some of this research, optimized to run efficiently on its GPUs, available as open source to anyone who wants to try it. It is offered through the NeMo Python toolkit for GPU-accelerated conversational AI, which is available on the tech giant’s NGC hub of software and containers.

Source Credit: https://techcrunch.com/2021/08/31/nvidias-latest-tech-makes-ai-voices-more-expressive-and-realistic/

 

 


