NVIDIA’s new text-to-speech model to make AI voices more realistic
Category: #technology  By Pranali Mehta  Date: 2021-09-02

NVIDIA Corporation, the American multinational technology company, has introduced a new tool that captures the qualities of natural speech by letting users train the AI model on their own voice.

According to sources, the company’s text-to-speech research team has built a new model called ‘RAD-TTS’ that lets an individual train a text-to-speech system on their own voice, capturing characteristics such as pacing, timbre, and tonality.

A key feature of RAD-TTS is voice conversion, which lets the user deliver one speaker’s words in another person’s voice. The interface also gives finer control over the energy, duration, and pitch of the synthesized voice.

Using this technology, researchers at NVIDIA have developed more conversational-sounding narration for the company’s ‘I AM AI’ video series, which uses synthesized rather than human voices.

According to a statement by NVIDIA, the interface lets the firm’s video producers record themselves reading the video script and then use the AI system to convert a male speaker’s recording into a female narrator’s voice.

Working from this baseline narration, the producer can direct the AI much like a voice actor, slightly adjusting the synthesized voice to emphasize certain words and changing the pace of the narration to better express the video’s tone.
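To make the idea of "directing" a synthesized voice more concrete, the following is a minimal Python sketch of per-word control over duration, pitch, and energy. It is an illustration only and not RAD-TTS or NeMo code: the `WordProsody` class, the `emphasize` and `change_pace` helpers, and all numeric values are hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class WordProsody:
    """Hypothetical per-word controls (not part of RAD-TTS or NeMo)."""
    word: str
    duration_s: float  # how long the word is spoken, in seconds
    pitch_hz: float    # average fundamental frequency of the word
    energy: float      # relative loudness, where 1.0 is the baseline


def emphasize(track: List[WordProsody], target: str,
              pitch_scale: float = 1.15, energy_scale: float = 1.3,
              stretch: float = 1.2) -> List[WordProsody]:
    """Return a new track in which every occurrence of `target`
    is spoken higher, louder, and slightly longer."""
    return [
        replace(w,
                duration_s=w.duration_s * stretch,
                pitch_hz=w.pitch_hz * pitch_scale,
                energy=w.energy * energy_scale)
        if w.word.lower() == target.lower() else w
        for w in track
    ]


def change_pace(track: List[WordProsody], rate: float) -> List[WordProsody]:
    """Return a new track with every word's duration divided by `rate`
    (rate > 1 speeds the narration up, rate < 1 slows it down)."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return [replace(w, duration_s=w.duration_s / rate) for w in track]


def total_duration(track: List[WordProsody]) -> float:
    """Total spoken time of the track, in seconds."""
    return sum(w.duration_s for w in track)


# Made-up baseline values standing in for a recorded narration.
baseline = [
    WordProsody("I", 0.2, 120.0, 1.0),
    WordProsody("am", 0.2, 118.0, 1.0),
    WordProsody("AI", 0.4, 125.0, 1.0),
]

# Emphasize one word, then speed up the whole narration by 25%.
directed = change_pace(emphasize(baseline, "AI"), rate=1.25)
```

In a real system these adjusted values would condition the speech synthesizer; here they only show how emphasis and pacing can be treated as separate, composable edits on top of a baseline reading.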

According to NVIDIA, most of the models are trained on tens of thousands of hours of audio data on its DGX systems. Developers can also fine-tune any model to suit their needs and accelerate training using mixed-precision computing on the company’s Tensor Core GPUs.

According to sources, NVIDIA is making some of this research, optimized to run efficiently on its GPUs, available as open source to anyone who wishes to try it. It is offered through the NeMo Python toolkit for GPU-accelerated conversational AI, which is available on the tech giant’s NGC hub of software and containers.

Source Credit: https://techcrunch.com/2021/08/31/nvidias-latest-tech-makes-ai-voices-more-expressive-and-realistic/

 

 


