DeepMind, a company that is part of the Alphabet holding group, owner of Google, revealed this Tuesday (18) a new AI tool capable of generating soundtracks, sound effects and even dialogue for videos based on the video content.

Adding sound to videos

The project is called V2A (video-to-audio). Just like the long-awaited Sora, OpenAI’s technology capable of creating realistic videos, Google’s new tool is not available to the public for now. The project is restricted to private testing.

Google’s idea is to have a solution that can be integrated with video generation models, which usually produce silent output. DeepMind itself has its own generative AI solution for video creation, Veo.

“Our V2A technology is combined with video generation models like Veo to create takes with a dramatic soundtrack, realistic sound effects or dialogue that matches the characters and tone of a video,” explains Google on the project’s official page.

Google released an example of its new audio AI solution in action. The video was generated with the following prompt: “a spaceship crosses the vastness of space, stars passing by, high speed, science fiction”.

The model can also generate soundtracks for a variety of traditional footage, including archival material, silent films and more, opening up a wider range of creative opportunities, the company explains.

You can also remove sounds from a video

In addition to adding tracks, Google’s generative model also does the opposite, removing unwanted sounds from videos.

The company explains that the process begins with the video being encoded into a compressed representation. The diffusion model then iteratively refines the audio, starting from random noise, in a process guided by the video. The audio is then transformed into a waveform and combined with the video data, which keeps the two synchronized.
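To make these steps more concrete, here is a minimal toy sketch in Python. It is not Google’s implementation: every function name is hypothetical, and the “denoiser” is a simple hand-written rule standing in for the trained diffusion model, which Google has not published. It only illustrates the overall flow of encoding the video, refining audio from random noise under the video’s guidance, decoding to a waveform and pairing it with the frames.

```python
import random


def encode_video(frames, block=4):
    """Compress each frame (a list of pixel intensities) into block averages."""
    return [
        [sum(f[i:i + block]) / len(f[i:i + block]) for i in range(0, len(f), block)]
        for f in frames
    ]


def denoise_step(audio, video_code, strength=0.3):
    """Hypothetical stand-in for a learned denoiser.

    Nudges each audio value toward a target derived from the matching
    encoded frame. A real model would predict this update with a network.
    """
    targets = [sum(code) / len(code) for code in video_code]
    return [a + strength * (t - a) for a, t in zip(audio, targets)]


def generate_audio(frames, steps=50, seed=0):
    """Start from random noise and refine it step by step, guided by the video."""
    rng = random.Random(seed)
    video_code = encode_video(frames)
    audio = [rng.gauss(0.0, 1.0) for _ in video_code]  # pure noise, one value per frame
    for _ in range(steps):
        audio = denoise_step(audio, video_code)
    return audio


def to_waveform(latents, samples_per_frame=4):
    """Toy decoder: repeat each refined value to form the audio samples of a frame."""
    return [v for v in latents for _ in range(samples_per_frame)]


def combine(frames, waveform, samples_per_frame=4):
    """Pair each frame with its own slice of samples so audio and video stay in sync."""
    return [
        (frame, waveform[i * samples_per_frame:(i + 1) * samples_per_frame])
        for i, frame in enumerate(frames)
    ]


# Three tiny "frames" of eight pixels each: dark, bright, mid-gray.
frames = [[0.0] * 8, [1.0] * 8, [0.5] * 8]
audio = generate_audio(frames)
clip = combine(frames, to_waveform(audio))
```

In this toy version the refined audio simply tracks the average brightness of each frame. The point is only that the output depends on the video rather than on the initial noise, which is what allows the generated sound to follow what happens on screen.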

The project still has a long way to go before public launch

The project is still in the experimental phase, and much remains to be done and refined, including better synchronization of the AI-generated dialogue track with the characters’ lip movements.

Google has not yet indicated when this tool might launch.

Source: https://www.hardware.com.br/noticias/essa-e-a-nova-ia-do-google-que-gera-sons-para-videos.html


