The day is fast approaching when generative AI will not only write and create images in a convincingly human-like style, but also compose music and sounds that pass for a professional’s work.
This morning, Meta announced AudioCraft, a framework to generate what it describes as “high-quality,” “realistic” audio and music from short text descriptions, or prompts. It’s not Meta’s first foray into audio generation (the tech giant open-sourced an AI-powered music generator, MusicGen, in June), but Meta claims that it has made advances that vastly improve the quality of AI-generated sounds, such as dogs barking, cars honking and footsteps on a wooden floor.
In a blog post shared with TechCrunch, Meta explains that the AudioCraft framework was designed to make generative models for audio simpler to use than prior work in the field (e.g., Riffusion, Dance Diffusion and OpenAI’s Jukebox). AudioCraft, whose code is open source, provides a collection