AI-generated music is already an innovative enough concept, but Riffusion takes it to another level with a clever, weird approach that produces weird and compelling music using not audio but images of audio.
Sounds strange, is strange. But if it works, it works. And it does work! Kind of.
Diffusion is a machine learning technique for generating images that supercharged the AI world over the last year. DALL-E 2 and Stable Diffusion are the two most high-profile models that work by gradually replacing visual noise with what the AI thinks a prompt ought to look like.
The method has proved powerful in many contexts and is very susceptible to fine-tuning, where you give the mostly trained model a lot of a specific kind of content in order to have it specialize in producing more examples of that content. For instance, you could fine-tune it on watercolors or on photos of cars, and it