Stability AI presents Stable Audio: now music is generated with AI

September 27, 2023

Stability AI has announced the launch of Stable Audio, the company’s first AI-based product for generating music and sound. Stable Audio uses generative AI techniques to create music quickly from a text prompt. Stability AI offers both a free basic version of Stable Audio, which can be used to generate and download tracks of up to 20 seconds, and a ‘Pro’ subscription, which offers downloadable 90-second tracks for commercial projects.

Emad Mostaque, CEO of Stability AI, said: “As the only independent, open, multimodal company in the field of generative artificial intelligence, we are excited to use our expertise to develop a product to support music creators. Our hope is that Stable Audio will enable music fans and creative professionals to generate new content with the help of AI, and we look forward to the endless innovations it will inspire.”

Stable Audio is aimed in particular at musicians who want to create samples for use in their own music, but it can serve other creators as well. Tracks are generated from a text description provided by the user, together with the desired track length. For example, you can enter “Post-Rock, Guitars, Drums, Bass, Strings, Euphoric, Upbeat, Melancholy, Fluid, Raw, Epic, Sentimental, 125 BPM” with a request for a 95-second track, and the system will generate audio matching that description and length. The underlying model was trained using music and metadata from AudioSparx, a leading music library. According to Stability AI, Stable Audio is the first music generation product that enables the creation of high-quality 44.1 kHz music for commercial use. Those who would like to try it out can do so on the official page.

Improved audio quality

One of the most impressive aspects of Stable Audio is the audio quality of the generated samples, which is a significant improvement over previous AI-based audio generators. The promise of being able to type in a description such as “introductory music for a horror movie” or “sound of car wheels on asphalt” and get a high-quality result is enticing. It could change the way audio content is created, not only for music projects but also for short films and other visual productions that need sound. Stable Audio is not the first music generator based on latent diffusion techniques, but it is certainly one of the most advanced. Its 44.1 kHz stereo output is a significant step forward from previous models and promises to take AI-generated audio to new levels of fidelity.

The big question is whether music artists and industry professionals will accept or reject this new technology. The history of protests in the visual arts and in dubbing suggests that artificial intelligence is unlikely to completely replace humans in the creative process. Still, it could become a powerful tool for audio production professionals, complementing human creativity and enabling new forms of audio expression and production. In any case, Stable Audio marks an important step forward for AI-generated audio.