Audiobox is Meta’s new foundation research model for audio generation. It can generate voices and sound effects using a combination of voice inputs and natural language text prompts — making it easy to create custom audio for a wide range of use cases. The Audiobox family of models also includes specialist models Audiobox Speech and Audiobox Sound, and all Audiobox models are built upon the shared self-supervised model Audiobox SSL.
Explore a series of interactive audio demos that help you understand the unique capabilities of Audiobox. You can experiment with each capability individually.