Mistral Unveils Voxtral Open-Source AI Voice Model
French AI startup Mistral has launched Voxtral, an open-source audio model designed for high-quality speech recognition, semantic understanding, native multilingual support, and context processing.
Voxtral is available in two variants: Small, a 24B model for production environments, and Mini, a 3B model for local and edge deployments. Both versions, which are available under an Apache 2.0 license, are powered by Mistral Small 3.1 to handle tasks like summarizing audio or answering questions from voice commands.
The Voxtral models can handle up to 30 minutes of audio for transcription or up to 40 minutes for understanding tasks. They also feature built-in question-and-answer and summarization functionality, and they can trigger back-end functions, workflows, or API calls based on a user's spoken intent.
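To illustrate what triggering a back-end function from spoken intent might look like on the application side, here is a minimal Python sketch. It assumes the model has already turned a voice command into a JSON tool call with a "name" and "arguments" field; that payload shape, the handler names (set_reminder, get_order_status), and the dispatch helper are hypothetical and are not taken from Mistral's API.

```python
import json
from typing import Callable, Dict

# Hypothetical back-end handlers; names and behavior are illustrative only.
def set_reminder(text: str, minutes: int) -> str:
    return f"Reminder set in {minutes} min: {text}"

def get_order_status(order_id: str) -> str:
    return f"Order {order_id}: status lookup requested"

# Registry mapping function names to the handlers the application exposes.
HANDLERS: Dict[str, Callable[..., str]] = {
    "set_reminder": set_reminder,
    "get_order_status": get_order_status,
}

def dispatch(tool_call_json: str) -> str:
    """Route a tool call (assumed JSON shape) to the matching handler."""
    call = json.loads(tool_call_json)
    name = call.get("name")
    handler = HANDLERS.get(name)
    if handler is None:
        # Refuse anything that is not explicitly registered.
        return f"Unknown function: {name}"
    return handler(**call.get("arguments", {}))

if __name__ == "__main__":
    # Assumed output of the model for "remind me to call Anna in 15 minutes".
    example = '{"name": "set_reminder", "arguments": {"text": "call Anna", "minutes": 15}}'
    print(dispatch(example))
```

Keeping an explicit registry means a spoken request can only reach functions the application has chosen to expose, which is a common safeguard in this kind of setup.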
Voxtral automatically recognizes languages and supports English, Spanish, French, Portuguese, Hindi, German, Dutch, and Italian.
Mistral claims that in independent testing, Voxtral Small outperformed comparable models, including Whisper large-v3, OpenAI's GPT-4o mini Transcribe, and Gemini 2.5 Flash.