Quick Start
Get up and running with Voxtral in minutes.
Installation
Install Voxtral with pip or npm, or pull the Docker image:
Python (pip)
pip install voxtral
Node.js (npm)
npm install @voxtral/ai
Docker
docker pull voxtral/voxtral-mini:latest
Basic Usage
Here's how to use Voxtral for common tasks:
Transcription
import voxtral
# Transcribe audio file
result = voxtral.transcribe("audio.mp3")
print(result.text)
# Transcribe with options
result = voxtral.transcribe(
    "audio.mp3",
    language="en",
    task="transcribe",
    timestamp_granularities=["word"]
)
Audio Understanding
# Ask questions about audio content
result = voxtral.understand(
    "audio.mp3",
    question="What is the main topic discussed?"
)
# Generate summary
summary = voxtral.summarize("audio.mp3")
Multilingual Support
# Automatic language detection
result = voxtral.transcribe("spanish_audio.mp3")
# Force specific language
result = voxtral.transcribe(
    "audio.mp3",
    language="es"
)
API Reference
Authentication
All API requests require authentication using your API key:
Authorization: Bearer YOUR_API_KEY
Get your API key from the Voxtral Dashboard.
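As a minimal sketch of attaching this header from Python, the standard library's `urllib.request` is enough. The host `https://api.example.com` is a placeholder because this page does not state the API base URL, and the `VOXTRAL_API_KEY` environment variable and the `build_request` helper are illustrative names, not part of Voxtral.

```python
import os
import urllib.request

# Placeholder host: the documentation does not state the API base URL.
BASE_URL = "https://api.example.com"


def build_request(path, api_key, data=None):
    """Return a POST request that carries the Bearer token header."""
    if not api_key:
        raise ValueError("an API key is required")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={"Authorization": f"Bearer {api_key}"},
        method="POST",
    )


# VOXTRAL_API_KEY is an assumed variable name, not one defined by Voxtral.
api_key = os.environ.get("VOXTRAL_API_KEY") or "YOUR_API_KEY"
request = build_request("/v1/transcribe", api_key)
print(request.get_header("Authorization"))
```

Reading the key from the environment keeps it out of source control. The request is only constructed here, not sent.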
Transcription API
Endpoint: POST /v1/transcribe
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | File | Yes | Audio file (MP3, WAV, M4A, etc.) |
| language | String | No | Language code (auto-detected if not provided) |
| task | String | No | "transcribe" or "translate" |
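The parameters above could be packaged into a request body along these lines. This sketch assumes the endpoint accepts `multipart/form-data`, which the `File` type suggests but the page does not confirm. The `encode_multipart` helper is hypothetical and uses only the standard library.

```python
import uuid


def encode_multipart(fields, filename, file_bytes):
    """Encode text fields plus one "file" part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The file part uses the field name "file" from the parameter table.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f'\r\n--{boundary}--\r\n'.encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart(
    {"language": "en", "task": "transcribe"},
    "audio.mp3",
    b"\x00\x01",  # stand-in bytes; read the real audio file in practice
)
print(content_type.split(";")[0], len(body))
```

The returned `content_type` value would go in the request's `Content-Type` header alongside the `Authorization` header.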
Response
{
  "text": "Transcribed text content...",
  "language": "en",
  "duration": 120.5,
  "segments": [
    {
      "start": 0.0,
      "end": 5.2,
      "text": "Segment text..."
    }
  ]
}
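To illustrate how a response of this shape might be consumed, the sketch below parses the sample payload and prints one timestamped line per segment. The `format_segments` helper is illustrative only. It relies on nothing beyond the `segments`, `start`, `end`, and `text` fields shown above.

```python
import json

# Sample payload copied from the response shape documented above.
sample = """{
  "text": "Transcribed text content...",
  "language": "en",
  "duration": 120.5,
  "segments": [{"start": 0.0, "end": 5.2, "text": "Segment text..."}]
}"""


def format_segments(response):
    """Return one '[start - end] text' line per segment."""
    return [
        f"[{seg['start']:.1f}s - {seg['end']:.1f}s] {seg['text']}"
        for seg in response.get("segments", [])
    ]


response = json.loads(sample)
for line in format_segments(response):
    print(line)
```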
Audio Understanding API
Endpoint: POST /v1/understand
Request Parameters
Parameter | Type | Required | Description |
---|---|---|---|
file | File | Yes | Audio file |
question | String | Yes | Question about the audio content |
Deployment
Local Deployment
Deploy Voxtral Mini locally for privacy and control:
System Requirements
- GPU: NVIDIA GPU with 10GB+ VRAM (RTX 3080 or better)
- RAM: 16GB+ system memory
- Storage: 20GB+ free space
- OS: Linux, macOS, or Windows
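A rough pre-flight check for the GPU and storage requirements might look like the following. It assumes `nvidia-smi` is on the `PATH` and treats 10GB as roughly 10,000 MiB. The helper names are illustrative. System RAM is not checked because the standard library has no portable call for it.

```python
import shutil
import subprocess


def parse_vram_mib(nvidia_smi_output):
    """Parse memory.total values (MiB, one GPU per line) into ints."""
    return [
        int(line.strip())
        for line in nvidia_smi_output.splitlines()
        if line.strip().isdigit()
    ]


def gpu_vram_mib():
    """Query total VRAM per GPU via nvidia-smi; return [] if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_vram_mib(out)


def free_disk_gb(path="."):
    """Free space on the filesystem containing path, in decimal GB."""
    return shutil.disk_usage(path).free / 1e9


vram = gpu_vram_mib()
print("GPU with 10GB+ VRAM:", any(mib >= 10_000 for mib in vram))
print("20GB+ free disk:", free_disk_gb() >= 20)
```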
Installation Steps
# Clone repository
git clone https://github.com/mistralai/voxtral.git
cd voxtral
# Install dependencies
pip install -r requirements.txt
# Download model
python -c "from voxtral import VoxtralMini; VoxtralMini.download()"
# Start server
python -m voxtral.server --port 8000
Docker Deployment
Run the prebuilt image with GPU access and expose the server on port 8000:
# Pull image
docker pull voxtral/voxtral-mini:latest
# Run container
docker run -d \
  --name voxtral \
  --gpus all \
  -p 8000:8000 \
  voxtral/voxtral-mini:latest
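Because the container starts in the background (`-d`), a client may need to wait until port 8000 accepts connections. The helper below is a generic TCP readiness poll, not a Voxtral feature. `wait_for_port` is an illustrative name, and an open port does not guarantee that the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port("localhost", 8000)` after `docker run` returns `True` as soon as the published port is reachable, or `False` if the timeout elapses first.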
Code Examples
Python Examples
Basic Transcription
import voxtral
# Initialize client
client = voxtral.Client(api_key="your-api-key")
# Transcribe file
with open("audio.mp3", "rb") as f:
    result = client.transcribe(f)
print(result.text)
Streaming Transcription
# Real-time transcription (reuses the client created above;
# audio_stream is your own audio source and is not defined here)
for chunk in client.transcribe_stream(audio_stream):
    print(chunk.text, end="", flush=True)
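The streaming snippet leaves `audio_stream` undefined. One possible way to produce it, assuming `transcribe_stream` accepts any iterable of raw byte chunks (this page does not specify the expected type), is a small generator over a binary file object. `iter_audio_chunks` is an illustrative helper, not part of the Voxtral package.

```python
import io


def iter_audio_chunks(fileobj, chunk_size=4096):
    """Yield successive byte chunks from a binary file object until EOF."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


# In-memory bytes stand in for an opened audio file in this demonstration.
audio_stream = iter_audio_chunks(io.BytesIO(b"\x00" * 10000), chunk_size=4096)
print([len(chunk) for chunk in audio_stream])
```

With a real file, `open("audio.mp3", "rb")` would take the place of the `io.BytesIO` object.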
JavaScript Examples
Browser Usage
import { Voxtral } from '@voxtral/ai';
const client = new Voxtral('your-api-key');
// Transcribe audio file
const file = document.getElementById('audio-file').files[0];
const result = await client.transcribe(file);
console.log(result.text);