Voxtral Documentation

Everything you need to integrate Voxtral speech understanding models into your applications, from quick-start guides to the full API reference.

Quick Start

Get up and running with Voxtral in minutes. Choose your preferred method:

API (Recommended)

Use our hosted API for the fastest setup:

curl -X POST https://api.gamealpaca.world/v1/transcribe \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "[email protected]"

Local Deployment

Run Voxtral Mini locally for privacy and control:

pip install voxtral
voxtral transcribe audio.mp3

Hugging Face

Download and use models directly:

from transformers import AutoModelForSpeechSeq2Seq
model = AutoModelForSpeechSeq2Seq.from_pretrained("mistralai/Voxtral-Mini")

Installation

Install Voxtral using your preferred package manager:

Python (pip)

pip install voxtral

Node.js (npm)

npm install @voxtral/ai

Docker

docker pull voxtral/voxtral-mini:latest

Basic Usage

Here's how to use Voxtral for common tasks:

Transcription

import voxtral

# Transcribe audio file
result = voxtral.transcribe("audio.mp3")
print(result.text)

# Transcribe with options
result = voxtral.transcribe(
    "audio.mp3",
    language="en",
    task="transcribe",
    timestamp_granularities=["word"]
)
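Timestamps are useful for building captions. The sketch below is illustrative only: it assumes segments follow the `{start, end, text}` shape shown in the API reference's example response, and `to_srt` is a hypothetical helper, not part of the SDK.

```python
def to_srt(segments):
    """Render transcription segments as SRT subtitle blocks."""
    def ts(seconds):
        # SRT timestamps look like HH:MM:SS,mmm
        ms = int(round(seconds * 1000))
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1_000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(f"{i}\n{ts(seg['start'])} --> {ts(seg['end'])}\n{seg['text']}")
    return "\n\n".join(blocks)

# Example segment matching the response schema in the API reference:
print(to_srt([{"start": 0.0, "end": 5.2, "text": "Segment text..."}]))
# 1
# 00:00:00,000 --> 00:00:05,200
# Segment text...
```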

Audio Understanding

# Ask questions about audio content
result = voxtral.understand(
    "audio.mp3",
    question="What is the main topic discussed?"
)

# Generate summary
summary = voxtral.summarize("audio.mp3")

Multilingual Support

# Automatic language detection
result = voxtral.transcribe("spanish_audio.mp3")

# Force specific language
result = voxtral.transcribe(
    "audio.mp3",
    language="es"
)

API Reference

Authentication

All API requests require authentication using your API key:

Authorization: Bearer YOUR_API_KEY

Get your API key from the Voxtral Dashboard.

Transcription API

Endpoint: POST /v1/transcribe

Request Parameters

Parameter   Type     Required   Description
---------   ------   --------   -----------
file        File     Yes        Audio file (MP3, WAV, M4A, etc.)
language    String   No         Language code (auto-detected if not provided)
task        String   No         "transcribe" or "translate"

Response

{
  "text": "Transcribed text content...",
  "language": "en",
  "duration": 120.5,
  "segments": [
    {
      "start": 0.0,
      "end": 5.2,
      "text": "Segment text..."
    }
  ]
}
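The response is plain JSON, so it can be handled with the standard library alone. This sketch parses the example response above; field names follow that example exactly.

```python
import json

# The example response from the API reference, verbatim.
response = json.loads("""{
  "text": "Transcribed text content...",
  "language": "en",
  "duration": 120.5,
  "segments": [
    {"start": 0.0, "end": 5.2, "text": "Segment text..."}
  ]
}""")

print(response["language"])  # "en"

# How much of the audio is covered by timestamped segments:
coverage = sum(s["end"] - s["start"] for s in response["segments"])
print(f"{coverage:.1f}s of {response['duration']}s segmented")
```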

Audio Understanding API

Endpoint: POST /v1/understand

Request Parameters

Parameter   Type     Required   Description
---------   ------   --------   -----------
file        File     Yes        Audio file
question    String   Yes        Question about the audio content

Deployment

Local Deployment

Deploy Voxtral Mini locally for privacy and control:

System Requirements

  • GPU: NVIDIA GPU with 10GB+ VRAM (RTX 3080 or better)
  • RAM: 16GB+ system memory
  • Storage: 20GB+ free space
  • OS: Linux, macOS, or Windows

Installation Steps

# Clone repository
git clone https://github.com/mistralai/voxtral.git
cd voxtral

# Install dependencies
pip install -r requirements.txt

# Download model
python -c "from voxtral import VoxtralMini; VoxtralMini.download()"

# Start server
python -m voxtral.server --port 8000
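Once the server is running, you can send it audio over HTTP. The sketch below assumes the local server mirrors the hosted `/v1/transcribe` route; the multipart body is assembled with the standard library only, and `multipart_body` is a hypothetical helper for illustration.

```python
import io
import urllib.request
import uuid

def multipart_body(field, filename, data):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode()
    )
    body.write(data)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"

body, content_type = multipart_body("file", "audio.mp3", b"...raw mp3 bytes...")
req = urllib.request.Request(
    "http://localhost:8000/v1/transcribe",
    data=body,
    headers={"Content-Type": content_type},
)
# urllib.request.urlopen(req) sends the request once the server is up.
```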

Docker Deployment

Use Docker for easy deployment:

# Pull image
docker pull voxtral/voxtral-mini:latest

# Run container
docker run -d \
  --name voxtral \
  --gpus all \
  -p 8000:8000 \
  voxtral/voxtral-mini:latest

Code Examples

Python Examples

Basic Transcription

import voxtral

# Initialize client
client = voxtral.Client(api_key="your-api-key")

# Transcribe file
with open("audio.mp3", "rb") as f:
    result = client.transcribe(f)
    print(result.text)

Streaming Transcription

# Real-time transcription
for chunk in client.transcribe_stream(audio_stream):
    print(chunk.text, end="", flush=True)
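To show what consuming the stream looks like end to end, here is an illustrative sketch: `Chunk` is a stand-in for whatever object the SDK actually yields, and the input list simulates a live stream.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """Stand-in for a streamed transcription chunk."""
    text: str

def collect(chunks):
    """Join the text of streamed chunks into one transcript."""
    return "".join(c.text for c in chunks)

transcript = collect([Chunk("Hello, "), Chunk("world.")])
print(transcript)  # Hello, world.
```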

JavaScript Examples

Browser Usage

import { Voxtral } from '@voxtral/ai';

const client = new Voxtral('your-api-key');

// Transcribe audio file
const file = document.getElementById('audio-file').files[0];
const result = await client.transcribe(file);
console.log(result.text);

Ready to Get Started?

Join thousands of developers building with Voxtral. Get your API key and start integrating speech AI today.