API Reference

Base URL: https://api.leanvox.com

Try endpoints interactively in the OpenAPI playground.

CLI

Generate speech directly from your terminal with lvox.

Install

macOS / Linux (Homebrew)

brew install leanvox/tap/lvox

macOS / Linux (Shell)

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/leanvox/lvox/releases/latest/download/lvox-installer.sh | sh

Windows (PowerShell)

irm https://github.com/leanvox/lvox/releases/latest/download/lvox-installer.ps1 | iex

Login

lvox auth login YOUR_API_KEY

Saves your API key to ~/.lvox/config.toml. Alternatively, set the LEANVOX_API_KEY environment variable.

Generate Speech

# Basic usage
lvox generate "Hello world!" -o hello.mp3

# With voice and model
lvox gen "Welcome to Leanvox!" --voice af_heart --model standard -o welcome.mp3

# Pro model with emotion
lvox gen "This is amazing! [laugh]" --model pro --voice podcast_conversational_female -o emotion.mp3

# From text file
lvox gen --file script.txt -o narration.mp3

# From EPUB (book → audiobook!)
lvox gen --file my-book.epub --voice af_heart -o audiobook.mp3

# Pipe from stdin
echo "Hello from stdin" | lvox gen -o piped.mp3

# Long text (>10K chars) — async generation
lvox gen --use-async --file novel-chapter.txt --model pro --voice podcast_conversational_female -o chapter.mp3
# Submits async job → polls for progress → downloads audio when done

Streaming

# Stream audio (writes chunks as they arrive)
lvox stream "Hello world!" -o hello.mp3

Dialogue

# Multi-speaker dialogue from JSON file
lvox dialogue --file dialogue.json --model pro -o podcast.mp3

# With custom gap between speakers
lvox dialogue --file dialogue.json --gap-ms 400 -o output.mp3

# Inline JSON
lvox dialogue '[{"text":"Hi!","voice":"podcast_conversational_female"},{"text":"Hello!","voice":"podcast_casual_male"}]' -o chat.mp3

Transcription

# Transcribe audio with speaker diarization
lvox transcribe meeting.mp3

# With summary and action items
lvox transcribe meeting.mp3 --features '["transcript","diarization","summary"]'

# Specify language and speaker count
lvox transcribe podcast.mp3 --language en --num-speakers 2

# Schedule a background transcription and exit
lvox transcribe long-meeting.mp3 --no-wait

# Force a background job, then wait for completion
lvox transcribe meeting.mp3 --force-async --wait

# JSON output (for piping to dialogue)
lvox transcribe audio.mp3 --json > transcript.json

Async Generation (Long-Form)

Sync generation supports up to 10,000 Unicode characters. For longer texts, pass --use-async: the text is automatically split into 2K-character chunks, each chunk is generated separately, and the results are concatenated into a single audio file.

# Generate from a long text file (auto-chunked)
lvox generate --use-async --file novel-chapter.txt --model pro --voice podcast_conversational_female -o chapter.mp3
# ℹ Submitting async job (18000 characters)...
#   Job ID: 01a816cb-5e2d-4a1e-aca5-64e0e8a9f9d6
#   Est. Cost: $0.18
#   Processing (5/10)...
# ✓ Audio saved to chapter.mp3

# Without --use-async flag, texts >10K chars show a helpful error:
lvox generate --file long-book.txt --model pro
# ✗ Text is 52,000 characters (max 10,000 for sync generation).
#   Add --use-async for texts longer than 10,000 characters
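
The chunking described above can be pictured with a short Python sketch. The server's actual splitting rules are not documented here, and the sample output above reports 10 chunks for 18,000 characters, so the real splitter evidently picks different boundaries. The function name split_into_chunks and the whitespace-preferring strategy are assumptions made for illustration only.

```python
def split_into_chunks(text, max_chars=2000):
    """Split text into pieces of at most max_chars Unicode characters.

    Prefers to break after the last space inside the window and falls back
    to a hard cut when the window contains no space. Illustrative only.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Look for a space to break on inside the current window.
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut + 1  # keep the space with the left-hand chunk
        chunks.append(text[start:end])
        start = end
    return chunks


sample = "word " * 3600  # 18,000 characters of filler text
chunks = split_into_chunks(sample)
```

Joining the chunks reproduces the input exactly, which is the property that matters when the generated audio segments are concatenated in order.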

Manage Async Jobs

# List all async jobs
lvox jobs list
# ID                                     TYPE   STATUS       MODEL        PROGRESS   DETAIL                 CREATED
# 01a816cb-...                           tts    completed    pro          10/10      podcast_interview_f    2026-03-10T09:55
# 8b31de77-...                           stt    processing   transcribe   —          STT en                 2026-05-11T09:55

# List only transcription jobs
lvox jobs list --type stt

# Get job details (progress, audio URL, cost)
lvox jobs get 01a816cb-5e2d-4a1e-aca5-64e0e8a9f9d6
# ID: 01a816cb-5e2d-4a1e-aca5-64e0e8a9f9d6
# Status: completed
# Progress: 10/10 chunks
# Cost: 18¢

# JSON output
lvox jobs list --json

Other Commands

# List voices
lvox voices list
lvox voices list --model pro
lvox voices curated

# Check balance
lvox balance
# Balance: $250.64
# Today — Requests: 14 | Characters: 198.1K | Cost: $2.01

# View usage history
lvox usage --days 7

# List past generations
lvox generations

# JSON output (all commands)
lvox balance --json

Full CLI reference: lvox --help or lvox <command> --help

Authentication

All API requests require an API key passed via the Authorization header.

Authorization: Bearer lv_live_YOUR_API_KEY

Generate API keys from the dashboard. Keys use the lv_live_ prefix and are hashed server-side, so a key cannot be retrieved later; save it when it is created.
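
As a rough illustration of the header format, the following Python sketch builds (but does not send) an authenticated request using only the standard library. The helper name build_request and its fallback to the LEANVOX_API_KEY environment variable are illustrative choices, not part of an official client.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.leanvox.com"  # base URL from the top of this reference


def build_request(path, payload, api_key=None):
    """Build a POST request carrying the Bearer API key (hypothetical helper)."""
    key = api_key or os.environ.get("LEANVOX_API_KEY", "")
    if not key.startswith("lv_live_"):
        raise ValueError("expected an API key with the lv_live_ prefix")
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "/v1/tts/generate",
    {"text": "Hello world!"},
    api_key="lv_live_YOUR_API_KEY",
)
```

Passing req to urllib.request.urlopen would send it; the sketch stops short of that so it can run without network access or a real key.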

Register

POST /v1/auth/register

{ "email": "user@example.com", "password": "your-password" }

Login

POST /v1/auth/login

{ "email": "user@example.com", "password": "your-password" }

Returns: { "token": "jwt...", "user": { "id", "email" } }

Generate Speech

POST /v1/tts/generate

{
  "text": "Hello world!",
  "model": "standard",     // "standard", "pro", or "max"
  "voice": "af_heart",     // voice ID
  "language": "en",        // ISO 639-1 code
  "format": "mp3",         // optional: "mp3" (default) or "wav"
  "exaggeration": 0.5      // optional (pro only): 0.0 - 1.0
}

For model: "max", send voice_instructions instead of voice.

Response

{
  "audio_url": "https://cdn.leanvox.com/audio/abc123.mp3",
  "model": "standard",
  "voice": "af_heart",
  "characters": 12,
  "cost_cents": 0,
  "cost_billing_units": 1,
  "cost_display": "0.05¢",
  "balance_cents": 100,
  "balance_billing_units": 1999,
  "balance_display": "99.95¢"
}

Models

Model	Best For	Price	Features
standard	Everyday TTS	$0.005/1K (~1 min)	Fast, 54+ voices, 8 languages
pro	Emotions & expressions	$0.01/1K (~1 min)	Voice cloning, emotion tags, 23 languages
max	Creative voice control	$0.03/1K (~1 min)	Natural language instructions, ~400ms TTFA, 18 languages

Note: Each request is billed for at least 100 characters. Requests under 100 characters are rounded up to 100.
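
To make the pricing arithmetic concrete, here is a small Python sketch that combines the per-1K prices from the table with the 100-character minimum. It is an estimate only: any rounding the billing system applies is not specified here, and the function name estimate_cost_cents is hypothetical.

```python
from decimal import Decimal

# Prices in cents per 1,000 characters, taken from the Models table.
PRICE_CENTS_PER_1K = {
    "standard": Decimal("0.5"),
    "pro": Decimal("1"),
    "max": Decimal("3"),
}
MIN_BILLED_CHARS = 100


def estimate_cost_cents(model, characters):
    """Estimate the charge in cents, applying the 100-character minimum."""
    billed = max(characters, MIN_BILLED_CHARS)
    return PRICE_CENTS_PER_1K[model] * billed / 1000


short = estimate_cost_cents("standard", 12)   # billed as 100 characters
long_pro = estimate_cost_cents("pro", 18000)  # 18 x 1K characters
```

For the 12-character request in the response example, the estimate comes to 0.05 cents, which agrees with the cost_display value shown there.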

Streaming

Stream audio as it's generated instead of waiting for the full response. Supports Standard, Pro, and Max models.

POST /v1/tts/stream

{
  "text": "Hello world!",
  "model": "standard",     // "standard", "pro", or "max"
  "voice": "af_heart",     // voice ID
  "language": "en",        // ISO 639-1 code
  "exaggeration": 0.5      // optional (pro only): 0.0 - 1.0
}

Same request body as /v1/tts/generate. For max, use voice_instructions. Streaming always returns MP3.

Response

Returns a raw audio/mpeg byte stream. Cost and billing metadata are in the response headers:

Header	Description
`X-Leanvox-Request-Id`	Unique request identifier
`X-Leanvox-Cost-Cents`	Credits charged (in cents)
`X-Leanvox-Cost-Display`	Exact charge display, including sub-cent usage like `0.05¢`
`X-Leanvox-Balance-Cents`	Remaining balance after charge
`X-Leanvox-Balance-Display`	Exact remaining balance display, including sub-cent carry
`X-Leanvox-Characters`	Character count of input text
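
One way to consume these headers is to collect them into a small record. The Python sketch below works on any mapping of header names to values; the StreamBilling class, the parse_billing_headers name, the assumption that the cents and character headers are integers, and the sample values are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class StreamBilling:
    request_id: str
    cost_cents: int
    cost_display: str
    balance_cents: int
    balance_display: str
    characters: int


def parse_billing_headers(headers):
    """Extract billing metadata from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    return StreamBilling(
        request_id=h["x-leanvox-request-id"],
        cost_cents=int(h["x-leanvox-cost-cents"]),  # assumed to be an integer
        cost_display=h["x-leanvox-cost-display"],
        balance_cents=int(h["x-leanvox-balance-cents"]),
        balance_display=h["x-leanvox-balance-display"],
        characters=int(h["x-leanvox-characters"]),
    )


# Made-up header values, used only to exercise the parser.
billing = parse_billing_headers({
    "X-Leanvox-Request-Id": "req_example",
    "X-Leanvox-Cost-Cents": "0",
    "X-Leanvox-Cost-Display": "0.05¢",
    "X-Leanvox-Balance-Cents": "100",
    "X-Leanvox-Balance-Display": "99.95¢",
    "X-Leanvox-Characters": "12",
})
```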

Example (curl)

curl -N -X POST https://api.leanvox.com/v1/tts/stream \
  -H "Authorization: Bearer lv_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world!", "model": "standard", "voice": "af_heart"}' \
  --output speech.mp3

Example (JavaScript)

const res = await fetch("https://api.leanvox.com/v1/tts/stream", {
  method: "POST",
  headers: {
    "Authorization": "Bearer lv_live_YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ text: "Hello world!", model: "standard" }),
});

// Simplest approach: buffer the whole response as a Blob.
// For progressive playback, read res.body incrementally instead (e.g. via the MediaSource API).
const blob = await res.blob();
const audioUrl = URL.createObjectURL(blob);

Dialogue

Generate multi-speaker dialogue in a single request. Supports Standard, Pro, and Max models.

POST /v1/tts/dialogue

{
  "model": "pro",
  "lines": [
    { "text": "Hi there!", "voice": "podcast_conversational_female", "language": "en" },
    { "text": "Hello! [laugh]", "voice": "podcast_casual_male", "language": "en", "exaggeration": 0.7 }
  ],
  "gap_ms": 500    // silence between lines (default 500)
}

File Upload

Extract plain text from .txt or .epub files so you can pipe the result straight into TTS generation. Useful for narrating long documents, ebooks, or anything you'd rather not paste by hand.

The CLI handles this automatically — pass any .epub or .txt file via --file and it extracts the text locally before sending. The API endpoint below is for web clients or custom integrations that need server-side extraction.

Extract Text from File

POST /v1/files/extract-text

Multipart form upload. Auth required. Max 5 MB. Returns up to 500,000 characters.

// Request: multipart/form-data with field "file"
// Accepted types: .txt, .epub

// Response
{
  "text": "Chapter 1. It was a dark and stormy night...",
  "filename": "my-book.epub",
  "char_count": 142830,
  "truncated": false   // true if file exceeded 500K chars
}

Examples

cURL

curl -X POST https://api.leanvox.com/v1/files/extract-text \
  -H "Authorization: Bearer lv_live_..." \
  -F "file=@my-book.epub"

JavaScript (browser)

const form = new FormData();
form.append('file', fileInput.files[0]);

const res = await fetch('https://api.leanvox.com/v1/files/extract-text', {
  method: 'POST',
  headers: { Authorization: 'Bearer lv_live_...' },
  body: form,
});

const { text } = await res.json();
// Now pass text to /v1/tts/generate or /v1/tts/stream

CLI (handles extraction automatically)

# .txt file
lvox gen --file chapter.txt --voice af_heart -o chapter.mp3

# .epub file — parsed locally, spine-ordered, HTML stripped
lvox gen --file my-book.epub --voice af_heart -o audiobook.mp3

Voices

List Voices

GET /v1/voices?model=standard

Curated Pro Voices

GET /v1/voices/curated

Returns 40 curated AI-designed Pro voices across 11 categories with preview audio URLs.

Clone Voice (Pro)

POST /v1/voices/clone

{
  "name": "My Voice",
  "audio_base64": "<base64-encoded WAV>",
  "description": "optional description"
}

Upload a 5-30 second WAV clip. Returns a voice_id for use with the Pro model.
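
A client can check the clip length before uploading. The Python sketch below reads the duration with the standard wave module and builds the JSON body shown above; the helper names build_clone_payload and silent_wav are hypothetical, and the silent clip exists only so the example runs without a real recording.

```python
import base64
import io
import wave


def build_clone_payload(wav_bytes, name, description=None):
    """Validate clip length and return the JSON body for /v1/voices/clone."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    if not 5 <= duration <= 30:
        raise ValueError(f"clip is {duration:.1f}s; expected 5-30 seconds")
    payload = {
        "name": name,
        "audio_base64": base64.b64encode(wav_bytes).decode("ascii"),
    }
    if description:
        payload["description"] = description
    return payload


def silent_wav(seconds, rate=16000):
    """Create a mono 16-bit WAV of silence, used here only as test input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


payload = build_clone_payload(silent_wav(6), "My Voice")
```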

Unlock Voice

POST /v1/voices/{voice_id}/unlock

Unlock a cloned voice for TTS use. Free — no credit charge.

Design Voice

POST /v1/voices/design

{
  "name": "Deep Narrator",
  "description": "A deep, warm male voice for audiobooks",
  "language": "en",
  "notes": "optional extra guidance"
}

AI-generates a custom voice from a text description. Free — no credit charge.

Delete Voice

DELETE /v1/voices/{voice_id}

Generations

Browse and manage your past TTS generations.

List Generations

GET /v1/generations?limit=20&offset=0

{
  "generations": [
    {
      "id": "uuid",
      "generation_type": "tts",
      "input_text": "Hello world!",
      "model": "standard",
      "voice": "af_heart",
      "format": "mp3",
      "characters": 12,
      "cost_cents": 1,
      "created_at": "2025-01-15T10:30:00Z"
    }
  ],
  "total": 42,
  "limit": 20,
  "offset": 0
}
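
The limit, offset, and total fields are enough to walk the full history page by page. The Python sketch below keeps the HTTP call behind a caller-supplied fetch_page function so it can run offline; iter_generations, fetch_page, and the fake data are illustrative names, not SDK features.

```python
def iter_generations(fetch_page, limit=20):
    """Yield every generation by walking limit/offset pages.

    fetch_page(limit, offset) is a caller-supplied function that performs
    GET /v1/generations and returns the decoded JSON body.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["generations"]
        yield from items
        offset += len(items)
        # Stop on an empty page or once every item has been seen.
        if not items or offset >= page["total"]:
            break


# Stand-in for the HTTP call so the sketch runs without network access.
FAKE = [{"id": f"gen-{i}"} for i in range(42)]


def fake_fetch(limit, offset):
    return {
        "generations": FAKE[offset:offset + limit],
        "total": len(FAKE),
        "limit": limit,
        "offset": offset,
    }


all_ids = [g["id"] for g in iter_generations(fake_fetch)]
```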

Get Generation Audio

GET /v1/generations/{id}/audio

{ "audio_url": "https://cdn.leanvox.com/audio/abc123.mp3?token=..." }

Returns a presigned URL valid for 1 hour.

Delete Generation

DELETE /v1/generations/{id}

Permanently deletes the generation and its audio file from storage.

Async Jobs

For long texts and large transcription files, Leanvox uses async jobs. You get a job ID immediately and poll /v1/jobs/{job_id} for status and results.

Note: The 10,000-character limit counts Unicode characters, not bytes, so non-ASCII text (e.g., Chinese, Arabic, accented Latin) is not penalized for its multi-byte encoding. For texts exceeding the limit, use async jobs, which handle chunking automatically.
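
The difference between characters and bytes is easy to see in Python, where len() on a string counts code points. The sketch below assumes the service counts code points the same way; how it treats combining sequences or emoji is not stated here, and needs_async is a hypothetical helper.

```python
SYNC_LIMIT = 10_000  # maximum characters for synchronous generation


def needs_async(text):
    """Return True when the text exceeds the sync limit, counted in code points."""
    return len(text) > SYNC_LIMIT


sample = "你好" * 5000  # 10,000 characters
byte_length = len(sample.encode("utf-8"))  # each character takes 3 bytes in UTF-8
```

Here the text is three times larger in bytes than in characters, yet it still fits within the sync limit.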

Create Async Job

POST /v1/tts/generate/async

Same body as /v1/tts/generate, plus an optional "webhook_url" for a completion callback.

Check Job Status

GET /v1/jobs/{job_id}

{
  "job_id": "uuid",
  "job_type": "tts" | "stt",
  "status": "pending" | "processing" | "completed" | "failed",
  "audio_url": "...",      // when completed
  "result": { ... },       // completed STT jobs only
  "error": "...",          // when failed
  "created_at": "..."
}
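
A polling loop over these statuses might look like the Python sketch below. The HTTP call is passed in as fetch_job so the example runs offline; the function name wait_for_job, the 2-second interval, and the 10-minute timeout are assumptions rather than documented recommendations.

```python
import time


def wait_for_job(fetch_job, job_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job is completed or failed, then return its final body."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)  # performs GET /v1/jobs/{job_id}
        if job["status"] == "completed":
            return job
        if job["status"] == "failed":
            raise RuntimeError(job.get("error", "job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)


# Offline stand-in that walks through the documented statuses.
_responses = iter([
    {"job_id": "demo", "status": "pending"},
    {"job_id": "demo", "status": "processing"},
    {"job_id": "demo", "status": "completed", "audio_url": "https://example.com/demo.mp3"},
])
job = wait_for_job(lambda _id: next(_responses), "demo", sleep=lambda s: None)
```

Injecting sleep keeps the loop testable; in real use the default time.sleep applies.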

Legacy /v1/tts/jobs endpoints remain available for older TTS clients.

Account

Get Balance

GET /v1/account/balance

{ "balance_cents": 450, "total_spent_cents": 50 }

Usage History

GET /v1/account/usage?days=30&model=standard&limit=100

Buy Credits

POST /v1/account/credits

{ "amount_cents": 2000 }

Returns a Stripe checkout URL. Bonus credit by tier: $5 (none), $20 (+10%), $50 (+15%), $100 (+20%).
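
The tier arithmetic can be sketched in Python as follows. This assumes the bonus is granted as extra credit on top of the purchased amount and that only the four listed amounts qualify; neither detail is confirmed here, and credited_cents is a hypothetical helper.

```python
# Bonus percentage for each listed purchase tier, keyed by amount in cents.
TIER_BONUS_PERCENT = {500: 0, 2000: 10, 5000: 15, 10000: 20}


def credited_cents(amount_cents):
    """Return purchase plus bonus, assuming the bonus is added as extra credit."""
    bonus = TIER_BONUS_PERCENT[amount_cents]  # KeyError for amounts outside the tiers
    return amount_cents + amount_cents * bonus // 100


twenty_dollar_credit = credited_cents(2000)
```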

Errors

All errors follow a consistent format:

{ "error": { "code": "error_code", "message": "Human-readable message" } }

HTTP	Code	Meaning
400	invalid_request	Bad parameters
401	invalid_api_key	Missing or invalid key
402	insufficient_balance	Not enough credits
404	not_found	Resource not found
429	rate_limit_exceeded	Rate limit exceeded: 10 requests/min (free) or 60 requests/min (paid)
500	server_error	Internal error
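
Because every error shares the same envelope, a client can decode it in one place. The Python sketch below parses the body and flags which errors are worth retrying; treating 429 and 500 as retryable is a common convention assumed here, not a documented policy, and parse_error is a hypothetical helper.

```python
import json

# Assumption: only these codes are worth retrying after a delay.
RETRYABLE_CODES = {"rate_limit_exceeded", "server_error"}


def parse_error(status, body):
    """Return (code, message, retryable) from an error response body."""
    error = json.loads(body)["error"]
    retryable = status in (429, 500) and error["code"] in RETRYABLE_CODES
    return error["code"], error["message"], retryable


code, message, retryable = parse_error(
    429,
    '{"error": {"code": "rate_limit_exceeded", "message": "Too many requests"}}',
)
```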

Python SDK

Official Python client for the Leanvox API. Sync and async support, zero config.

Install

pip install leanvox

Quick Start

Sync

from leanvox import Leanvox

client = Leanvox()  # uses LEANVOX_API_KEY env var

result = client.generate(text="Hello from Leanvox!")
print(result.audio_url)

# Save directly to file
result.save("hello.mp3")

Async

from leanvox import AsyncLeanvox

async with AsyncLeanvox() as client:
    result = await client.generate(text="Hello async world!")
    print(result.audio_url)

Generate Speech

result = client.generate(
    text="Welcome to the future of voice.",
    model="pro",         # "standard", "pro", or "max"
    voice="podcast_conversational_female",
    language="en",
    exaggeration=0.5,    # pro only: 0.0 - 1.0
)
result.save("welcome.mp3")

For model="max", use voice_instructions="..." instead of voice.

Stream Audio

with client.stream(text="A long narration about the universe...") as stream:
    with open("narration.mp3", "wb") as f:
        for chunk in stream:
            f.write(chunk)

Streaming always outputs MP3.

Dialogue

result = client.dialogue(
    model="pro",
    lines=[
        {"text": "Welcome to the podcast!", "voice": "podcast_conversational_female"},
        {"text": "Thanks for having me.", "voice": "podcast_casual_male"},
        {"text": "Let's dive right in.", "voice": "podcast_conversational_female"},
    ],
    gap_ms=400,
)
result.save("podcast.mp3")

Authentication

# Priority: constructor → env var → config file
client = Leanvox(api_key="lv_live_...")       # 1. Explicit
# export LEANVOX_API_KEY="lv_live_..."        # 2. Env var
# ~/.lvox/config.toml                         # 3. Config file

Error Handling

from leanvox import (
    Leanvox,
    InsufficientBalanceError,
    RateLimitError,
    LeanvoxError,
)

client = Leanvox()  # uses LEANVOX_API_KEY env var

try:
    result = client.generate(text="Hello!")
except InsufficientBalanceError as e:
    print(f"Low balance: ${e.balance_cents / 100:.2f}")
except RateLimitError as e:
    print(f"Retry after {e.retry_after}s")
except LeanvoxError as e:
    print(f"API error [{e.code}]: {e.message}")

PyPI · GitHub

Node.js SDK

Official Node.js/TypeScript client. Full type safety, zero runtime dependencies, Node.js 18+.

Install

npm install leanvox

Quick Start

import { Leanvox } from "leanvox";

const client = new Leanvox();  // uses LEANVOX_API_KEY env var

const result = await client.generate({ text: "Hello from Leanvox!" });
console.log(result.audioUrl);

// Save directly to file
await result.save("hello.mp3");

Generate Speech

const result = await client.generate({
  text: "Welcome to the future of voice.",
  model: "pro",         // "standard", "pro", or "max"
  voice: "podcast_conversational_female",
  language: "en",
  exaggeration: 0.5,    // pro only: 0.0 - 1.0
});
await result.save("welcome.mp3");

For model: "max", use voiceInstructions instead of voice.

Stream Audio

import { createWriteStream } from "fs";

const stream = await client.stream({
  text: "A long narration about the universe...",
  voice: "af_heart",
});

const writer = createWriteStream("narration.mp3");
for await (const chunk of stream) {
  writer.write(chunk);
}
writer.end();

Streaming always outputs MP3.

Dialogue

const result = await client.dialogue({
  model: "pro",
  lines: [
    { text: "Welcome to the podcast!", voice: "podcast_conversational_female" },
    { text: "Thanks for having me.", voice: "podcast_casual_male" },
    { text: "Let's dive right in.", voice: "podcast_conversational_female" },
  ],
  gapMs: 400,
});
await result.save("podcast.mp3");

Authentication

// Priority: constructor → env var → config file
const client = new Leanvox({ apiKey: "lv_live_..." }); // 1. Explicit
// export LEANVOX_API_KEY="lv_live_..."                 // 2. Env var
// ~/.lvox/config.toml                                  // 3. Config file

Error Handling

import {
  Leanvox,
  InsufficientBalanceError,
  RateLimitError,
  LeanvoxError,
} from "leanvox";

const client = new Leanvox();  // uses LEANVOX_API_KEY env var

try {
  const result = await client.generate({ text: "Hello!" });
} catch (e) {
  if (e instanceof InsufficientBalanceError) {
    console.log(`Low balance: $${(e.balanceCents / 100).toFixed(2)}`);
  } else if (e instanceof RateLimitError) {
    console.log(`Retry after ${e.retryAfter}s`);
  } else if (e instanceof LeanvoxError) {
    console.log(`API error [${e.code}]: ${e.message}`);
  }
}

npm · GitHub

Max — Instruction-Based TTS

Describe the voice you want in natural language. No presets, no voice IDs — just tell it what to sound like.

💡 Sweet spot: 50–120 characters describing gender + tone + pace. Example: "Warm, confident female narrator, speaking slowly"

Python

audio = client.generate(
    text="Welcome to the future of speech.",
    model="max",
    voice_instructions="Warm, confident male narrator with a deep tone",
    language="en",
)
audio.save("output.mp3")

# Reuse the generated voice
voice_id = audio.generated_voice_id
audio2 = client.generate(text="Part two.", model="max", voice=voice_id)

Node.js

const audio = await client.generate({
  model: 'max',
  text: 'Welcome to the future of speech.',
  voiceInstructions: 'Warm, confident male narrator with a deep tone',
  language: 'en',
});
await audio.save('output.mp3');

cURL

curl -X POST https://api.leanvox.com/v1/tts/generate \
  -H "Authorization: Bearer lv_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "max", "text": "Welcome to the future of speech.", "voice_instructions": "Warm, confident male narrator with a deep tone", "language": "en"}'

Dialogue with Instructions

audio = client.dialogue(
    model="max",
    lines=[
        {"text": "Welcome back!", "voice_instructions": "Energetic male host, upbeat"},
        {"text": "Great to be here.", "voice_instructions": "Calm older female academic"},
    ],
    gap_ms=400,
)

Transcribe Audio

Convert speech to text with automatic speaker diarization. Supports 99 languages and files up to 500 MB. Diarization is always free.

Endpoint

POST /v1/audio/transcribe

Content-Type: multipart/form-data

Parameters

Parameter	Type	Required	Description
`file`	file	✅	Audio file (mp3, wav, ogg, flac, m4a, webm)
`features`	string	—	JSON array: `["transcript","diarization","summary"]`
`language`	string	—	ISO 639-1 code (auto-detect if omitted)
`force_async`	boolean	—	Set `true` to schedule a background STT job
`num_speakers`	integer	—	Expected speaker count (improves accuracy)

Example (cURL)

curl -X POST https://api.leanvox.com/v1/audio/transcribe \
  -H "Authorization: Bearer lv_live_YOUR_API_KEY" \
  -F "file=@meeting.mp3"

With Summarization

curl -X POST https://api.leanvox.com/v1/audio/transcribe \
  -H "Authorization: Bearer lv_live_YOUR_API_KEY" \
  -F "file=@meeting.mp3" \
  -F 'features=["transcript","diarization","summary"]'

Response

Short files return 200 OK with the transcript. Large files, or requests with force_async=true, return 202 Accepted with a job ID.

{
  "id": "txn_abc123",
  "duration_seconds": 120.5,
  "language": "en",
  "confidence": 0.94,
  "formatted_transcript": "[SPEAKER_00] Welcome to the show.\n[SPEAKER_01] Thanks for having me.",
  "transcript": {
    "text": "Welcome to the show. Thanks for having me.",
    "segments": [
      {
        "start": 0.0, "end": 2.1,
        "text": "Welcome to the show.",
        "speaker": "SPEAKER_00", "confidence": 0.96
      },
      {
        "start": 2.4, "end": 4.8,
        "text": "Thanks for having me.",
        "speaker": "SPEAKER_01", "confidence": 0.93
      }
    ]
  },
  "speakers": { "count": 2, "labels": ["SPEAKER_00", "SPEAKER_01"] },
  "usage": {
    "duration_minutes": 2.01,
    "tier": "transcribe",
    "cost_cents": 1,
    "balance_cents": 99
  }
}
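
The relationship between segments and formatted_transcript can be reproduced client-side. The Python sketch below rebuilds the speaker-labelled text from the segment list and totals speaking time per speaker; both helper names are hypothetical, and emitting one line per segment is an assumption based on the example above.

```python
from collections import defaultdict


def format_transcript(segments):
    """Render one '[SPEAKER] text' line per segment."""
    return "\n".join(f"[{seg['speaker']}] {seg['text']}" for seg in segments)


def speaking_time(segments):
    """Sum segment durations in seconds for each speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


# Segments copied from the response example.
segments = [
    {"start": 0.0, "end": 2.1, "text": "Welcome to the show.",
     "speaker": "SPEAKER_00", "confidence": 0.96},
    {"start": 2.4, "end": 4.8, "text": "Thanks for having me.",
     "speaker": "SPEAKER_01", "confidence": 0.93},
]
formatted = format_transcript(segments)
times = speaking_time(segments)
```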

Async Response

{
  "job_id": "7f0e...",
  "status": "pending",
  "job_type": "stt",
  "poll_url": "/v1/jobs/7f0e...",
  "message": "Your transcription job is scheduled. Track progress with poll_url."
}
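
Client code has to handle both outcomes of the same call. The Python sketch below branches on the status code and resolves the relative poll_url against the API base URL; interpret_transcribe_response and the job ID demo-job are illustrative.

```python
from urllib.parse import urljoin

BASE_URL = "https://api.leanvox.com"


def interpret_transcribe_response(status, body):
    """Classify a /v1/audio/transcribe response as finished or queued."""
    if status == 200:
        return "done", body["formatted_transcript"]
    if status == 202:
        # poll_url is relative, so resolve it against the API base URL.
        return "poll", urljoin(BASE_URL, body["poll_url"])
    raise ValueError(f"unexpected status {status}")


state, target = interpret_transcribe_response(
    202,
    {"job_id": "demo-job", "status": "pending", "job_type": "stt",
     "poll_url": "/v1/jobs/demo-job"},
)
```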

Poll for Completion

curl https://api.leanvox.com/v1/jobs/7f0e... \
  -H "Authorization: Bearer lv_live_YOUR_API_KEY"

Summary Response

When summary is included in features:

"summary": {
  "text": "A discussion about the Q1 product roadmap...",
  "action_items": [
    "Follow up on pricing by Friday",
    "Schedule design review"
  ],
  "topics": ["roadmap", "pricing", "design"]
}

Python SDK

from leanvox import Leanvox

client = Leanvox(api_key="lv_live_YOUR_API_KEY")

result = client.transcribe("meeting.mp3", diarization=True)
print(result.formatted_transcript)

# With summary
result = client.transcribe(
    "meeting.mp3",
    diarization=True,
    summary=True
)
print(result.summary["text"])

Node.js SDK

import { Leanvox } from 'leanvox';
import { createReadStream } from 'fs';

const client = new Leanvox({ apiKey: 'lv_live_YOUR_API_KEY' });

const result = await client.audio.transcribe({
  file: createReadStream('meeting.mp3'),
});
console.log(result.formattedTranscript);

Pricing

Tier	Features	Price/min
Transcribe	Transcript + speaker diarization	$0.002
Transcribe + Summarize	+ AI summary, action items, topics	$0.005

Speaker diarization is always included at no charge. Free signup includes 200+ minutes of TTS and 500 minutes of STT. Minimum billed duration: 10 seconds.
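
The per-minute prices and the 10-second minimum combine as in the Python sketch below. It returns exact fractional cents and does not model whatever rounding the billing system applies; the tier key transcribe_summarize and the function name are hypothetical.

```python
from fractions import Fraction

# Prices in cents per minute, from the pricing table.
PRICE_CENTS_PER_MIN = {
    "transcribe": Fraction(1, 5),            # $0.002 per minute
    "transcribe_summarize": Fraction(1, 2),  # $0.005 per minute
}
MIN_BILLED_SECONDS = 10


def transcription_cost_cents(duration_seconds, tier="transcribe"):
    """Estimate the charge in cents, applying the 10-second minimum."""
    billed = max(Fraction(duration_seconds), MIN_BILLED_SECONDS)
    return PRICE_CENTS_PER_MIN[tier] * billed / 60


one_hour = transcription_cost_cents(3600)
one_hour_with_summary = transcription_cost_cents(3600, "transcribe_summarize")
```

Under these assumptions an hour of audio costs 12 cents to transcribe, or 30 cents with summarization.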

Supported Formats

mp3, wav, ogg, flac, m4a, webm — up to 500MB. 99 languages via Whisper V3.

Voice-Over Workflow

Re-voice existing audio by transcribing with speaker diarization, then generating dialogue with custom voices. Perfect for podcast re-hosting, dubbing, or replacing AI-generated audio with different voices.

Step 1: Transcribe with Diarization

Extract the transcript with speaker labels. Diarization is always free — it identifies who said what.

# Python
result = client.transcribe("podcast.mp3", diarization=True)

# result.segments contains timestamped lines with speaker IDs
segments = result.segments

Step 2: Map Speakers to Voices

Assign a voice to each speaker. Use any standard voice, Pro curated voice, or your own cloned/designed voices.

# Python
speaker_map = {
    "SPEAKER_00": "podcast_conversational_female",  # Host → Chat Host (Pro)
    "SPEAKER_01": "podcast_casual_male",            # Guest → Casual Host (Pro)
}

# Build dialogue lines
lines = []
for seg in result.segments:
    voice = speaker_map.get(seg.speaker, "af_heart")  # fallback
    lines.append({
        "text": seg.text,
        "voice": voice,
    })

Step 3: Generate Dialogue

Pass the lines to the dialogue endpoint. It stitches multi-speaker audio into a single file with natural pauses.

# Python
audio = client.dialogue(
    model="pro",
    lines=lines,
    gap_ms=400,  # pause between speakers
)
audio.save("revoiced_podcast.mp3")

Full Example (Python)

from leanvox import Leanvox

client = Leanvox()

# 1. Transcribe with diarization
result = client.transcribe("original_podcast.mp3", diarization=True)

# 2. Map speakers to new voices
speaker_map = {
    "SPEAKER_00": "podcast_conversational_female",
    "SPEAKER_01": "podcast_casual_male",
}

lines = [
    {"text": seg.text, "voice": speaker_map.get(seg.speaker, "af_heart")}
    for seg in result.segments
]

# 3. Generate dialogue with new voices
audio = client.dialogue(model="pro", lines=lines, gap_ms=400)
audio.save("revoiced.mp3")

print(f"✓ Revoiced {len(lines)} lines across {len(set(seg.speaker for seg in result.segments))} speakers")

CLI Workflow

1. Transcribe with JSON output

lvox transcribe podcast.mp3 --json > transcript.json

2. Process transcript and build dialogue JSON

# Extract segments, map speakers, write dialogue.json
jq '.transcript.segments | map({text: .text, voice: (if .speaker == "SPEAKER_00" then "podcast_conversational_female" else "podcast_casual_male" end)})' transcript.json > dialogue.json

3. Generate dialogue audio

lvox dialogue --file dialogue.json --model pro --gap-ms 400 -o revoiced.mp3
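
If jq is not available, step 2 can be done in Python instead. The sketch below assumes the --json output has the same shape as the API response (a transcript.segments array), which is what the jq filter above also relies on; build_dialogue is a hypothetical helper, and the fallback voice is an arbitrary choice.

```python
import json


def build_dialogue(transcript, speaker_map, fallback_voice="af_heart"):
    """Turn transcribe JSON output into the line list the dialogue command reads."""
    return [
        {"text": seg["text"], "voice": speaker_map.get(seg["speaker"], fallback_voice)}
        for seg in transcript["transcript"]["segments"]
    ]


# Inline sample standing in for the contents of transcript.json.
transcript = {
    "transcript": {
        "segments": [
            {"text": "Welcome to the show.", "speaker": "SPEAKER_00"},
            {"text": "Thanks for having me.", "speaker": "SPEAKER_01"},
        ]
    }
}
lines = build_dialogue(transcript, {
    "SPEAKER_00": "podcast_conversational_female",
    "SPEAKER_01": "podcast_casual_male",
})
dialogue_json = json.dumps(lines, indent=2)  # write this to dialogue.json
```

Unlike the jq filter, which sends every speaker other than SPEAKER_00 to one voice, this version uses an explicit map with a fallback, so it extends to more than two speakers.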

Use Cases

Use Case	Why
Podcast dubbing	Re-host with different voices or languages
Voice replacement	Swap AI voices without re-recording
Anonymization	Replace real voices with synthetic ones
Content adaptation	Localize or re-brand existing audio

💡 Tip: If you know the exact speaker count, pass the num_speakers parameter to transcribe to improve diarization accuracy.

MCP Server

Use Leanvox directly from Claude, ChatGPT, Cursor, VS Code, and any MCP-compatible AI assistant. Zero code required.

Install

npx @leanvox/mcp-server

Or install globally: npm install -g @leanvox/mcp-server

Claude Desktop

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "leanvox": {
      "command": "npx",
      "args": ["-y", "@leanvox/mcp-server"],
      "env": {
        "LEANVOX_API_KEY": "lv_live_your_key_here"
      }
    }
  }
}

Cursor

Settings → MCP → Add Server

{
  "leanvox": {
    "command": "npx",
    "args": ["-y", "@leanvox/mcp-server"],
    "env": {
      "LEANVOX_API_KEY": "lv_live_your_key_here"
    }
  }
}

VS Code (Copilot)

.vscode/mcp.json

{
  "servers": {
    "leanvox": {
      "command": "npx",
      "args": ["-y", "@leanvox/mcp-server"],
      "env": {
        "LEANVOX_API_KEY": "lv_live_your_key_here"
      }
    }
  }
}

Available Tools

Tool	Description
leanvox_generate	Generate speech from text
leanvox_stream	Stream audio to a file
leanvox_dialogue	Create multi-speaker dialogue
leanvox_list_voices	Browse available voices
leanvox_check_balance	Check account balance

Resources

URI	Description
`leanvox://voices`	All available voices
`leanvox://voices/curated`	40 curated Pro voices (11 categories)
`leanvox://generations`	Past generations
`leanvox://account`	Balance & usage

Prompts

Prompt	Description
narrate	Convert text to natural speech
podcast	Create a multi-speaker podcast
voice-clone	Clone a voice from reference audio

npm · GitHub

Tool Definitions

JSON schemas for LLM function calling. Drop these into any agent framework — OpenAI, Anthropic, Google, or custom — and let your LLM call Leanvox directly.

OpenAI Format

{
  "type": "function",
  "function": {
    "name": "leanvox_generate",
    "description": "Generate speech audio from text using Leanvox TTS API. Returns a URL to the generated audio file.",
    "parameters": {
      "type": "object",
      "properties": {
        "text": {
          "type": "string",
          "description": "Text to convert to speech (max 10,000 Unicode characters)"
        },
        "model": {
          "type": "string",
          "enum": ["standard", "pro"],
          "description": "TTS model. 'standard' for fast/cheap, 'pro' for highest quality with emotion control"
        },
        "voice": {
          "type": "string",
          "description": "Voice ID (e.g. 'af_heart', 'podcast_conversational_female', 'podcast_casual_male'). Use leanvox_list_voices to see options."
        },
        "language": {
          "type": "string",
          "description": "ISO 639-1 language code (default: 'en')"
        },
        "speed": {
          "type": "number",
          "description": "Playback speed 0.5-2.0 (default: 1.0)"
        }
      },
      "required": ["text"],
      "additionalProperties": false
    }
  }
}

Anthropic Format

{
  "name": "leanvox_generate",
  "description": "Generate speech audio from text using Leanvox TTS API. Returns a URL to the generated audio file.",
  "input_schema": {
    "type": "object",
    "properties": {
      "text": {
        "type": "string",
        "description": "Text to convert to speech (max 10,000 Unicode characters)"
      },
      "model": {
        "type": "string",
        "enum": ["standard", "pro"],
        "description": "TTS model. 'standard' for fast/cheap, 'pro' for highest quality with emotion control"
      },
      "voice": {
        "type": "string",
        "description": "Voice ID (e.g. 'af_heart', 'podcast_conversational_female', 'podcast_casual_male'). Use leanvox_list_voices to see options."
      },
      "language": {
        "type": "string",
        "description": "ISO 639-1 language code (default: 'en')"
      },
      "speed": {
        "type": "number",
        "description": "Playback speed 0.5-2.0 (default: 1.0)"
      }
    },
    "required": ["text"]
  }
}

Full tool definition files for leanvox_generate, leanvox_stream, leanvox_dialogue, leanvox_list_voices, and leanvox_check_balance are available on GitHub. Also served at:

https://api.leanvox.com/.well-known/ai-tools.json

GitHub

Integrations

n8n

Use LeanVox in your n8n workflows with our community node. Generate speech, transcribe audio, create multi-speaker dialogues, and manage voices — all from n8n.

Install

In n8n, go to Settings → Community Nodes → Install and enter:

n8n-nodes-leanvox

Available Operations

Generate Speech — text → audio (standard, pro, or max model)
Generate Speech (Async) — long text → async job → audio
Check Job — poll async job status
Dialogue — multi-speaker script → combined audio
Transcribe — audio → text with optional diarization & summary
List Voices — browse all available voices
List Curated Voices — curated voices with audio previews
Check Balance — view credit balance

Setup

1. Install the community node (see above).
2. Add a LeanVox API credential with your API key.
3. Drag the LeanVox node into any workflow.
4. Select a resource and operation.

Example: Blog to Podcast Workflow

RSS Feed → Extract Article Text → LeanVox (Generate Speech) → Upload to S3

Source & docs: GitHub · npm