Local TTS model
OCTAVE 2
Second-generation, emotion-aware speech-language model. It generates voice, style and personality from a text description alone, with no reference audio required, and offers fine-grained control over arousal, valence and speaking style. Research-first release.
GPU recommended
text-to-speech generation
7 languages
Hume Terms (research)
Quality: 9.4/10
Speed: 7.5/10
Model size: 3.2 GB
Voices: Prompt-generated voices + style
Can OCTAVE 2 run locally?
OCTAVE 2 can generate speech locally for private voice workflows. Start with pip install hume.
Hume Terms (research) license. Review upstream restrictions before commercial use.
pip install hume
Upstream source
emotion, controllable, streaming, multilingual
Audio profile
Best fit
OCTAVE 2 is best for multilingual local speech generation.
Hardware: gpu, apple
Model details
Type: Local TTS model
Family: octave
Latency: low
Formats: pytorch, api
Languages: en, de, fr, es, it, pt, ja
Context: Describe a speaker in natural language
Install locally
01 Check runtime: Confirm your machine supports one of the listed backends (pytorch, api).
02 Install model: Use the upstream command or repository instructions.
03 Test locally: Run a short private audio prompt before moving into production workflows.
pip install hume
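The runtime check in step 01 can be sketched in Python. This is a minimal sketch, not an official procedure: it assumes the "pytorch" backend corresponds to the `torch` package and that the install command above provides a package importable as `hume`. The function name `check_runtime` is hypothetical, and the script does not load or run OCTAVE 2 itself.

```python
import importlib.util


def check_runtime():
    """Report which packages are importable and which device PyTorch would pick."""
    report = {
        # Assumption: `pip install hume` exposes a top-level `hume` package.
        "hume_installed": importlib.util.find_spec("hume") is not None,
        # Assumption: the "pytorch" backend means the `torch` package.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "device": "cpu",
    }
    if report["torch_installed"]:
        try:
            import torch

            if torch.cuda.is_available():
                report["device"] = "cuda"  # NVIDIA GPU
            elif (
                getattr(torch.backends, "mps", None) is not None
                and torch.backends.mps.is_available()
            ):
                report["device"] = "mps"  # Apple Silicon GPU
        except Exception:
            # A broken install falls back to the CPU default.
            report["torch_installed"] = False
    return report


if __name__ == "__main__":
    print(check_runtime())
```

If the reported device is `cpu`, expect slower generation, since the listing recommends a GPU.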
Good for
- text-to-speech generation
- local workflows on a GPU (recommended)
- emotion-aware, controllable, streaming synthesis
Watch before shipping
- Validate pronunciation, latency and artifacts with your own voice samples.
- Review the upstream license and acceptable-use notes.
- Benchmark on your target CPU, Apple Silicon or GPU setup.
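The benchmarking item above can be approached with a small timing harness. The sketch below is illustrative only: `fake_synthesize` is a hypothetical stand-in that returns silence, and the `(samples, sample_rate)` return shape is an assumption, not the upstream API. Swap in your real synthesis call to measure median latency and real-time factor on your target hardware.

```python
import statistics
import time


def fake_synthesize(text, sample_rate=24000):
    """Hypothetical stand-in for a real TTS call; returns silence sized to the text."""
    samples_per_char = sample_rate * 60 // 1000  # pretend 60 ms of audio per character
    n_samples = samples_per_char * max(len(text), 1)
    return [0.0] * n_samples, sample_rate


def benchmark(synthesize, text, runs=5):
    """Time `synthesize(text)` over several runs and summarize the results."""
    latencies = []
    audio_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        samples, sample_rate = synthesize(text)
        latencies.append(time.perf_counter() - start)
        audio_seconds = len(samples) / sample_rate
    median = statistics.median(latencies)
    return {
        "median_latency_s": median,
        "audio_seconds": audio_seconds,
        # Below 1.0 means audio is generated faster than it plays back.
        "real_time_factor": median / audio_seconds if audio_seconds else float("inf"),
    }


if __name__ == "__main__":
    print(benchmark(fake_synthesize, "Hello there", runs=3))
```

Run it with the same prompts and languages you plan to ship, since latency can vary with text length and hardware.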
Related TTS and speech models
OpenBMB
VoxCPM2
Local TTS model · Q 9.4 · Speed 8.3
Alibaba Cloud (Qwen Team)
Qwen3 TTS
Local TTS model · Q 9.5 · Speed 8.5
Zyphra
Zonos v0.1
Local TTS model · Q 9.5 · Speed 8.5
Alibaba FunAudioLLM
CosyVoice 2
Local TTS model · Q 9.3 · Speed 8.8
Bilibili
IndexTTS 2
Local TTS model · Q 9.4 · Speed 8
Boson AI
Higgs Audio v2
Local TTS model · Q 9.7 · Speed 7
StepFun
Step-Audio 2 Mini
Local TTS model · Q 9.3 · Speed 7.5
Hume AI
TADA
Local TTS model · Q 9.1 · Speed 7.5