Google's Gemini TTS converts text to realistic audio. It offers 30 voice presets, multi-speaker synthesis (up to 10 speakers), support for 24+ languages, and inline style markers for expressive control.
Pricing: $0.50 per 1M input tokens (flash), $10 per 1M output tokens.
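As a rough illustration of how the per-token pricing adds up, here is a minimal sketch that estimates the cost of a single request from the two rates listed above. The function name and the token counts in the example are hypothetical. How many tokens a given text or audio clip consumes is not stated here, so check actual usage figures before relying on the estimate.

```python
# Rates taken from the listing above, in USD per 1M tokens (flash tier).
INPUT_USD_PER_MILLION = 0.50
OUTPUT_USD_PER_MILLION = 10.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Hypothetical helper: token counts must come from your own usage data.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed example: 200 input tokens of text, 8,000 output tokens of audio.
    print(f"Estimated cost: ${estimate_cost_usd(200, 8_000):.4f}")
```

Because the output rate is 20 times the input rate, the generated audio usually dominates the bill and the length of the prompt text matters comparatively little.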
Get API access in minutes. No GPU setup, no infrastructure to manage.