Set up your account
Signing Up
You will first need to create your account. Sign up at app.gladia.io. You can sign up with Google, and more sign-up methods will be available in the near future.
Get your API key
Now that you have signed up, log in to app.gladia.io and go to the API keys section. We should have already created a default key for you. You can use this one or create your own.
Gladia offers 10 hours of free audio transcription per month if you want to test the service!
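Once you have a key, API calls typically authenticate by sending it in a request header. The sketch below only builds a request and does not send it. The endpoint path `/v2/pre-recorded`, the `x-gladia-key` header name, the `audio_url` field, and the `GLADIA_API_KEY` environment variable are assumptions not stated on this page; check the API documentation for the actual values.

```python
import json
import os
import urllib.request

# Assumed values: confirm the endpoint and header name in the API documentation.
API_URL = "https://api.gladia.io/v2/pre-recorded"
KEY_HEADER = "x-gladia-key"


def build_transcription_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking to transcribe a remote audio file."""
    body = json.dumps({"audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", KEY_HEADER: api_key},
        method="POST",
    )


if __name__ == "__main__":
    # Read the key from the environment rather than hard-coding it.
    key = os.environ.get("GLADIA_API_KEY", "")
    request = build_transcription_request(key, "https://example.com/audio.wav")
    print(request.get_method(), request.full_url)
```

Keeping the key in an environment variable avoids committing it to source control; passing the request to `urllib.request.urlopen` would actually send it.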
Gladia’s playground
Using Gladia’s playground is a convenient way to test our Speech-to-text API. On the playground you can transcribe remote audio files by URL, upload local audio files, and run live audio transcription.

1
Select your audio source
Choose your audio source (stream from your microphone, or upload a local file).
Then proceed to the next step.

2
Select features
You’ll be able to select the features the Gladia API provides for your transcription.
For this example, we want to detect named entities (like email addresses, phone numbers, etc.), so we turned on named entity recognition.
Only a few features of the Gladia API are available on the playground. For more advanced testing, check our API documentation instead.

3
Talk to Gladia
You can talk to Gladia by clicking on the “Start transcribing” button, and you’ll be able to see the transcription of your voice in the “Transcription” tab.
Text in italics in the transcription represents partial transcripts.

4
Transcribe
You can see formatted, readable results in the default “Transcription” tab, and you’ll also find the result in JSON format (the one you’d get with an API call).
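The JSON view is what you would process in code. As a minimal sketch, the snippet below pulls the full text and per-utterance timings out of a result. The sample payload and its field names (`result`, `transcription`, `full_transcript`, `utterances`, `start`, `end`, `text`) are assumed for illustration and are not taken from this page; compare them with the JSON shown in the playground.

```python
import json

# Hypothetical sample: the field names here are assumptions, not a documented schema.
SAMPLE_JSON = """
{
  "result": {
    "transcription": {
      "full_transcript": "Hello there. You can reach me at jane@example.com.",
      "utterances": [
        {"start": 0.0, "end": 1.2, "text": "Hello there."},
        {"start": 1.5, "end": 4.0, "text": "You can reach me at jane@example.com."}
      ]
    }
  }
}
"""


def summarize_result(payload: dict) -> tuple[str, list[str]]:
    """Return the full transcript and one formatted line per utterance."""
    transcription = payload.get("result", {}).get("transcription", {})
    lines = [
        f"[{u['start']:.1f}s - {u['end']:.1f}s] {u['text']}"
        for u in transcription.get("utterances", [])
    ]
    return transcription.get("full_transcript", ""), lines


if __name__ == "__main__":
    full_text, utterance_lines = summarize_result(json.loads(SAMPLE_JSON))
    print(full_text)
    print("\n".join(utterance_lines))
```

Using `.get` with defaults means a payload that lacks the assumed keys yields an empty transcript instead of raising an error.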

Gladia’s APIs
Next steps
Now that you have tested Gladia’s basic transcription, you might want to extract data from, enhance, translate, or format your audio transcriptions. On top of its Speech-to-text API, Gladia provides a whole set of tools that you might want to use for your particular use cases, like:
- Speaker recognition (diarization)
- Audio intelligence models (Translation, Summarization, Chapterization, Custom AI process…)
- Word-level timestamps
- Subtitle generation
- Custom vocabulary & Context prompting
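
As a rough sketch of how such features might be switched on per request, the helper below assembles a JSON body with optional flags. Every field name here (`audio_url`, `diarization`, `translation`, `translation_config`, `target_languages`, `subtitles`, `subtitles_config`, `formats`) is an assumption made for illustration; the API documentation is the source of truth for the real parameter names.

```python
import json
from typing import Optional


def build_options(
    audio_url: str,
    diarization: bool = False,
    translate_to: Optional[list[str]] = None,
    subtitle_formats: Optional[list[str]] = None,
) -> dict:
    """Assemble a request body, adding only the features that were asked for."""
    body: dict = {"audio_url": audio_url}
    if diarization:
        body["diarization"] = True  # assumed flag name
    if translate_to:
        body["translation"] = True  # assumed flag name
        body["translation_config"] = {"target_languages": translate_to}
    if subtitle_formats:
        body["subtitles"] = True  # assumed flag name
        body["subtitles_config"] = {"formats": subtitle_formats}
    return body


if __name__ == "__main__":
    options = build_options(
        "https://example.com/meeting.wav",
        diarization=True,
        translate_to=["fr"],
        subtitle_formats=["srt"],
    )
    print(json.dumps(options, indent=2))
```

Leaving unrequested features out of the body, rather than sending them as `False`, keeps the payload small and lets the service apply its own defaults.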