Getting started
Get started with Gladia Real-time Speech to Text (STT) API
Initiate your real-time session
First, call the POST /v2/live endpoint and pass your configuration.
It’s important to correctly define the properties encoding, sample_rate, bit_depth and channels, as we need them to parse your audio chunks.
You’ll receive a response with a WebSocket URL to connect to. If you lose the connection, you can reconnect to that same URL and resume where you left off. Here’s an example of a response:
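As a rough sketch of this step in JavaScript: the base URL `https://api.gladia.io`, the `X-Gladia-Key` header name, the sample configuration values and the helper names below are assumptions rather than details stated on this page, so check them against the API reference. The response body is assumed to carry the session `id` and the WebSocket `url`.

```javascript
// Assumed base URL; not stated on this page.
const GLADIA_API_URL = 'https://api.gladia.io';

// Build the fetch() arguments for POST /v2/live (hypothetical helper).
function buildInitRequest(apiKey, audioConfig) {
  return {
    url: `${GLADIA_API_URL}/v2/live`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-Gladia-Key': apiKey, // assumed header name
      },
      body: JSON.stringify(audioConfig),
    },
  };
}

// Call the endpoint and return the parsed JSON body,
// assumed to contain the session `id` and the WebSocket `url`.
async function initiateSession(apiKey, audioConfig) {
  const { url, options } = buildInitRequest(apiKey, audioConfig);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Session init failed: ${response.status} ${await response.text()}`);
  }
  return response.json();
}

// Illustrative configuration for 16 kHz, 16-bit, mono PCM audio.
const exampleConfig = {
  encoding: 'wav/pcm',
  sample_rate: 16000,
  bit_depth: 16,
  channels: 1,
};
```

A call would then look like `const { id, url } = await initiateSession(apiKey, exampleConfig);`, keeping `id` for the final-results request and `url` for the WebSocket connection.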
Connect to the WebSocket
Now that you’ve initiated the session and have the URL, you can connect to the WebSocket using your preferred language/framework. Here’s an example in JavaScript:
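A minimal sketch of the connection step, assuming a runtime that exposes the standard `WebSocket` class (browsers and recent Node.js versions). `connectToSession` is a hypothetical helper; the constructor is passed as a parameter so that a compatible implementation, such as the one from the `ws` package, can be swapped in.

```javascript
// Open the WebSocket returned by the session request and log lifecycle events.
function connectToSession(url, WebSocketImpl = globalThis.WebSocket) {
  const socket = new WebSocketImpl(url);

  socket.addEventListener('open', () => {
    console.log('WebSocket connected');
  });
  socket.addEventListener('error', () => {
    console.error('WebSocket error');
  });
  socket.addEventListener('close', (event) => {
    // Code 1000 indicates a normal closure.
    console.log(`WebSocket closed with code ${event.code}`);
  });

  return socket;
}
```

Because the session can be resumed, calling `connectToSession` again with the same URL after a dropped connection is one way to reconnect.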
Send audio chunks
You can now start sending us your audio chunks through the WebSocket. You can send them directly as binary, or in JSON by encoding your chunk in base64, like this:
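Both options can be sketched as follows for Node.js. The JSON message shape (`type: "audio_chunk"` with the payload under `data.chunk`) and the helper names are assumptions, not confirmed by this page; only the base64 encoding is.

```javascript
// Option 1: send the raw bytes as a binary WebSocket frame.
function sendBinaryChunk(socket, chunk) {
  socket.send(chunk);
}

// Option 2: wrap the chunk in a JSON message, base64-encoded.
// The field names below are assumed; check the API reference.
function buildJsonChunkMessage(chunk) {
  return JSON.stringify({
    type: 'audio_chunk',
    data: { chunk: Buffer.from(chunk).toString('base64') },
  });
}

function sendJsonChunk(socket, chunk) {
  socket.send(buildJsonChunkMessage(chunk));
}
```

Binary frames avoid the roughly one-third size overhead of base64, so the JSON form is mainly useful when the transport or tooling only handles text.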
Read messages
During the whole session, we will send various messages through the WebSocket, the callback URL or webhooks. You can specify which kinds of messages you want to receive in the initial configuration. See messages_config for WebSocket messages and callback_config for callback messages.
Here’s an example of how to read a transcript message received through a WebSocket:
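A minimal sketch of such a handler. The message layout used here (`data.is_final` and `data.utterance.text`) is an assumption and `readTranscript` is a hypothetical helper; only the `transcript` message type comes from the text above.

```javascript
// Parse a raw WebSocket message and return the transcript,
// or null for any other message type.
// Assumed shape: { type: "transcript", data: { is_final, utterance: { text } } }
function readTranscript(rawMessage) {
  const message = JSON.parse(rawMessage.toString());
  if (message.type !== 'transcript') {
    return null;
  }
  const { is_final: isFinal, utterance } = message.data;
  return { isFinal, text: utterance.text.trim() };
}
```

It can be wired to the socket with `socket.addEventListener('message', (event) => { const t = readTranscript(event.data); if (t && t.isFinal) console.log(t.text); });`.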
Stop the recording
Once you’re done, send us the stop_recording message. We will process the remaining audio chunks and start the post-processing phase, in which we put together the final audio file and the results, including the add-ons you requested.
You’ll receive a message at every step of the process in the WebSocket, or in the callback if configured. Once the post-processing is done, the WebSocket is closed with code 1000.
Instead of sending the stop_recording message, you can also close the WebSocket with code 1000.
We will still run the post-processing in the background and send you the messages through the callback you defined.
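The two ways of ending a session can be sketched like this. The JSON envelope `{ "type": "stop_recording" }` is an assumption about how the message is encoded, and both helper names are hypothetical.

```javascript
// Ask the server to finish: remaining chunks are processed, then post-processing starts.
// The message envelope is assumed; check the API reference.
function stopRecording(socket) {
  socket.send(JSON.stringify({ type: 'stop_recording' }));
}

// Alternative: close the connection yourself with the normal-closure code.
// Post-processing messages then arrive only through the configured callback.
function closeSession(socket) {
  socket.close(1000);
}
```

With `stopRecording`, the socket stays open until the server closes it, so post-processing messages can still be read from the WebSocket.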
Get the final results
If you want to get the complete result, you can call the GET /v2/live/:id endpoint with the id you received from the initial request.
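A minimal sketch of that request, assuming the same base URL and `X-Gladia-Key` header as the session request (neither is stated on this page); the helper names are hypothetical.

```javascript
// Build the URL for GET /v2/live/:id. The base URL is an assumption.
function buildResultUrl(sessionId, baseUrl = 'https://api.gladia.io') {
  return `${baseUrl}/v2/live/${encodeURIComponent(sessionId)}`;
}

// Fetch the complete result for a session and return the parsed JSON body.
async function getFinalResult(apiKey, sessionId) {
  const response = await fetch(buildResultUrl(sessionId), {
    headers: { 'X-Gladia-Key': apiKey }, // assumed header name
  });
  if (!response.ok) {
    throw new Error(`Fetching result failed: ${response.status}`);
  }
  return response.json();
}
```

If the request is made before post-processing has finished, the result may be incomplete, so it is safer to wait for the WebSocket to close or for the final callback first.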
Full sample
You can find a complete sample in our GitHub repository.