Quickstart

Using our SDKs
Using the API

The SDK simplifies real-time speech-to-text integration by abstracting the underlying API. Designed for developers, it offers:

Effortless implementation with minimal code to write.
Built-in resilience with automatic error handling (e.g., reconnection on network drops) ensures uninterrupted transcription. No need to manually manage retries or state recovery.

Install the SDK

npm install @gladiaio/sdk

import { GladiaClient } from "@gladiaio/sdk";

Initiate your real-time session

First, call the endpoint and pass your configuration. It’s important to correctly define the properties encoding, sample_rate, bit_depth and channels as we need them to parse your audio chunks.

const gladiaClient = new GladiaClient({
  apiKey: <YOUR_GLADIA_API_KEY>,
});

const gladiaConfig = {
  model: "solaria-1",
  encoding: 'wav/pcm',
  sample_rate: 16000,
  bit_depth: 16,
  channels: 1,
  language_config: {
    languages: ["fr"],
    code_switching: false,
  },
};

const liveSession = gladiaClient.liveV2().startSession(gladiaConfig);

Why initiate with POST instead of connecting directly to the WebSocket?

Security: Generate the WebSocket URL on your backend and keep your API key private. The init call returns a connectable URL and a session id that you can safely pass to web, iOS, or Android clients without exposing credentials in the app.
Lower infrastructure load: The secure URL is generated on your backend, the client can connect directly to Gladia’s WebSocket server without a pass-through on your side, saving your own resources.
Resilient reconnection and session continuity: If the WebSocket disconnects (which can happen on unreliable networks), the session created by the init call lets the client reconnect without losing context. Traditional flows that open a socket first typically force a brand‑new session on disconnect, dropping in‑progress state.

Connect to the WebSocket

Now that you’ve initiated the session and have the URL, you can connect to the WebSocket using your preferred language/framework. Here’s an example in JavaScript:

liveSession.on("message", (message) => {
  // Handle messages from the API
});
liveSession.on("started", (message) => {
  // Handle start session message
});
liveSession.on("ended", (message) => {
  // Handle end session message
});
liveSession.on("error", (message) => {
  // Handle error message
});

Send audio chunks

You can now start sending us your audio chunks through the WebSocket:

liveSession.sendAudio(audioChunk)

Read messages

During the whole session, we will send various messages through the WebSocket, the callback URL or webhooks. You can specify which kind of messages you want to receive in the initial configuration. See messages_config for WebSocket messages and callback_config for callback messages.Here’s an example of how to read a transcript message received through a WebSocket:

liveSession.on("message", (message) => {
  if (message.type === 'transcript' && message.data.is_final) {
    console.log(`${message.data.id}: ${message.data.utterance.text}`)
});

Need low-latency partial results?Enable partial transcripts by setting messages_config.receive_partial_transcripts: true.Use the is_final property to distinguish between partial and final transcript messages.

Stop the recording

Once you’re done, send us the stop_recording message. We will process remaining audio chunks and start the post-processing phase, in which we put together the final audio file and results with the add-ons you requested.You’ll receive a message at every step of the process in the WebSocket, or in the callback if configured. Once the post-processing is done, the WebSocket is closed with a code 1000.

liveSession.stopRecording()

Get the final results

If you want to get the complete result, you can call the GET /v2/live/:id endpoint with the id you received from the initial request.

  const response = await fetch(`https://api.gladia.io/v2/live/${sessionId}`, {
    method: 'GET',
    headers: {
      'x-gladia-key': '<YOUR_GLADIA_API_KEY>',
    },
  });
  if (!response.ok) {
    // Look at the error message
    // It might be a configuration issue
    console.error(`${response.status}: ${(await response.text()) || response.statusText}`)
    return;
  }

  const result = await response.json();
  console.log(result)

Want to know more about a specific feature? Check out our Features chapter for more details.

Full code sample

You can find complete code samples in our Github repository:

Introduction

Speech-to-Text

Integrations

Language

Audio Intelligence

Limits & Specifications

Migrations

Install the SDK

Initiate your real-time session

Connect to the WebSocket

Send audio chunks

Read messages

Stop the recording

Get the final results

Full code sample

Introduction

Speech-to-Text

Integrations

Language

Audio Intelligence

Limits & Specifications

Migrations

​Install the SDK

​Initiate your real-time session

​Connect to the WebSocket

​Send audio chunks

​Read messages

​Stop the recording

​Get the final results

​Full code sample

Install the SDK

Initiate your real-time session

Connect to the WebSocket

Send audio chunks

Read messages

Stop the recording

Get the final results

Full code sample