Quickstart

Upload your file

This step is optional if you are already working with audio URLs.

If you’re working with audio or video files, you’ll need to upload them first using the POST /v2/upload endpoint with a multipart/form-data content type, since the POST /v2/pre-recorded endpoint only accepts audio URLs. If you already have audio file URLs, proceed to the next step.

curl --request POST \
  --url https://api.gladia.io/v2/upload \
  --header 'Content-Type: multipart/form-data' \
  --header 'x-gladia-key: YOUR_GLADIA_API_KEY' \
  --form audio=@/path/to/your/audio/conversation.wav

Example response:

{
  "audio_url": "https://api.gladia.io/file/636c70f6-92c1-4026-a8b6-0dfe3ecf826f",
  "audio_metadata": {
    "id": "636c70f6-92c1-4026-a8b6-0dfe3ecf826f",
    "filename": "conversation.wav",
    "extension": "wav",
    "size": 99515383,
    "audio_duration": 4146.468542,
    "number_of_channels": 2
  }
}

We will now proceed to the next steps using the returned audio_url.
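For readers who prefer a script to curl, the upload step can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names build_upload_request and parse_upload_response are hypothetical, and the hand-built multipart body assumes the endpoint accepts a single form field named audio, as in the curl command above.

```python
import json
import mimetypes
import urllib.request
import uuid

UPLOAD_URL = "https://api.gladia.io/v2/upload"


def build_upload_request(filename, data, api_key):
    """Build (but do not send) a multipart/form-data upload request.

    Hypothetical helper: mirrors the curl command, with one form field
    named "audio" carrying the file bytes.
    """
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {file_type}\r\n\r\n".encode(),
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        UPLOAD_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "x-gladia-key": api_key,
        },
    )


def parse_upload_response(response_body):
    """Return the audio_url field from the upload response JSON."""
    return json.loads(response_body)["audio_url"]
```

To actually send the request, pass the object to urllib.request.urlopen and feed the decoded response body to parse_upload_response.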

Transcribe

Now send the transcription request to Gladia’s API using the POST /v2/pre-recorded endpoint.

/v2/pre-recorded only accepts application/json as Content-Type.

curl --request POST \
  --url https://api.gladia.io/v2/pre-recorded \
  --header 'Content-Type: application/json' \
  --header 'x-gladia-key: YOUR_GLADIA_API_KEY' \
  --data '{
    "audio_url": "YOUR_AUDIO_URL",
    "language_config": {
      "languages": [],
      "code_switching": false
    },
    "diarization": true,
    "diarization_config": {
      "number_of_speakers": 3,
      "min_speakers": 1,
      "max_speakers": 5
    },
    "translation": true,
    "translation_config": {
      "model": "base",
      "target_languages": ["fr", "en"],
      "context_adaptation": true,
      "context": "Business meeting discussing quarterly results",
      "informal": false
    },
    "subtitles": true,
    "subtitles_config": {
      "formats": ["srt", "vtt"]
    }
  }'

The request returns immediately with an id and a result_url. The id is your transcription ID, which you will use to retrieve the transcription result once it’s done. The result_url is returned for convenience: it is a pre-built URL containing your transcription ID that you can use to get your result in the next step.
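The same request can be sketched in Python with the standard library. The helper names build_transcription_request and parse_job are hypothetical; the endpoint, headers, and the id and result_url response fields are the ones described above. Extra keyword arguments are passed through as top-level JSON options (for example diarization=True), on the assumption that they match the option names in the curl example.

```python
import json
import urllib.request

TRANSCRIBE_URL = "https://api.gladia.io/v2/pre-recorded"


def build_transcription_request(audio_url, api_key, **options):
    """Build (but do not send) a JSON transcription request.

    Hypothetical helper: `options` are merged into the JSON body as-is.
    """
    payload = {"audio_url": audio_url, **options}
    return urllib.request.Request(
        TRANSCRIBE_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-gladia-key": api_key,
        },
    )


def parse_job(response_body):
    """Return (id, result_url) from the immediate response JSON."""
    job = json.loads(response_body)
    return job["id"], job["result_url"]
```

Sending the request with urllib.request.urlopen and passing the decoded body to parse_job yields the two values needed for the next step.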

Get the transcription result

You can get your transcription results in 3 different ways:

Polling

Once you post your transcription request, you get a transcription id and a pre-built result_url for convenience. To get the result with this method, send GET requests to the given result_url repeatedly until the status of your transcription is done. You can find more information on the different transcription statuses in the API Reference.
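A polling loop along these lines might look like the sketch below. It is written against a caller-supplied fetch_result function (hypothetical) that performs the GET on result_url and returns the decoded JSON as a dict, so the loop itself stays independent of any HTTP library. Only the done status comes from the text above; treating a status of error as terminal, and the default interval and attempt limit, are assumptions to check against the API Reference.

```python
import time


def poll_until_done(fetch_result, interval_seconds=2.0, max_attempts=150,
                    sleep=time.sleep):
    """Call fetch_result() until the returned dict has status "done".

    fetch_result: hypothetical callable returning the decoded JSON body
    of a GET on result_url.
    """
    for attempt in range(max_attempts):
        result = fetch_result()
        status = result.get("status")
        if status == "done":
            return result
        if status == "error":
            # Assumption: "error" is a terminal status; verify in the API Reference.
            raise RuntimeError(f"transcription failed: {result}")
        if attempt < max_attempts - 1:
            # Wait before the next GET so the API is not hammered.
            sleep(interval_seconds)
    raise TimeoutError(f"transcription not done after {max_attempts} attempts")
```

The sleep parameter exists so the loop can be exercised without real delays; in normal use the default time.sleep is enough.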

Webhook

You can configure webhooks at https://app.gladia.io/webhooks to be notified when your transcriptions are done.

Once a transcription is done, a POST request will be made to the endpoint you configured. The request body is a JSON object containing the transcription id that you can use to retrieve your result with our API.
For the full body definition, check our API definition.

Callback URL

Callbacks are HTTP calls that notify you when your transcripts are ready. Instead of polling, which keeps your server busy, you can use the callback feature to receive the result at a specified endpoint:

{
  "audio_url": "YOUR_AUDIO_URL",
  "callback": true,
  "callback_config": {
    "url": "https://yourserverurl.com/your/callback/endpoint/",
    "method": "POST"
  }
}

Once the transcription is done, a request will be made to the URL you provided in callback_config.url using the HTTP method you provided in callback_config.method. Allowed methods are POST and PUT, with POST being the default. The request body is a JSON object containing the transcription id and an event property that tells you whether the transcription succeeded or failed.
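As an illustration of the rules above, the request body with a callback can be assembled and validated client-side before it is sent. The helper name build_callback_payload is hypothetical; it only encodes what is stated here, namely that the method must be POST or PUT and defaults to POST.

```python
ALLOWED_CALLBACK_METHODS = ("POST", "PUT")


def build_callback_payload(audio_url, callback_url, method="POST"):
    """Return a request body dict with callback delivery enabled.

    Hypothetical helper: rejects methods other than POST and PUT locally.
    """
    method = method.upper()
    if method not in ALLOWED_CALLBACK_METHODS:
        raise ValueError(f"callback method must be POST or PUT, got {method!r}")
    return {
        "audio_url": audio_url,
        "callback": True,
        "callback_config": {"url": callback_url, "method": method},
    }
```

The returned dict can be serialized with json.dumps and sent to POST /v2/pre-recorded like any other transcription request body.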

Want to know more about a specific feature? Check out our Features chapter for more details.

Full code sample

You can find complete code samples in our GitHub repository.
