General transcription flow

Getting a transcription with Gladia’s API is straightforward and can be split into three steps:


Upload your file

This step is optional if you are already working with audio URLs.

If you’re working with audio or video files, you’ll need to upload them first using our /upload endpoint with the multipart/form-data content type, since Gladia’s /v2/transcription endpoint only accepts audio URLs.

If you are already using audio file URLs, proceed to the next step.

curl --request POST \
  --url \
  --header 'Content-Type: multipart/form-data' \
  --header 'x-gladia-key: YOUR_GLADIA_API_TOKEN' \
  --form audio=@/path/to/your/audio/conversation.wav

Example response:

{
  "audio_url": "",
  "audio_metadata": {
    "id": "636c70f6-92c1-4026-a8b6-0dfe3ecf826f",
    "filename": "conversation.wav",
    "extension": "wav",
    "size": 99515383,
    "audio_duration": 4146.468542,
    "number_of_channels": 2
  }
}

We will now proceed to the next steps using the returned audio_url.
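
For readers who prefer a script to curl, the upload step could be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not an official client: `upload_url` stands in for the full URL of the /upload endpoint, and the helper names `build_multipart_body`, `extract_audio_url`, and `upload_audio` are hypothetical. The form field name `audio` and the `x-gladia-key` header mirror the curl command above. The part's `application/octet-stream` content type is also an assumption.

```python
import json
import os
import urllib.request
import uuid


def build_multipart_body(filename: str, file_bytes: bytes, field: str = "audio"):
    """Encode one file as multipart/form-data and return (content_type, body).

    Assumes a plain filename without double quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"  # assumption: generic binary type
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + file_bytes + tail


def extract_audio_url(response_body: str) -> str:
    """Return the audio_url field from the upload response JSON."""
    data = json.loads(response_body)
    audio_url = data.get("audio_url")
    if not audio_url:
        raise ValueError("upload response does not contain an audio_url")
    return audio_url


def upload_audio(upload_url: str, api_key: str, path: str) -> str:
    """POST a local file to upload_url and return its audio_url (network call)."""
    with open(path, "rb") as f:
        content_type, body = build_multipart_body(os.path.basename(path), f.read())
    request = urllib.request.Request(
        upload_url,  # placeholder: the full /upload endpoint URL
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "x-gladia-key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return extract_audio_url(response.read().decode("utf-8"))
```

Only `upload_audio` touches the network. The two other helpers are pure functions, so the multipart encoding and the response parsing can be checked without an API key.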



Request the transcription

We’ll now POST the transcription request to Gladia’s API using the /v2/transcription endpoint.

/v2/transcription only accepts application/json as Content-Type.

  curl --request POST \
    --url \
    --header 'Content-Type: application/json' \
    --header 'x-gladia-key: YOUR_GLADIA_API_TOKEN' \
    --data '{
    "audio_url": "YOUR_AUDIO_URL",
    "diarization": true,
    "diarization_config": {
      "number_of_speakers": 3,
      "min_speakers": 1,
      "max_speakers": 5
    },
    "translation": true,
    "translation_config": {
      "model": "base",
      "target_languages": ["fr", "en"]
    },
    "subtitles": true,
    "subtitles_config": {
      "formats": ["srt", "vtt"]
    },
    "detect_language": true,
    "enable_code_switching": false
  }'

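The JSON body of the request above can also be assembled programmatically, which avoids unbalanced braces in hand-written payloads. The following is a minimal sketch: the function names are hypothetical, the field names are taken from the curl example, and it assumes that leaving out an optional feature flag is equivalent to disabling it.

```python
import json
from typing import Optional, Sequence


def build_transcription_payload(
    audio_url: str,
    diarization_config: Optional[dict] = None,
    target_languages: Optional[Sequence[str]] = None,
    subtitle_formats: Optional[Sequence[str]] = None,
) -> dict:
    """Build the request body; optional features are added only when configured."""
    if not audio_url:
        raise ValueError("audio_url is required")
    payload = {
        "audio_url": audio_url,
        "detect_language": True,
        "enable_code_switching": False,
    }
    if diarization_config:
        payload["diarization"] = True
        payload["diarization_config"] = dict(diarization_config)
    if target_languages:
        payload["translation"] = True
        payload["translation_config"] = {
            "model": "base",  # value taken from the curl example
            "target_languages": list(target_languages),
        }
    if subtitle_formats:
        payload["subtitles"] = True
        payload["subtitles_config"] = {"formats": list(subtitle_formats)}
    return payload


def encode_transcription_request(api_key: str, payload: dict):
    """Return (headers, body) ready to POST to the /v2/transcription endpoint."""
    headers = {"Content-Type": "application/json", "x-gladia-key": api_key}
    return headers, json.dumps(payload).encode("utf-8")
```

Calling `build_transcription_payload("YOUR_AUDIO_URL", {"number_of_speakers": 3, "min_speakers": 1, "max_speakers": 5}, ["fr", "en"], ["srt", "vtt"])` yields the same fields as the curl example.
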
You’ll get an immediate response to the request containing an id and a result_url. The id is your transcription ID, which you will use to get your transcription result once it’s done.

result_url is returned for convenience. It is a pre-built URL that already contains your transcription ID, and you can use it to get your result in the next step.
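
A client will typically keep both values from this response for the next step. The sketch below shows one way to do that. It assumes only what is stated above, namely that the response is JSON with `id` and `result_url` fields. The `TranscriptionJob` type and `parse_transcription_response` function are hypothetical names.

```python
import json
from typing import NamedTuple


class TranscriptionJob(NamedTuple):
    """The two values returned when a transcription request is accepted."""
    id: str
    result_url: str


def parse_transcription_response(response_body: str) -> TranscriptionJob:
    """Extract id and result_url, failing loudly if either is missing."""
    data = json.loads(response_body)
    missing = [key for key in ("id", "result_url") if not data.get(key)]
    if missing:
        raise ValueError("response is missing: " + ", ".join(missing))
    return TranscriptionJob(id=data["id"], result_url=data["result_url"])
```

Storing the ID alongside the pre-built URL keeps the job traceable even if the result is fetched through a different path later.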


Get the transcription result

You can get your transcription results in 3 different ways: