POST /v2/transcription
Initiate a new transcription job
curl --request POST \
  --url https://api.gladia.io/v2/transcription \
  --header 'Content-Type: application/json' \
  --header 'x-gladia-key: <api-key>' \
  --data '{
  "context_prompt": "<string>",
  "custom_vocabulary": false,
  "custom_vocabulary_config": {
    "vocabulary": [
      "Westeros",
      {
        "value": "Stark"
      },
      {
        "value": "Night'\''s Watch",
        "pronunciations": [
          "Nightz Watch"
        ],
        "intensity": 0.4,
        "language": "en"
      }
    ],
    "default_intensity": 0.5
  },
  "detect_language": true,
  "enable_code_switching": false,
  "code_switching_config": {
    "languages": []
  },
  "language": "en",
  "callback_url": "http://callback.example",
  "callback": false,
  "callback_config": {
    "url": "http://callback.example",
    "method": "POST"
  },
  "subtitles": false,
  "subtitles_config": {
    "formats": [
      "srt"
    ],
    "minimum_duration": 1,
    "maximum_duration": 15.5,
    "maximum_characters_per_row": 2,
    "maximum_rows_per_caption": 3,
    "style": "default"
  },
  "diarization": false,
  "diarization_config": {
    "number_of_speakers": 3,
    "min_speakers": 1,
    "max_speakers": 2
  },
  "translation": false,
  "translation_config": {
    "target_languages": [
      "en"
    ],
    "model": "base",
    "match_original_utterances": true,
    "lipsync": true,
    "context_adaptation": true,
    "context": "<string>",
    "informal": false
  },
  "summarization": false,
  "summarization_config": {
    "type": "general"
  },
  "moderation": false,
  "named_entity_recognition": false,
  "chapterization": false,
  "name_consistency": false,
  "custom_spelling": false,
  "custom_spelling_config": {
    "spelling_dictionary": {
      "Gettleman": [
        "gettleman"
      ],
      "SQL": [
        "Sequel"
      ]
    }
  },
  "structured_data_extraction": false,
  "structured_data_extraction_config": {
    "classes": [
      "Persons",
      "Organizations"
    ]
  },
  "sentiment_analysis": false,
  "audio_to_llm": false,
  "audio_to_llm_config": {
    "prompts": [
      "Extract the key points from the transcription"
    ]
  },
  "custom_metadata": {
    "user": "John Doe"
  },
  "sentences": false,
  "display_mode": false,
  "punctuation_enhanced": false,
  "language_config": {
    "languages": [],
    "code_switching": false
  },
  "audio_url": "http://files.gladia.io/example/audio-transcription/split_infinity.wav"
}'

Example response:

{
  "id": "45463597-20b7-4af7-b3b3-f5fb778203ab",
  "result_url": "https://api.gladia.io/v2/transcription/45463597-20b7-4af7-b3b3-f5fb778203ab"
}

Authorizations

x-gladia-key
string
header
required

Your personal Gladia API key

Body

application/json
audio_url
string<uri>
required

URL to a Gladia file or to an external audio or video file

Example:

"http://files.gladia.io/example/audio-transcription/split_infinity.wav"
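
As a minimal sketch of the smallest valid call, the snippet below builds (but does not send) a request that carries only the required `audio_url` field and the `x-gladia-key` header. The helper name `build_transcription_request` is hypothetical and not part of any Gladia SDK; only the Python standard library is used.

```python
import json
import urllib.request

API_URL = "https://api.gladia.io/v2/transcription"


def build_transcription_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build the POST request for a new transcription job without sending it."""
    # audio_url is the only required body field listed in this reference.
    body = json.dumps({"audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-gladia-key": api_key,
        },
    )


req = build_transcription_request(
    "<api-key>",
    "http://files.gladia.io/example/audio-transcription/split_infinity.wav",
)
# To submit the job, pass `req` to urllib.request.urlopen() and JSON-decode the body.
```

Building the request separately from sending it makes the headers and body easy to inspect before any network call is made.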

context_prompt
string
deprecated

[Deprecated] Context to feed the transcription model with for possible better accuracy

custom_vocabulary
boolean
default:false

[Beta] Either a boolean that enables custom vocabulary for this audio, or an array containing the specific vocabulary list to feed to the transcription model

custom_vocabulary_config
object

[Beta] Custom vocabulary configuration, if custom_vocabulary is enabled
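
The request example on this page mixes plain strings and objects in the `vocabulary` array. The sketch below shows one way to assemble such a list; `vocabulary_entry` is a hypothetical helper, and the field names (`value`, `pronunciations`, `intensity`, `language`, `default_intensity`) are taken from that example rather than from a full schema.

```python
import json


def vocabulary_entry(value, pronunciations=None, intensity=None, language=None):
    """Return a plain string when only `value` is given, otherwise an object."""
    extras = {
        "pronunciations": pronunciations,
        "intensity": intensity,
        "language": language,
    }
    # Drop the optional fields that were not supplied.
    extras = {key: val for key, val in extras.items() if val is not None}
    return {"value": value, **extras} if extras else value


payload_fragment = {
    "custom_vocabulary": True,
    "custom_vocabulary_config": {
        "vocabulary": [
            vocabulary_entry("Westeros"),
            vocabulary_entry(
                "Night's Watch",
                pronunciations=["Nightz Watch"],
                intensity=0.4,
                language="en",
            ),
        ],
        "default_intensity": 0.5,
    },
}

print(json.dumps(payload_fragment, indent=2))
```

Serializing with `json.dumps` also avoids the manual quote escaping that the apostrophe in "Night's Watch" requires in a shell command.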

detect_language
boolean
default:true
deprecated

[Deprecated] Use language_config instead. Detect the language from the given audio

enable_code_switching
boolean
default:false
deprecated

[Deprecated] Use language_config instead. Detect multiple languages in the given audio

code_switching_config
object
deprecated

[Deprecated] Use language_config instead. Specify the configuration for code switching

language
enum<string>
deprecated

[Deprecated] Use language_config instead. Set the spoken language for the given audio (ISO 639 standard). Specify the language in which it will be pronounced when sound comparison occurs. Defaults to the transcription language.

Available options: af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, cs, cy, da, de, el, en, es, et, eu, fa, fi, fo, fr, gl, gu, ha, haw, he, hi, hr, ht, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, ro, ru, sa, sd, si, sk, sl, sn, so, sq, sr, su, sv, sw, ta, te, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, yi, yo, zh

Example:

"en"

callback_url
string<uri>
deprecated

[Deprecated] Use callback/callback_config instead. Callback URL to which a POST request containing the transcription result will be sent

Example:

"http://callback.example"

callback
boolean
default:false

Enable callback for this transcription. If true, the callback_config property will be used to customize the callback behaviour

callback_config
object

Customize the callback behaviour (URL and HTTP method)
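
Because `callback_url` is deprecated in favour of `callback` and `callback_config`, an existing payload may need updating. The following is a sketch of that change, assuming the `url` and `method` keys shown in the request example; `with_callback` is a hypothetical helper name.

```python
def with_callback(payload: dict, url: str, method: str = "POST") -> dict:
    """Return a copy of `payload` that uses callback/callback_config."""
    # Drop the deprecated field so the two styles are never sent together.
    updated = {key: val for key, val in payload.items() if key != "callback_url"}
    updated["callback"] = True
    updated["callback_config"] = {"url": url, "method": method}
    return updated


old_payload = {
    "audio_url": "http://files.gladia.io/example/audio-transcription/split_infinity.wav",
    "callback_url": "http://callback.example",
}
new_payload = with_callback(old_payload, old_payload["callback_url"])
```

The helper returns a new dictionary and leaves the original payload untouched, which keeps the old and new forms available for comparison.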

subtitles
boolean
default:false

Enable subtitles generation for this transcription

subtitles_config
object

Configuration for subtitles generation if subtitles is enabled

diarization
boolean
default:false

Enable speaker recognition (diarization) for this audio

diarization_config
object

Speaker recognition configuration, if diarization is enabled

translation
boolean
default:false

[Beta] Enable translation for this audio

translation_config
object

[Beta] Translation configuration, if translation is enabled

summarization
boolean
default:false

[Beta] Enable summarization for this audio

summarization_config
object

[Beta] Summarization configuration, if summarization is enabled

moderation
boolean
default:false

[Alpha] Enable moderation for this audio

named_entity_recognition
boolean
default:false

[Alpha] Enable named entity recognition for this audio

chapterization
boolean
default:false

[Alpha] Enable chapterization for this audio

name_consistency
boolean
default:false

[Alpha] Enable name consistency for this audio

custom_spelling
boolean
default:false

[Alpha] Enable custom spelling for this audio

custom_spelling_config
object

[Alpha] Custom spelling configuration, if custom_spelling is enabled

structured_data_extraction
boolean
default:false

[Alpha] Enable structured data extraction for this audio

structured_data_extraction_config
object

[Alpha] Structured data extraction configuration, if structured_data_extraction is enabled

sentiment_analysis
boolean
default:false

Enable sentiment analysis for this audio

audio_to_llm
boolean
default:false

[Alpha] Enable audio-to-LLM processing for this audio

audio_to_llm_config
object

[Alpha] Audio-to-LLM configuration, if audio_to_llm is enabled
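
Each `*_config` object above is described as applying only when its flag is enabled, so a config sent with its flag left at `false` is probably not doing what the caller intended. The sketch below lists such configs before a request is sent. The `FEATURES` list is assembled by hand from this page, and the assumption that the API ignores a config whose flag is off is an inference from the descriptions, not confirmed behaviour.

```python
# Flag names that have a matching "<flag>_config" object on this page.
FEATURES = [
    "custom_vocabulary",
    "callback",
    "subtitles",
    "diarization",
    "translation",
    "summarization",
    "custom_spelling",
    "structured_data_extraction",
    "audio_to_llm",
]


def inactive_configs(payload: dict) -> list[str]:
    """Return the config keys whose corresponding flag is not set to true."""
    return [
        f"{feature}_config"
        for feature in FEATURES
        if f"{feature}_config" in payload and payload.get(feature) is not True
    ]


payload = {
    "audio_url": "http://files.gladia.io/example/audio-transcription/split_infinity.wav",
    "diarization": False,
    "diarization_config": {"min_speakers": 1, "max_speakers": 2},
    "audio_to_llm": True,
    "audio_to_llm_config": {
        "prompts": ["Extract the key points from the transcription"],
    },
}
print(inactive_configs(payload))
```

Run against the full request example on this page, the check would report every listed config, because all of the corresponding flags there are `false`.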

custom_metadata
object

Custom metadata you can attach to this transcription

Example:

{ "user": "John Doe" }

sentences
boolean
default:false

Enable sentences for this audio

display_mode
boolean
default:false

[Alpha] Change the output display mode for this audio. The output is reordered, and new utterances are created where speakers overlap

punctuation_enhanced
boolean
default:false

[Alpha] Use enhanced punctuation for this audio

language_config
object

Specify the language configuration
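
Four deprecated fields (`detect_language`, `enable_code_switching`, `code_switching_config`, `language`) all point to `language_config` as their replacement. The sketch below shows one possible migration, using the `languages` and `code_switching` keys from the request example. The mapping itself is an assumption: this page does not spell out how the old fields translate, and in particular the idea that an empty `languages` list stands in for `detect_language` is a guess.

```python
DEPRECATED_LANGUAGE_KEYS = (
    "detect_language",
    "enable_code_switching",
    "code_switching_config",
    "language",
)


def migrate_language_fields(payload: dict) -> dict:
    """Return a copy of `payload` with deprecated language fields folded into language_config."""
    # Assumed mapping: code_switching_config.languages and language -> languages.
    languages = list((payload.get("code_switching_config") or {}).get("languages", []))
    language = payload.get("language")
    if language and language not in languages:
        languages.insert(0, language)

    migrated = {
        key: val for key, val in payload.items() if key not in DEPRECATED_LANGUAGE_KEYS
    }
    migrated["language_config"] = {
        "languages": languages,
        # Assumed mapping: enable_code_switching -> code_switching.
        "code_switching": bool(payload.get("enable_code_switching", False)),
    }
    return migrated


legacy = {
    "audio_url": "http://files.gladia.io/example/audio-transcription/split_infinity.wav",
    "language": "en",
    "detect_language": False,
    "enable_code_switching": True,
    "code_switching_config": {"languages": ["fr"]},
}
migrated = migrate_language_fields(legacy)
```

It would be worth checking the resulting payload against the full `language_config` schema before relying on this mapping.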

Response

The transcription job has been initiated

id
string<uuid>
required

ID of the job

Example:

"45463597-20b7-4af7-b3b3-f5fb778203ab"

result_url
string<uri>
required

Prebuilt URL with your transcription id to fetch the result

Example:

"https://api.gladia.io/v2/transcription/45463597-20b7-4af7-b3b3-f5fb778203ab"
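
A small sketch of handling this response follows: it decodes the JSON body, checks that `id` parses as a UUID, and returns both required fields. `parse_job_response` is a hypothetical helper, and the sample body is the example response from this page.

```python
import json
import uuid


def parse_job_response(raw: str) -> tuple[str, str]:
    """Extract (id, result_url) from the job-creation response body."""
    data = json.loads(raw)
    job_id = data["id"]
    result_url = data["result_url"]
    # `id` is documented as string<uuid>; uuid.UUID raises ValueError otherwise.
    uuid.UUID(job_id)
    return job_id, result_url


sample = """{
  "id": "45463597-20b7-4af7-b3b3-f5fb778203ab",
  "result_url": "https://api.gladia.io/v2/transcription/45463597-20b7-4af7-b3b3-f5fb778203ab"
}"""
job_id, result_url = parse_job_response(sample)
```

Since `result_url` is prebuilt with the job id, the caller can store it directly and fetch the result from that URL later.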