API Documentation
Get result
Get the status, parameters, and result of a pre-recorded transcription.
Authorizations
Your personal Gladia API key
Path Parameters
Id of the pre-recorded job
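As a minimal sketch of how the API key and the job id fit together, the helper below builds (but does not send) the request. The base URL `https://api.gladia.io/v2/pre-recorded` and the header name `x-gladia-key` are assumptions, since neither is spelled out in this section; `build_get_result_request` is a hypothetical helper name.

```python
import urllib.request
from urllib.parse import quote

# Assumed values: neither the base URL nor the header name is stated in this section.
BASE_URL = "https://api.gladia.io/v2/pre-recorded"
API_KEY_HEADER = "x-gladia-key"


def build_get_result_request(job_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one pre-recorded job."""
    if not job_id:
        raise ValueError("job_id must be a non-empty string")
    # Percent-encode the id so it is always a single path segment.
    url = f"{BASE_URL}/{quote(job_id, safe='')}"
    return urllib.request.Request(url, headers={API_KEY_HEADER: api_key}, method="GET")
```

Passing the returned object to `urllib.request.urlopen` and decoding the body with `json.load` would then yield the response described below, assuming the endpoint returns JSON.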
Response
Creation date
Id of the job
pre-recorded
Debug id
"queued": the job has been queued. "processing": the job is being processed. "done": the job has been processed and the result is available. "error": an error occurred during the job's processing.
queued, processing, done, error
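Because "queued" and "processing" are transient while "done" and "error" are final, a client typically polls until it sees a final status. The following is a sketch under assumptions: the field name `status` is inferred rather than shown on this page, and `fetch` is a hypothetical callable that returns the decoded response for the job.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"done", "error"}
KNOWN_STATUSES = {"queued", "processing"} | TERMINAL_STATUSES


def wait_for_result(
    fetch: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Call `fetch` until the job reports "done" or "error", then return that response."""
    for attempt in range(max_attempts):
        job = fetch()
        status = job.get("status")  # assumed field name
        if status not in KNOWN_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        if status in TERMINAL_STATUSES:
            return job
        # Sleep between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"job still not finished after {max_attempts} attempts")
```

Injecting `fetch` keeps the loop independent of any HTTP library, so the same function can be exercised with canned responses.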
API version
Completion date when status is "done" or "error"
Custom metadata given in the initial request
HTTP status code of the error if status is "error"
400 < x < 599
The file data you uploaded. Can be null if status is "error"
Duration of the audio file
The name of the uploaded file
The file id
Number of channels in the audio file
x > 1
The link used to download the file if audio_url was used
Parameters used for this pre-recorded transcription. Can be null if status is "error"
[Alpha] Enable audio to llm processing for this audio
[Alpha] Audio to llm configuration, if audio_to_llm is enabled
The list of prompts applied to the audio transcription
Callback URL to which we will send a POST request with the result of the transcription
[Alpha] Enable chapterization for this audio
Specify the configuration for code switching
Specify the languages you want to use when detecting multiple languages
af, sq, am, ar, hy, as, az, ba, eu, be, bn, bs, br, bg, ca, zh, hr, cs, da, nl, en, et, fo, fi, fr, gl, ka, de, el, gu, ht, ha, haw, he, hi, hu, is, id, it, ja, jv, kn, kk, km, ko, lo, la, lv, ln, lt, lb, mk, mg, ms, ml, mt, mi, mr, mn, mymr, ne, no, nn, oc, ps, fa, pl, pt, pa, ro, ru, sa, sr, sn, sd, si, sk, sl, so, es, su, sw, sv, tl, tg, ta, tt, te, th, bo, tr, tk, uk, ur, uz, vi, cy, yi, yo, jp
[Alpha] Context to feed the transcription model with for possible better accuracy
[Alpha] Enable custom spelling for this audio
[Alpha] Custom spelling configuration, if custom_spelling is enabled
The list of spellings applied to the audio transcription
[Alpha] Specific vocabulary list to feed the transcription model with
Detect the language from the given audio
Enable speaker recognition (diarization) for this audio
Speaker recognition configuration, if diarization is enabled
[Alpha] Use enhanced diarization for this audio
Maximum number of speakers in the audio
x > 0
Minimum number of speakers in the audio
x > 0
Exact number of speakers in the audio
x > 0
[Alpha] Changes the output display_mode for this audio. The output will be reordered, creating new utterances when speakers overlap
Detect multiple languages in the given audio
Set the spoken language for the given audio (ISO 639 standard)
af, sq, am, ar, hy, as, az, ba, eu, be, bn, bs, br, bg, ca, zh, hr, cs, da, nl, en, et, fo, fi, fr, gl, ka, de, el, gu, ht, ha, haw, he, hi, hu, is, id, it, ja, jv, kn, kk, km, ko, lo, la, lv, ln, lt, lb, mk, mg, ms, ml, mt, mi, mr, mn, mymr, ne, no, nn, oc, ps, fa, pl, pt, pa, ro, ru, sa, sr, sn, sd, si, sk, sl, so, es, su, sw, sv, tl, tg, ta, tt, te, th, bo, tr, tk, uk, ur, uz, vi, cy, yi, yo, jp
[Alpha] Enable moderation for this audio
[Alpha] Enable names consistency for this audio
[Alpha] Enable named entity recognition for this audio
[Alpha] Use enhanced punctuation for this audio
Enable sentences for this audio
[Alpha] Enable sentiment analysis for this audio
[Alpha] Enable structured data extraction for this audio
[Alpha] Structured data extraction configuration, if structured_data_extraction is enabled
The list of classes to extract from the audio transcription
Enable subtitles generation for this transcription
Configuration for subtitles generation, if subtitles is enabled
srt, vtt
Maximum number of characters per row in a subtitle
x > 1
Maximum duration of a subtitle in seconds
1 < x < 30
Maximum number of rows per caption
1 < x < 5
Minimum duration of a subtitle in seconds
x > 0
Style of the subtitles. Compliance mode refers to: https://loc.gov/preservation/digital/formats//fdd/fdd000569.shtml#:~:text=SRT%20files%20are%20basic%20text,alongside%2C%20example%3A%20%22MyVideo123
default, compliance
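The subtitle constraints above can be checked on the client before a job is submitted. This is only a sketch: the key names (`formats`, `maximum_characters_per_row`, `maximum_duration`, `maximum_rows_per_caption`, `minimum_duration`, `style`) are hypothetical, because the flattened listing does not show the real field names, and the bounds are taken exactly as printed (strict inequalities). If the API actually accepts the endpoints, the comparisons should be relaxed.

```python
from typing import Any, Dict, List

ALLOWED_FORMATS = {"srt", "vtt"}
ALLOWED_STYLES = {"default", "compliance"}


def validate_subtitles_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the config passed every check."""
    problems: List[str] = []

    for fmt in config.get("formats", []):
        if fmt not in ALLOWED_FORMATS:
            problems.append(f"unknown format: {fmt!r}")

    style = config.get("style")
    if style is not None and style not in ALLOWED_STYLES:
        problems.append(f"unknown style: {style!r}")

    # (hypothetical key, lower bound, upper bound); bounds are exclusive, as printed.
    bounds = [
        ("maximum_characters_per_row", 1, None),
        ("maximum_duration", 1, 30),
        ("maximum_rows_per_caption", 1, 5),
        ("minimum_duration", 0, None),
    ]
    for key, low, high in bounds:
        value = config.get(key)
        if value is None:
            continue  # unset fields are left to the API defaults
        if value <= low or (high is not None and value >= high):
            problems.append(f"{key}={value} is outside the documented range")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every invalid field at once.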
[Beta] Enable summarization for this audio
[Beta] Summarization configuration, if summarization is enabled
The type of summarization to apply
general, bullet_points, concise
[Beta] Enable translation for this audio
[Beta] Translation configuration, if translation is enabled
af, sq, am, ar, hy, as, ast, az, ba, eu, be, bn, bs, br, bg, my, ca, ceb, zh, hr, cs, da, nl, en, et, fo, fi, nl, fr, fy, ff, gd, gl, lg, ka, de, el, gu, ht, ha, haw, he, hi, hu, is, ig, ilo, id, ga, it, ja, jp, jv, kn, kk, km, ko, lo, la, lv, ln, lt, lb, mk, mg, ms, ml, mt, mi, mr, mo, mn, mymr, ne, no, nn, oc, or, pa, ps, fa, pl, pt, pa, ro, ru, sa, sr, sn, sd, si, sk, sl, so, es, su, sw, ss, sv, tl, tg, ta, tt, te, th, bo, tn, tr, tk, uk, ur, uz, vi, cy, wo, xh, yi, yo, zu
Align translated utterances with the original ones
Model to use for the translation
base, enhanced
Pre-recorded transcription's result when status is "done"
Metadata for the given transcription & audio file
Duration of the transcribed audio file
Billed duration in seconds (audio_duration * number_of_distinct_channels)
Number of distinct channels in the transcribed audio file
x > 1
Duration of the transcription in seconds
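The billed duration formula given above is a plain product, which the following sketch makes concrete. The function name is hypothetical; the two inputs are the values described in this metadata block.

```python
def billed_duration(audio_duration: float, number_of_distinct_channels: int) -> float:
    """Billed seconds = audio duration in seconds * number of distinct channels."""
    if audio_duration < 0:
        raise ValueError("audio_duration must not be negative")
    if number_of_distinct_channels < 1:
        raise ValueError("number_of_distinct_channels must be at least 1")
    return audio_duration * number_of_distinct_channels


# A 90-second file with two distinct channels is billed as 180 seconds.
print(billed_duration(90.0, 2))  # 180.0
```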
If audio_to_llm has been enabled, audio to llm results of the audio speech transcription
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If audio_to_llm has been enabled, results of the AI custom analysis
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
The audio intelligence model succeeded in producing a valid output
The audio intelligence model succeeded in producing a valid output
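Every audio intelligence result in this response repeats the same envelope: a success flag, an error that is null on success, an execution time, an "empty" flag, and the results themselves. A small helper can unwrap that envelope uniformly. In this sketch only `success` is named on this page; `error`, `is_empty`, and `results` are assumed key names, and `unwrap_addon` is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def unwrap_addon(addon: Optional[Dict[str, Any]]) -> Any:
    """Return the add-on's results, or None when it was disabled or produced nothing."""
    if addon is None:
        return None  # the feature was not enabled for this job
    if not addon.get("success", False):
        # On failure the error field carries the details of the failed model.
        raise RuntimeError(f"add-on failed: {addon.get('error')}")
    if addon.get("is_empty", False):
        return None  # the model ran but returned an empty value
    return addon.get("results")
```

Raising on failure while returning `None` for a disabled or empty add-on keeps "nothing to show" separate from "something went wrong".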
If chapterization has been enabled, will generate chapter names for the different parts of the given audio.
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If chapterization has been enabled, will generate chapter names for the different parts of the given audio.
The audio intelligence model succeeded in producing a valid output
If custom_spelling has been enabled, Gladia will correct the spelling of the transcription
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If custom_spelling has been enabled, Gladia will correct the spelling of the transcription
The audio intelligence model succeeded in producing a valid output
If display_mode has been enabled, the output will be reordered, creating new utterances when speakers overlap
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If display_mode has been enabled, proposes an alternative display output.
The audio intelligence model succeeded in producing a valid output
If moderation has been enabled, moderation of the audio speech transcription
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If moderation has been enabled, moderated transcription
The audio intelligence model succeeded in producing a valid output
If name_consistency has been enabled, Gladia will improve the consistency of the names across the transcription
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If name_consistency has been enabled, Gladia will improve the consistency of the names across the transcription
The audio intelligence model succeeded in producing a valid output
If named_entity_recognition has been enabled, the detected entities
If named_entity_recognition has been enabled, the detected entities.
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
The audio intelligence model succeeded in producing a valid output
If sentences has been enabled, sentences of the audio speech transcription. Deprecated: content will move to the transcription object.
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If sentences has been enabled, transcription as sentences.
The audio intelligence model succeeded in producing a valid output
If sentiment_analysis has been enabled, sentiment analysis of the audio speech transcription
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If sentiment_analysis has been enabled, Gladia will analyze the sentiments and emotions of the audio
The audio intelligence model succeeded in producing a valid output
If speaker_reidentification has been enabled, results of the AI speaker reidentification.
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If speaker_reidentification has been enabled, results of the AI speaker reidentification.
The audio intelligence model succeeded in producing a valid output
If structured_data_extraction has been enabled, structured data extraction results
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If structured_data_extraction has been enabled, results of the AI structured data extraction for the defined classes.
The audio intelligence model succeeded in producing a valid output
If summarization has been enabled, summarization of the audio speech transcription
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If summarization has been enabled, summary of the transcription
The audio intelligence model succeeded in producing a valid output
Transcription of the audio speech
Full transcription in text format, without any other information
af, sq, am, ar, hy, as, ast, az, ba, eu, be, bn, bs, br, bg, my, ca, ceb, zh, hr, cs, da, nl, en, et, fo, fi, fr, fy, ff, gd, gl, lg, ka, de, el, gu, ht, ha, haw, he, hi, hu, is, ig, ilo, id, ga, it, ja, jv, kn, kk, km, ko, lo, la, lv, ln, lt, lb, mk, mg, ms, ml, mt, mi, mr, mo, mn, mymr, ne, no, nn, oc, or, pa, ps, fa, pl, pt, pa, ro, ru, sa, sr, sn, sd, si, sk, sl, so, es, su, sw, ss, sv, tl, tg, ta, tt, te, th, bo, tn, tr, tk, uk, ur, uz, vi, cy, wo, xh, yi, yo, zu
Transcribed speech utterances present in the audio
Audio channel this utterance was transcribed from
x > 0
Confidence on the transcribed utterance (1 = 100% confident)
End timestamp in seconds of this utterance
All the detected languages in the audio, sorted from the most detected to the least detected
af, sq, am, ar, hy, as, ast, az, ba, eu, be, bn, bs, br, bg, my, ca, ceb, zh, hr, cs, da, nl, en, et, fo, fi, fr, fy, ff, gd, gl, lg, ka, de, el, gu, ht, ha, haw, he, hi, hu, is, ig, ilo, id, ga, it, ja, jv, kn, kk, km, ko, lo, la, lv, ln, lt, lb, mk, mg, ms, ml, mt, mi, mr, mo, mn, mymr, ne, no, nn, oc, or, pa, ps, fa, pl, pt, pa, ro, ru, sa, sr, sn, sd, si, sk, sl, so, es, su, sw, ss, sv, tl, tg, ta, tt, te, th, bo, tn, tr, tk, uk, ur, uz, vi, cy, wo, xh, yi, yo, zu
Start timestamp in seconds of this utterance
Transcription for this utterance
List of words of the utterance, split by timestamp
Confidence on the transcribed word (1 = 100% confident)
End timestamp in seconds of the spoken word
Start timestamp in seconds of the spoken word
Spoken word
If diarization is enabled, speaker identification number
x > 0
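Since each utterance carries start and end timestamps in seconds, its text, and optionally a speaker number, a readable transcript can be rendered from the list. This is a sketch with hypothetical key names (`start`, `end`, `text`, `speaker`), as the flattened listing above does not show the actual field names.

```python
from typing import Any, Dict, Iterable, List


def format_timestamp(seconds: float) -> str:
    """Format a duration in seconds as MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    minutes, remainder_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(remainder_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def render_utterances(utterances: Iterable[Dict[str, Any]]) -> List[str]:
    """Turn utterance objects into one printable line each."""
    lines: List[str] = []
    for utt in utterances:
        speaker = utt.get("speaker")  # absent when diarization is off
        label = f"speaker {speaker}" if speaker is not None else "speaker ?"
        span = f"{format_timestamp(utt['start'])} - {format_timestamp(utt['end'])}"
        lines.append(f"[{span}] {label}: {utt['text']}")
    return lines
```

For example, an utterance starting at 0.0 s and ending at 1.25 s with text "Hello" and speaker 0 renders as `[00:00.000 - 00:01.250] speaker 0: Hello`.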
If sentences has been enabled, sentences results
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If sentences has been enabled, transcription as sentences.
The audio intelligence model succeeded in producing a valid output
If subtitles has been enabled, subtitles results
If translation has been enabled, translation of the audio speech transcription
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
List of translated transcriptions, one for each of the target_languages
Contains the error details of the failed addon
Full transcription in text format, without any other information
af, sq, am, ar, hy, as, ast, az, ba, eu, be, bn, bs, br, bg, my, ca, ceb, zh, hr, cs, da, nl, en, et, fo, fi, fr, fy, ff, gd, gl, lg, ka, de, el, gu, ht, ha, haw, he, hi, hu, is, ig, ilo, id, ga, it, ja, jv, kn, kk, km, ko, lo, la, lv, ln, lt, lb, mk, mg, ms, ml, mt, mi, mr, mo, mn, mymr, ne, no, nn, oc, or, pa, ps, fa, pl, pt, pa, ro, ru, sa, sr, sn, sd, si, sk, sl, so, es, su, sw, ss, sv, tl, tg, ta, tt, te, th, bo, tn, tr, tk, uk, ur, uz, vi, cy, wo, xh, yi, yo, zu
Transcribed speech utterances present in the audio
Audio channel this utterance was transcribed from
x > 0
Confidence on the transcribed utterance (1 = 100% confident)
End timestamp in seconds of this utterance
All the detected languages in the audio, sorted from the most detected to the least detected
af, sq, am, ar, hy, as, ast, az, ba, eu, be, bn, bs, br, bg, my, ca, ceb, zh, hr, cs, da, nl, en, et, fo, fi, fr, fy, ff, gd, gl, lg, ka, de, el, gu, ht, ha, haw, he, hi, hu, is, ig, ilo, id, ga, it, ja, jv, kn, kk, km, ko, lo, la, lv, ln, lt, lb, mk, mg, ms, ml, mt, mi, mr, mo, mn, mymr, ne, no, nn, oc, or, pa, ps, fa, pl, pt, pa, ro, ru, sa, sr, sn, sd, si, sk, sl, so, es, su, sw, ss, sv, tl, tg, ta, tt, te, th, bo, tn, tr, tk, uk, ur, uz, vi, cy, wo, xh, yi, yo, zu
Start timestamp in seconds of this utterance
Transcription for this utterance
List of words of the utterance, split by timestamp
If diarization is enabled, speaker identification number
x > 0
If sentences has been enabled, sentences results for this translation
null if success is true. Contains the error details of the failed model
Time the audio intelligence model took to complete the task
The audio intelligence model returned an empty value
If sentences has been enabled, transcription as sentences.
The audio intelligence model succeeded in producing a valid output
If subtitles has been enabled, subtitles results for this translation
The audio intelligence model succeeded in producing a valid output
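Because the translation results hold one translated transcription per target language, it can be convenient to index them by language code. The sketch below assumes each entry exposes a `languages` list and a `full_transcript` string; both key names are hypothetical, as is the helper name `index_translations`.

```python
from typing import Any, Dict, Iterable


def index_translations(results: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Map each language code to the full translated transcript for that language."""
    by_language: Dict[str, str] = {}
    for item in results:
        for lang in item.get("languages", []):
            # Keep the first transcript seen for a language; later duplicates are ignored.
            by_language.setdefault(lang, item.get("full_transcript", ""))
    return by_language
```

With such an index, `index_translations(results).get("fr")` returns the French text when it is present and `None` otherwise.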