POST /v2/live

Authorizations

x-gladia-key
string
header
required

Your personal Gladia API key

Body

application/json
encoding
enum<string>
default: wav/pcm

The encoding format of the audio stream. Supported formats:

  • PCM: 8, 16, 24, and 32 bits
  • A-law: 8 bits
  • μ-law: 8 bits

Note: Raw audio does not need WAV headers. The API accepts audio both with and without them.

Available options:
wav/pcm,
wav/alaw,
wav/ulaw
bit_depth
enum<number>
default: 16

The bit depth of the audio stream

Available options:
8,
16,
24,
32
sample_rate
enum<number>
default: 16000

The sample rate of the audio stream

Available options:
8000,
16000,
32000,
44100,
48000
channels
integer
default: 1

The number of channels of the audio stream

Required range: 1 <= x <= 8
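
The four audio parameters above together determine how many bytes one second of raw audio occupies, which is useful when sizing the chunks sent over the stream. The following is a minimal sketch, not part of the API: `validate_audio_config` and `bytes_per_second` are hypothetical helper names, and `channels` is checked against an inclusive 1 to 8 range, which is consistent with the documented default of 1.

```python
# Documented options for the audio stream parameters.
ENCODINGS = {"wav/pcm", "wav/alaw", "wav/ulaw"}
BIT_DEPTHS = {8, 16, 24, 32}
SAMPLE_RATES = {8000, 16000, 32000, 44100, 48000}


def validate_audio_config(encoding="wav/pcm", bit_depth=16,
                          sample_rate=16000, channels=1):
    """Return the audio settings as a dict, or raise ValueError.

    Hypothetical client-side check; the server remains the authority.
    """
    if encoding not in ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding!r}")
    if bit_depth not in BIT_DEPTHS:
        raise ValueError(f"unsupported bit_depth: {bit_depth!r}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate!r}")
    # Assumption: bounds are inclusive, since the default value is 1.
    if not 1 <= channels <= 8:
        raise ValueError(f"channels out of range: {channels!r}")
    return {
        "encoding": encoding,
        "bit_depth": bit_depth,
        "sample_rate": sample_rate,
        "channels": channels,
    }


def bytes_per_second(config):
    """Raw byte rate: samples/s * channels * bytes per sample."""
    return config["sample_rate"] * config["channels"] * config["bit_depth"] // 8
```

With the defaults (16-bit, 16000 Hz, mono), `bytes_per_second(validate_audio_config())` evaluates to 32000, so a 100 ms chunk would hold 3200 bytes.
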
custom_metadata
object

Custom metadata you can attach to this live transcription

endpointing
number
default: 0.3

The endpointing duration in seconds. Endpointing is the duration of silence that causes an utterance to be considered finished.

Required range: 0.01 < x < 10
maximum_duration_without_endpointing
number
default: 30

The maximum duration in seconds without endpointing. If no endpoint is detected within this duration, the current utterance is considered finished.

Required range: 5 < x < 60
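
To illustrate how `endpointing` and `maximum_duration_without_endpointing` might interact, here is a toy segmentation sketch. It is not the service's implementation: the frame format, the `segment_utterances` name, and the choices to count silence toward the maximum duration and to exclude trailing silence from the utterance length are all assumptions made for the example.

```python
def segment_utterances(frames, endpointing=0.3, max_without_endpointing=30.0):
    """Split a stream of (is_speech, duration_seconds) frames into utterances.

    Returns a list of (reason, utterance_seconds) tuples, where reason is
    "endpoint" (enough trailing silence) or "max_duration" (forced cut).
    An utterance still open at the end of the frames is not reported.
    """
    utterances = []
    active = False   # True once speech has started the current utterance
    elapsed = 0.0    # seconds since the current utterance started
    silence = 0.0    # length of the current trailing silence run
    for is_speech, duration in frames:
        if is_speech:
            active = True
            silence = 0.0
        elif active:
            silence += duration
        if not active:
            continue  # leading silence before any speech is ignored
        elapsed += duration
        if silence >= endpointing:
            # Enough silence: close the utterance, dropping the silent tail.
            utterances.append(("endpoint", round(elapsed - silence, 3)))
            active, elapsed, silence = False, 0.0, 0.0
        elif elapsed >= max_without_endpointing:
            # No endpoint found in time: force the utterance to close.
            utterances.append(("max_duration", round(elapsed, 3)))
            active, elapsed, silence = False, 0.0, 0.0
    return utterances
```

In this sketch a 0.1 s pause does not end an utterance under the default 0.3 s threshold, while a 0.4 s pause does. Lower `endpointing` values would cut utterances sooner at the cost of splitting on short pauses.
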
language_config
object

Specify the language configuration

pre_processing
object

Specify the pre-processing configuration

realtime_processing
object

Specify the realtime processing configuration

post_processing
object

Specify the post-processing configuration

messages_config
object

Specify the websocket messages configuration

callback
boolean
default: false

If true, messages will be sent to the configured URL.

callback_config
object

Specify the callback configuration
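
As a rough sketch of how the request described above could be assembled, the following builds (but does not send) a `POST /v2/live` request with Python's standard library. The base URL `https://api.gladia.io` is an assumption, since this section does not state the host, and `build_live_request` is a hypothetical helper name. Only body fields documented above are used.

```python
import json
import urllib.request

# Assumption: the API host is not given in this section.
BASE_URL = "https://api.gladia.io"


def build_live_request(api_key, **options):
    """Build the POST /v2/live request without sending it.

    `options` are body fields such as encoding, sample_rate or endpointing;
    omitted fields fall back to the server-side defaults.
    """
    body = json.dumps(options).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + "/v2/live",
        data=body,
        method="POST",
        headers={
            "x-gladia-key": api_key,             # authorization header
            "Content-Type": "application/json",  # body media type
        },
    )


request = build_live_request(
    "YOUR_API_KEY",  # placeholder, not a real key
    encoding="wav/pcm",
    bit_depth=16,
    sample_rate=16000,
    channels=1,
    endpointing=0.3,
)
```

Passing `request` to `urllib.request.urlopen` would perform the call. Keeping construction separate from sending makes the payload easy to inspect before any network traffic happens.
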

Response

201 - application/json
id
string
required

ID of the job

url
string
required

The WebSocket URL to connect to for sending audio data. The URL contains the temporary token that authenticates the session.
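
A minimal sketch of handling the 201 response might look like the following. The `parse_live_response` helper is hypothetical, the sample payload is invented for illustration, and the `ws`/`wss` scheme check is an assumption based on the field being described as a WebSocket URL.

```python
import json
from urllib.parse import urlsplit


def parse_live_response(status, raw_body):
    """Extract (id, url) from the response to POST /v2/live."""
    if status != 201:
        raise RuntimeError(f"unexpected status: {status}")
    payload = json.loads(raw_body)
    # Both fields are documented as required.
    missing = [key for key in ("id", "url") if key not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Assumption: a WebSocket URL uses the ws or wss scheme.
    if urlsplit(payload["url"]).scheme not in ("ws", "wss"):
        raise ValueError("url is not a WebSocket URL")
    return payload["id"], payload["url"]


# Invented sample body, not real service output.
sample = '{"id": "job-123", "url": "wss://example.invalid/live?token=abc"}'
job_id, ws_url = parse_live_response(201, sample)
```

Because the URL embeds the temporary session token, it is probably best kept out of logs and treated like a credential.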