Enabling diarization
Diarization is enabled by sending the diarization parameter in the transcription request.
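As a minimal sketch, the request body might be built as below. Only the diarization key comes from this page. The boolean value, the audio_url field and the example URL are assumptions made for illustration, and no endpoint is shown because none is given here.

```python
import json


def build_transcription_request(audio_url: str) -> dict:
    """Build a request body with diarization switched on.

    "audio_url" is a hypothetical field name used only for this sketch;
    "diarization" is the parameter described above, assumed to be a boolean.
    """
    return {
        "audio_url": audio_url,
        "diarization": True,
    }


payload = build_transcription_request("https://example.com/meeting.wav")
print(json.dumps(payload, indent=2))
```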
Enabling enhanced diarization
For better handling of edge cases and challenging audio, you can enable enhanced diarization by using the enhanced parameter in the diarization_config object.
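A sketch of what this could look like in a request body follows. It assumes that enhanced is a boolean flag and that diarization itself must still be enabled; neither detail is confirmed on this page. The helper name with_enhanced_diarization is hypothetical.

```python
import copy
import json


def with_enhanced_diarization(request_body: dict) -> dict:
    """Return a copy of the request body with enhanced diarization requested.

    Assumption: "enhanced" is a boolean inside "diarization_config", and
    "diarization" stays enabled alongside it.
    """
    body = copy.deepcopy(request_body)
    body["diarization"] = True
    # Create the nested config object if the caller has not supplied one.
    body.setdefault("diarization_config", {})["enhanced"] = True
    return body


base_request = {"diarization": True}
enhanced_request = with_enhanced_diarization(base_request)
print(json.dumps(enhanced_request, indent=2))
```

The helper copies its input, so the original request body is left unchanged.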
Response
When diarization is enabled, each utterance will contain a speaker field, whose value is an index representing the speaker.
Speakers will be assigned indexes by order of appearance (i.e. the 1st speaker will be speaker 0, the 2nd speaker 1, etc).
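To illustrate how the speaker index can be used, the sketch below groups utterances by speaker. The sample utterances are invented for illustration, and the text field is an assumed name; only the speaker field is described on this page.

```python
from collections import defaultdict

# Illustrative data only: "speaker" is the field described above,
# "text" is an assumed field name for the transcribed words.
utterances = [
    {"speaker": 0, "text": "Hi, thanks for joining."},
    {"speaker": 1, "text": "Happy to be here."},
    {"speaker": 0, "text": "Let's get started."},
]


def group_by_speaker(items: list) -> dict:
    """Collect utterance texts under each speaker index, keeping order."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["speaker"]].append(item["text"])
    return dict(grouped)


by_speaker = group_by_speaker(utterances)
for speaker, lines in sorted(by_speaker.items()):
    print(f"Speaker {speaker}: {' '.join(lines)}")
```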
Improving diarization accuracy
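Note: the following parameters are not yet supported by enhanced diarization.
To guide the model, you can provide the expected number of speakers, or a minimum and maximum number of speakers, using the diarization_config.number_of_speakers, diarization_config.min_speakers and diarization_config.max_speakers parameters respectively.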
Important: These parameters are hints, not hard constraints. The actual number of speakers detected by the model may not comply with the provided parameters.
Key | Type | Description |
---|---|---|
diarization_config.number_of_speakers | number | Guiding number of speakers. Asks the model to detect exactly this number of speakers in the audio. |
diarization_config.min_speakers | number | Asks the model to detect no fewer than this number of speakers in the audio. |
diarization_config.max_speakers | number | Asks the model to detect no more than this number of speakers in the audio. |
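The sketch below shows one way to assemble these hints into a diarization_config object. The key names follow the table above. The helper build_diarization_config and its min/max check are illustrative additions, not part of the API.

```python
import json
from typing import Optional


def build_diarization_config(
    number_of_speakers: Optional[int] = None,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
) -> dict:
    """Build a diarization_config dict containing only the hints provided."""
    if (
        min_speakers is not None
        and max_speakers is not None
        and min_speakers > max_speakers
    ):
        # Local sanity check for this sketch; not documented API behaviour.
        raise ValueError("min_speakers must not exceed max_speakers")

    hints = {
        "number_of_speakers": number_of_speakers,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }
    # Omit hints that were not set so the request stays minimal.
    return {key: value for key, value in hints.items() if value is not None}


hinted_request = {
    "diarization": True,
    "diarization_config": build_diarization_config(min_speakers=2, max_speakers=4),
}
print(json.dumps(hinted_request, indent=2))
```

Because the values are hints, code that reads the response should not assume the number of distinct speaker indexes falls within the requested range.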