Audio Transcription:

   Args:

  •       model (str): The AI model used for audio transcription (default: large-v2).

  •       audio (audio): The audio file to transcribe.

  •       audio_url (url): The URL of the file to transcribe, ignored if an audio file is provided. This can be a public file URL or a link from any of the supported social platforms listed in the documentation.

  •       language_behaviour (enum): Define how the speaker's language will be detected (default: automatic single language).

  •       language (enum): If language_behaviour is set to manual, define the language to use for the transcription.

  •       toggle_noise_reduction (boolean): Activate the noise reduction to improve transcription quality (default: False).

  •       transcription_hint (string): Text passed to the Whisper model as context during inference. If empty, this argument is skipped (default: empty string).

  •       toggle_diarization (boolean): Activate the diarization of the audio (default: False).

  •       toggle_direct_translate (boolean): Activate the direct translation of the audio transcription (default: False).

  •       target_translation_language (enum): If toggle_direct_translate is set to true, define the language to use for the translation of the transcription.

  •       toggle_text_emotion_recognition (boolean): Activate the emotion recognition of the audio transcription (default: False).

  •       toggle_summarization (boolean): Activate the summarization of the audio transcription (default: False).

  •       toggle_chapterization (boolean): Activate the chapterization of the audio transcription (default: False).

  •       webhook_url (string): Webhook URL to send the result to. Make sure the URL is reachable over the network (default: False).

  •       diarization_max_speakers (integer): Hint for the maximum number of speakers, 10 at most (default: 2).

  •       output_format (enum): The output format: Plain Text, Text, JSON, SRT, or VTT (SRT and VTT are subtitle file formats for video content) (default: json).

   Returns:

      Dict: transcription
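
The dependencies between the arguments above (language requires a manual language_behaviour, target_translation_language requires toggle_direct_translate, audio_url is ignored when a file is sent, an empty transcription_hint is skipped) can be checked before a request is made. The following is a minimal sketch only: build_transcription_fields and has_audio_file are hypothetical names, the requirement that either a file or a URL be present is an assumption, and the exact enum strings should be taken from the documentation. No network call is made.

```python
from typing import Any, Dict, Optional

MAX_SPEAKERS = 10  # upper bound stated for diarization_max_speakers


def build_transcription_fields(
    audio_url: Optional[str] = None,
    has_audio_file: bool = False,  # hypothetical flag: True when a file is uploaded
    language_behaviour: str = "automatic single language",
    language: Optional[str] = None,
    transcription_hint: str = "",
    toggle_diarization: bool = False,
    diarization_max_speakers: int = 2,
    toggle_direct_translate: bool = False,
    target_translation_language: Optional[str] = None,
    output_format: str = "json",
) -> Dict[str, Any]:
    """Hypothetical helper: apply the documented rules and return request fields."""
    # Assumption: a request needs at least one audio source.
    if not has_audio_file and not audio_url:
        raise ValueError("provide an audio file or an audio_url")
    if language_behaviour == "manual" and not language:
        raise ValueError("language is required when language_behaviour is manual")
    if toggle_direct_translate and not target_translation_language:
        raise ValueError(
            "target_translation_language is required when "
            "toggle_direct_translate is true"
        )
    if not 1 <= diarization_max_speakers <= MAX_SPEAKERS:
        raise ValueError(f"diarization_max_speakers must be 1..{MAX_SPEAKERS}")

    fields: Dict[str, Any] = {
        "language_behaviour": language_behaviour,
        "toggle_diarization": toggle_diarization,
        "diarization_max_speakers": diarization_max_speakers,
        "toggle_direct_translate": toggle_direct_translate,
        "output_format": output_format,
    }
    # audio_url is ignored when an audio file is provided, so leave it out.
    if audio_url and not has_audio_file:
        fields["audio_url"] = audio_url
    # language only applies in manual mode.
    if language_behaviour == "manual":
        fields["language"] = language
    if toggle_direct_translate:
        fields["target_translation_language"] = target_translation_language
    # An empty hint is skipped.
    if transcription_hint:
        fields["transcription_hint"] = transcription_hint
    return fields


# Example: transcribe a remote file with diarization for up to 4 speakers.
example = build_transcription_fields(
    audio_url="https://example.com/meeting.wav",  # placeholder URL
    toggle_diarization=True,
    diarization_max_speakers=4,
)
print(example)
```

The returned dictionary can then be passed as form fields to whichever HTTP client is used to call the endpoint; validating locally surfaces a missing language or an out-of-range speaker count before any audio is uploaded.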
