Translation
Translate your transcriptions & subtitles
This feature is in beta.
We’re looking for feedback to improve this feature; share yours here.
The Translation model generates translations of your transcriptions into one or more target languages. If subtitles and/or sentences are enabled, the translations also include translated results for them. You can translate your transcription into multiple languages in a single API call.
The languages covered by the Translation feature are listed in the API Reference (see translation_config).
Two translation models are available:
- base: Fast, covers most use cases
- enhanced: Slower, but higher quality and with context awareness
Usage
To enable translation, set the "translation" parameter to true.
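As a minimal sketch, a request body with translation enabled could be built as follows. Only the "translation" key comes from this page; the "audio_url" field and its value are placeholders assumed for illustration, so check the API Reference for the actual request shape.

```python
import json

# Request body with translation switched on.
# "audio_url" is a hypothetical input field used only to make the example complete.
payload = {
    "audio_url": "https://example.com/audio.wav",
    "translation": True,
}

# Serialize to the JSON string that would be sent as the request body.
body = json.dumps(payload, indent=2)
print(body)
```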
translation_config Options
The translation feature can be further customized using the translation_config object, which you can provide whenever translation: true is set. Here are the available options:
target_languages
- Description: An array of strings specifying the language codes for the desired translation outputs.
- Example: ["fr", "es"] for French and Spanish.
- Details: The supported language codes can be found in the list of supported languages.
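The sketch below shows one way to assemble target_languages for several languages in a single call. The helper name translation_config_for is hypothetical, and lowercasing the codes is an assumption based on the lowercase example above, not documented behavior.

```python
def translation_config_for(languages):
    """Return a translation_config dict for the given language codes."""
    # Normalize the codes and drop duplicates while preserving order.
    # Lowercasing is an assumption based on the ["fr", "es"] example.
    codes = list(dict.fromkeys(code.strip().lower() for code in languages))
    if not codes or any(not code for code in codes):
        raise ValueError("target_languages needs at least one non-empty code")
    return {"target_languages": codes}


# One request, two target languages.
payload = {
    "translation": True,
    "translation_config": translation_config_for(["fr", "es"]),
}
```

Validating against the actual list of supported languages is left out here, since that list lives in the API Reference.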
model
- Description: Specifies the translation model to be used.
- Values:
  - "base": Fast and covers most use cases.
  - "enhanced": Slower, but offers higher quality and context awareness.
- Default: If not specified, a default model is used (typically “base”; refer to the API Reference for the current default).
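Since the default may change, setting the model explicitly keeps behavior predictable. A small sketch, where with_model is a hypothetical helper and the two accepted values are the ones listed above:

```python
VALID_MODELS = ("base", "enhanced")


def with_model(config: dict, model: str) -> dict:
    """Return a copy of config with the translation model set."""
    # Reject anything other than the two documented model names.
    if model not in VALID_MODELS:
        raise ValueError(f"model must be one of {VALID_MODELS}, got {model!r}")
    return {**config, "model": model}


config = with_model({"target_languages": ["fr"]}, "enhanced")
```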
match_original_utterances (Default: true)
- Description: This boolean option controls whether the translated utterances should be aligned with the original utterances from the transcription.
- Default: true.
- Behavior:
  - When true, the system attempts to match the translated segments (utterances, sentences) to the timing and structure of the original detected speech segments.
  - When false, the translation might be more fluid or natural-sounding in the target language but could deviate from the original utterance segmentation.
- Use Case: Keep as true for most subtitling or dubbing use cases where alignment with original speech is crucial. Set to false if you prioritize a more natural flow in the translated text over strict temporal alignment.
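The choice described above can be sketched as a small helper. The use-case labels ("subtitles", "dubbing", "reading") and the function name config_for are hypothetical; only match_original_utterances and target_languages are parameters from this page.

```python
def config_for(use_case: str) -> dict:
    """Pick match_original_utterances from a coarse use-case label."""
    # Subtitling and dubbing need segments that line up with the source speech;
    # other uses can trade alignment for a more natural flow.
    needs_alignment = use_case in {"subtitles", "dubbing"}
    return {
        "target_languages": ["fr"],
        "match_original_utterances": needs_alignment,
    }


subtitle_config = config_for("subtitles")
reading_config = config_for("reading")
```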
lipsync (Default: true)
This option controls the behavior of the translation’s alignment with visual cues, specifically lip movements.
- How it works: When lipsync is set to true (the default value), the translation process utilizes an advanced lip synchronization matching algorithm. This algorithm is designed to align the translated audio or subtitles with the speaker’s lip movements by leveraging timestamps derived from lip activity.
- Advantages: The primary benefit is an improved synchronization between the translated output and the visual of the speaker. This can significantly enhance the viewing experience, especially for dubbed content or when precise visual timing with speech is important.
- Potential Trade-off: Due to its focus on matching lip movements, the algorithm might occasionally aggregate two distinct spoken words into a single “word” object within the translated output. This means that while the timing aligns well with the lips, the direct one-to-one correspondence between source words and translated words might sometimes be altered to achieve better visual sync.
- When to disable: If a strict, word-for-word translation format is an absolute requirement, and minor deviations for the sake of lip synchronization are not acceptable, you should set lipsync to false. This will instruct the system to prioritize literal word mapping over visual timing synchronization.
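For a pipeline that depends on one-to-one word mapping, disabling lipsync might look like the sketch below. The helper word_timing_options and its strict_word_mapping argument are hypothetical names introduced for illustration.

```python
def word_timing_options(strict_word_mapping: bool) -> dict:
    """Translate a word-mapping requirement into the lipsync option."""
    # lipsync defaults to true, so it only has to be sent when disabling it;
    # it is set explicitly here so the intent is visible in the request.
    return {"lipsync": not strict_word_mapping}


config = {
    "target_languages": ["es"],
    **word_timing_options(strict_word_mapping=True),
}
```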
context_adaptation (Default: true)
- Description: Enables or disables context-aware translation features that allow the model to adapt translations based on provided context.
- Default: true.
- Behavior:
  - When true, the translation model can utilize contextual information and formality preferences to produce more accurate and appropriate translations.
  - When false, the translation will be performed without context adaptation, using only the source content for translation decisions.
- Use Case: Keep as true for most use cases to benefit from enhanced translation quality. Set to false only if you need purely literal translations without any contextual adjustments.
When context_adaptation is enabled, you can use the following additional parameters:
context (Default: "")
- Description: Provides additional context to improve translation quality and accuracy. This string parameter allows you to supply contextual information about the content being translated.
- Default: "" (empty string).
- Behavior: When provided, the translation model uses this context to better understand domain-specific terminology, proper nouns, or situational context that might influence the translation choices.
- Use Case: Particularly useful for:
  - Technical content where specific terminology is important
  - Content with ambiguous terms that could be translated differently based on context
  - Providing background information about speakers, topics, or settings
- Example: For a medical conversation, you might set context: "Medical consultation between doctor and patient discussing cardiology".
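A sketch of attaching a context string to a configuration, using the medical example above. The helper with_context is hypothetical; it leaves the config unchanged when the context is empty, which matches the documented default of "".

```python
def with_context(config: dict, context: str) -> dict:
    """Return a copy of config carrying a context string, if one is given."""
    context = context.strip()
    if not context:
        # Nothing to add: an absent context is equivalent to the default "".
        return dict(config)
    # context is used when context_adaptation is enabled, so enable it explicitly.
    return {**config, "context_adaptation": True, "context": context}


config = with_context(
    {"target_languages": ["fr"]},
    "Medical consultation between doctor and patient discussing cardiology",
)
```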
informal (Default: false)
- Description: Forces the translation to use informal language forms when available in the target language.
- Default: false.
- Behavior:
  - When true, the translation will use informal pronouns, verb conjugations, and speech patterns appropriate for casual conversation.
  - When false, the translation will default to formal or neutral language forms.
- Use Case: Essential for:
  - Casual conversations or social media content
  - Content targeting younger audiences
  - Maintaining the tone of informal source material
- Language Note: This parameter is particularly relevant for languages with formal/informal distinctions (e.g., French “tu/vous”, German “du/Sie”, Spanish “tú/usted”, Dutch “jij/u”).
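Because context and informal are described as options available when context_adaptation is enabled, a client-side check can catch configurations where they are set alongside context_adaptation: false. This is a sketch under that reading of the text; whether the API ignores or rejects such a combination is not stated here, and check_context_options is a hypothetical helper.

```python
def check_context_options(config: dict) -> list:
    """Return warnings for options that depend on context_adaptation."""
    warnings = []
    # context_adaptation defaults to true when omitted.
    if config.get("context_adaptation", True):
        return warnings
    # Flag dependent options that differ from their documented defaults.
    for key, default in (("context", ""), ("informal", False)):
        if config.get(key, default) != default:
            warnings.append(f"{key!r} is set but context_adaptation is false")
    return warnings


casual = {"target_languages": ["fr", "de"], "informal": True}
literal = {"target_languages": ["fr"], "context_adaptation": False, "informal": True}

print(check_context_options(casual))
print(check_context_options(literal))
```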
Result
The transcription result will contain a "translation" key with the output of the model.
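Reading that key from a parsed result can be sketched as follows. The sample result is a placeholder: the structure under "translation" is defined in the API Reference and is deliberately not reproduced here, and get_translation is a hypothetical helper.

```python
def get_translation(result: dict):
    """Return the value stored under the "translation" key of a result."""
    if "translation" not in result:
        raise KeyError('no "translation" key; was translation enabled in the request?')
    return result["translation"]


# Placeholder result with an empty translation payload, for illustration only.
sample_result = {"translation": {}}
translation = get_translation(sample_result)
```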
If you enabled subtitles generation, the subtitles will also benefit from the translation model.