Because speech-to-text models are trained on general vocabulary, under-represented words such as brand names, proper nouns, or domain-specific terms are often transcribed incorrectly. Custom spelling is a post-processing operation that literally matches the transcript text against the variants you list for each correct word. When there is a literal match, the transcribed text is replaced with your term.
If the word comes out garbled or replaced by something phonetically similar (e.g. “le vin” instead of “Levain”), use Custom vocabulary instead. Custom vocabulary matches on phonemes, not literal text.

How it works

Gladia runs custom spelling on the transcript text after transcription:
  1. Gladia scans the output for strings listed in your dictionary values.
  2. When a variant is found, it is replaced with the corresponding key.
Each dictionary entry supplies:
  • Key: the spelling to write (case-sensitive).
  • Values: variant strings to find (case-insensitive; can be multiple words).
Custom spelling is precise but strict: Gladia replaces only strings listed in your dictionary and leaves everything else unchanged.
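
To make the matching behaviour concrete, here is a minimal local sketch of literal, case-insensitive replacement. It is not Gladia's implementation: the function name `apply_custom_spelling` is hypothetical, and the whole-word boundary handling and longest-variant-first ordering are assumptions, since this page does not specify them.

```python
import re


def apply_custom_spelling(text, spelling_dictionary):
    """Replace every listed variant (case-insensitive) with its key."""
    for correct, variants in spelling_dictionary.items():
        # Try longer variants first so multi-word entries are not shadowed.
        for variant in sorted(variants, key=len, reverse=True):
            # Assumption: match whole words only; the docs do not say how
            # boundaries are handled.
            pattern = r"(?<!\w)" + re.escape(variant) + r"(?!\w)"
            # A function replacement writes the key exactly as given,
            # without interpreting backslashes in it.
            text = re.sub(pattern, lambda m, c=correct: c, text,
                          flags=re.IGNORECASE)
    return text


dictionary = {"Gorish": ["ghorish", "gaurish", "go rich"]}
print(apply_custom_spelling("I met Gaurish and go rich today", dictionary))
print(apply_custom_spelling("I met gorrish today", dictionary))
```

The first call rewrites both listed variants to "Gorish". The second leaves "gorrish" untouched because it is not in the dictionary, which is the strictness described above.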

Example: name “Gorish”

If the model outputs “gaurish” or “ghorish”, Gladia replaces them with “Gorish” when they appear in your dictionary:
"Gorish": ["ghorish", "gaurish", "gaureish", "geurish", "go rich"]
Custom spelling uses literal matching, not phoneme matching, so list every spelling carefully: missing variants are never inferred.

When to use custom vocabulary vs. custom spelling

  • Use Custom spelling when the model outputs a recognizable but wrong form. It applies literal string matching on the variants you list (e.g. “data-science” → “Data Science”). List every close variant the model might output.
  • Use Custom vocabulary when the model outputs garbled or sound-alike text. It applies phoneme-based matching on entries you define (e.g. “le vin” / “levine” → “Levain”). Add pronunciations for each spelling the model might produce.

              | Custom spelling                              | Custom vocabulary
  Matches on  | Exact text in the transcript                 | How words sound
  Best for    | Wrong spelling, punctuation, formatting      | Phonetically similar mis-transcriptions
  You provide | All the words that the model outputs wrongly | value, pronunciations, intensity
Rule of thumb: start with a transcription run without any custom vocabulary. Look at what the output actually says. If the word appears but is just misspelled, custom spelling is the simpler and safer fix. If the word is completely garbled, that’s when custom vocabulary is the right tool.

Example configuration

{
  "custom_spelling": true,
  "custom_spelling_config": {
    "spelling_dictionary": {
      "Gorish": ["ghorish", "gaurish", "gaureish", "geurish", "go rich"],
      "Data Science": ["data-science", "data science"],
      ".": ["period", "full stop"],
      "SQL": ["sequel"]
    }
  }
}
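
As a sketch of how this configuration could be assembled programmatically, the helper below merges a spelling dictionary into an existing request body. The helper name `with_custom_spelling`, the `audio_url` field, and the validation rules are illustrative assumptions, not part of the documented API; only `custom_spelling`, `custom_spelling_config`, and `spelling_dictionary` come from the example above.

```python
import json


def with_custom_spelling(request_body, spelling_dictionary):
    """Return a copy of request_body with custom spelling enabled."""
    # Assumed sanity checks (not documented rules): each key is a non-empty
    # string and each value is a non-empty list of non-empty strings.
    for correct, variants in spelling_dictionary.items():
        if not isinstance(correct, str) or not correct:
            raise ValueError("each key must be a non-empty string")
        if not isinstance(variants, list) or not variants:
            raise ValueError(f"{correct!r}: values must be a non-empty list")
        if not all(isinstance(v, str) and v for v in variants):
            raise ValueError(f"{correct!r}: variants must be non-empty strings")
    body = dict(request_body)  # shallow copy; the caller's dict is not mutated
    body["custom_spelling"] = True
    body["custom_spelling_config"] = {"spelling_dictionary": spelling_dictionary}
    return body


body = with_custom_spelling(
    {"audio_url": "https://example.com/audio.wav"},  # hypothetical field and URL
    {"Gorish": ["ghorish", "gaurish"], "SQL": ["sequel"]},
)
print(json.dumps(body, indent=2))
```

Validating locally before sending catches empty or malformed entries early; check the API reference for the actual constraints on keys and values.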

Parameter reference

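spelling_dictionary
object
Maps each correct spelling (key, case-sensitive) to the list of variant strings to replace (values, case-insensitive).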

Tuning tips

  • Collect variants from real transcripts — run without custom spelling first, then add keys and values from what the model actually outputs.
  • Match key capitalization to how the word should appear in the final transcript.
  • List phonetically different strings separately — custom spelling will not group them the way custom vocabulary does.
  • Move garbled or sound-alike output to custom vocabulary when listing every variant becomes impractical.

A typical tuning workflow:
  1. Transcribe without custom spelling and note misspelled terms.
  2. Route each term: recognizable but wrong spelling → custom spelling; garbled or phonetically wrong → custom vocabulary.
  3. Build the dictionary — correct form as the key, every variant you have seen as values.
  4. Transcribe again — confirm replacements and check that nothing else was changed unexpectedly.
  5. Refine: add new variants as they appear in production audio.
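
For step 4, one way to check that nothing else changed is to diff the transcript from the run without custom spelling against the run with it. The sketch below uses Python's standard `difflib` on whitespace-split words; the function name `unexpected_changes` is hypothetical, and it assumes both transcripts are available as plain strings and that an expected change produces exactly a dictionary key (punctuation attached to words is not handled).

```python
import difflib


def unexpected_changes(before, after, spelling_dictionary):
    """Return (old, new) word-level changes whose new text is not a key."""
    a, b = before.split(), after.split()
    keys = set(spelling_dictionary)
    changes = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old, new = " ".join(a[i1:i2]), " ".join(b[j1:j2])
        # A replacement that yields a dictionary key is what we asked for;
        # anything else deserves a manual look.
        if new not in keys:
            changes.append((old, new))
    return changes


dictionary = {"Gorish": ["gaurish"], "SQL": ["sequel"]}
baseline = "call gaurish about the sequel report"
print(unexpected_changes(baseline, "call Gorish about the SQL report", dictionary))
print(unexpected_changes(baseline, "call Gorish about a SQL report", dictionary))
```

The first comparison reports no unexpected changes. The second flags the span where "the" became "a" alongside the intended replacement, so it can be reviewed by hand.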