# Recommended Parameters by Use Case

> Best parameter configurations for pre-recorded transcription depending on your application (Meeting Recorders, Call Centers, Podcasts, Subtitles, Multilingual Content).

The right parameter configuration can significantly impact transcription quality for pre-recorded audio. This guide covers recommended starting points for common scenarios and highlights pitfalls that frequently trip up new integrations.

<Info>
  These recommendations apply to the **[Pre-recorded
  API](/chapters/pre-recorded-stt/quickstart)** and are passed in the
  `POST /v2/pre-recorded` request body. They are starting points; tune them to
  match your specific needs.
</Info>

***

## Language Configuration

One of the most common configuration mistakes is misunderstanding how `language_config` works. Choosing the right setup avoids unnecessary detection overhead and improves accuracy.

**When to set an explicit language:**

* You **know** the language of the audio ahead of time.
* The audio is **monolingual** (single language throughout).
* You want the **fastest, most accurate** results.

```json theme={"system"}
{
  "language_config": {
    "languages": ["en"],
    "code_switching": false
  }
}
```

**When to use auto-detection:**

* You process audio in **many different languages** and don't know which one beforehand.
* You want Gladia to pick the language automatically.

```json theme={"system"}
{
  "language_config": {
    "languages": [],
    "code_switching": false
  }
}
```

<Warning>
  When `code_switching` is `false` and no language is set, the language is
  detected on the **first utterance** and reused for the rest of the session or
  file. If the beginning of your audio contains silence, music, or a different
  language than the main content, this can lead to incorrect detection for the
  whole transcription.
</Warning>

<Tip>
  Even when using auto-detection, pass a **small list of likely languages** in
  `languages` to constrain the search. This improves both accuracy and
  processing time.
</Tip>
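
The three cases above (one known language, a short list of likely languages, nothing known) can be folded into one small helper. This is a minimal sketch, not an official client: `build_language_config` is a hypothetical name, and it only covers audio that stays in a single language.

```python
from typing import Dict, Optional, Sequence


def build_language_config(expected_languages: Optional[Sequence[str]] = None) -> Dict:
    """Build `language_config` for audio that stays in a single language.

    Hypothetical helper for illustration; only the field names come from
    this guide.
    - One code: explicit language, no detection overhead.
    - A few codes: auto-detection constrained to that short list.
    - Nothing: auto-detection across all supported languages.
    """
    # De-duplicate while keeping the caller's order.
    languages = list(dict.fromkeys(expected_languages or []))
    return {
        "language_config": {
            "languages": languages,
            # The language is not expected to change, so code switching stays off.
            "code_switching": False,
        }
    }


print(build_language_config(["en"]))        # explicit language
print(build_language_config(["en", "fr"]))  # constrained auto-detection
print(build_language_config())              # unconstrained auto-detection
```

The returned dictionary can be merged into the rest of the request body before it is serialized to JSON.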

***

## Code Switching

Code switching (`language_config.code_switching: true`) lets Gladia detect and transcribe **multiple languages** within the same audio, re-evaluating the language on each utterance.

**When to enable it:**

* Speakers **switch languages** mid-conversation (e.g. bilingual meetings, multilingual customer support).
* You need the detected `language` returned **per utterance**.

**When NOT to enable it:**

* The audio is in a **single language** — code switching adds unnecessary processing and can introduce misdetections.
* You've set **exactly one language** in `languages` — in that case `code_switching` is ignored anyway.

```json theme={"system"}
{
  "language_config": {
    "languages": ["en", "fr", "es"],
    "code_switching": true
  }
}
```

<Warning>
  **Do not enable `code_switching` with an empty `languages` list.** When no
  languages are specified, the language detector evaluates every utterance
  against 100+ supported languages, which leads to frequent misdetections —
  especially between similar-sounding languages. Always provide a short list of
  languages you **actually expect** in the audio.
</Warning>
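
A client-side check can catch the two combinations described above before a request is sent. The sketch below is an assumption about how you might guard your own integration; `check_language_config` is a hypothetical helper and the API itself may not reject these combinations.

```python
from typing import Dict, List


def check_language_config(language_config: Dict) -> List[str]:
    """Return warnings for risky `language_config` combinations.

    Hypothetical client-side check; it encodes the guidance in this section
    and does not reflect any validation done by the API.
    """
    languages = language_config.get("languages") or []
    code_switching = bool(language_config.get("code_switching", False))

    problems: List[str] = []
    if code_switching and not languages:
        problems.append(
            "code_switching is enabled with an empty languages list: "
            "provide the short list of languages you expect"
        )
    elif code_switching and len(languages) == 1:
        problems.append(
            "code_switching is ignored when exactly one language is set"
        )
    return problems


for config in (
    {"languages": ["en", "fr", "es"], "code_switching": True},
    {"languages": [], "code_switching": True},
    {"languages": ["en"], "code_switching": True},
):
    print(config, "->", check_language_config(config))
```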

***

## Custom Vocabulary

[Custom vocabulary](/chapters/audio-intelligence/custom-vocabulary) is a post-transcription replacement based on **phoneme similarity**. It's essential for domain-specific terms that speech models frequently mis-transcribe.

**Best practices:**

* **Always provide both** the `custom_vocabulary` flag and a `custom_vocabulary_config`.
* **Add pronunciations** for words that can be said in different ways (accents, foreign speakers). This is more reliable than raising `intensity`.
* **Keep `intensity` moderate** (0.4-0.6). High values increase false positives where unrelated words get replaced.
* **Set `language`** on individual vocabulary entries when your audio is multilingual and a term is pronounced differently depending on the language.

<Tip>
  For simple terms that are already close to their phonetic spelling (e.g. brand
  names), you can pass them as plain strings instead of objects — Gladia will
  use the default intensity.
</Tip>

<CodeGroup>
  ```json Pre-recorded theme={"system"}
  {
    "audio_url": "YOUR_AUDIO_URL",
    "custom_vocabulary": true,
    "custom_vocabulary_config": {
      "vocabulary": [
        "Kubernetes",
        {
          "value": "Gladia",
          "pronunciations": ["Gladya", "Gladiah"],
          "intensity": 0.5
        },
        {
          "value": "PostgreSQL",
          "pronunciations": ["Postgres Q L", "Post gress"],
          "intensity": 0.4
        }
      ],
      "default_intensity": 0.5
    }
  }
  ```

  ```json Live theme={"system"}
  {
    "realtime_processing": {
      "custom_vocabulary": true,
      "custom_vocabulary_config": {
        "vocabulary": [
          "Kubernetes",
          {
            "value": "Gladia",
            "pronunciations": ["Gladya", "Gladiah"],
            "intensity": 0.5
          },
          {
            "value": "PostgreSQL",
            "pronunciations": ["Postgres Q L", "Post gress"],
            "intensity": 0.4
          }
        ],
        "default_intensity": 0.5
      }
    }
  }
  ```
</CodeGroup>
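
When the vocabulary is generated from your own data (product catalogs, participant lists), it can help to build entries programmatically. The following is a rough sketch under stated assumptions: `vocabulary_entry` and `outside_moderate_range` are hypothetical helpers, and the 0.4 to 0.6 bounds are simply the moderate range suggested in this guide.

```python
from typing import Dict, List, Optional, Sequence, Union

VocabularyEntry = Union[str, Dict]


def vocabulary_entry(
    value: str,
    pronunciations: Optional[Sequence[str]] = None,
    intensity: Optional[float] = None,
    language: Optional[str] = None,
) -> VocabularyEntry:
    """Return a plain string for simple terms, an object otherwise.

    Hypothetical helper; the field names mirror the JSON examples above.
    """
    if not pronunciations and intensity is None and language is None:
        return value  # plain string: the default intensity applies
    entry: Dict = {"value": value}
    if pronunciations:
        entry["pronunciations"] = list(pronunciations)
    if intensity is not None:
        entry["intensity"] = intensity
    if language is not None:
        entry["language"] = language
    return entry


def outside_moderate_range(
    vocabulary: Sequence[VocabularyEntry], low: float = 0.4, high: float = 0.6
) -> List[str]:
    """List terms whose explicit intensity falls outside the suggested range."""
    return [
        entry["value"]
        for entry in vocabulary
        if isinstance(entry, dict)
        and "intensity" in entry
        and not low <= entry["intensity"] <= high
    ]


vocabulary = [
    vocabulary_entry("Kubernetes"),
    vocabulary_entry("Gladia", ["Gladya", "Gladiah"], intensity=0.5),
    vocabulary_entry("PostgreSQL", ["Postgres Q L", "Post gress"], intensity=0.9),
]
print(vocabulary)
print("Review intensity for:", outside_moderate_range(vocabulary))
```

Flagging rather than rejecting high intensities keeps the decision with you: a high value is sometimes intentional, but adding pronunciations is usually the more reliable fix.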

***

## Meeting Recorders

For apps that record and process meetings — team stand-ups, board sessions, 1-on-1s — the goal is to produce **structured, actionable meeting notes** with clear speaker attribution. Meetings typically have a known set of participants and benefit heavily from post-processing features like summarization.

| Parameter                                          | Recommended value           | Why                                                                                                                                                           |
| -------------------------------------------------- | --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `diarization`                                      | `true`                      | Attributes speech to each participant. See [Speaker diarization](/chapters/audio-intelligence/speaker-diarization).                                           |
| `diarization_config.min_speakers` / `max_speakers` | Set a range (e.g. `2`-`10`) | Meeting size varies — a range lets the model adapt without over- or under-splitting speakers.                                                                 |
| `summarization`                                    | `true`                      | Generates a summary for quick review. Use `bullet_points` type for action-item style output. See [Summarization](/chapters/audio-intelligence/summarization). |
| `named_entity_recognition`                         | `true`                      | Surfaces people, organizations, dates, and other key entities mentioned during the meeting. See [NER](/chapters/audio-intelligence/named-entity-recognition). |
| `sentences`                                        | `true`                      | Produces well-segmented, readable output suitable for meeting minutes. See [Sentences](/chapters/pre-recorded-stt/features/sentences).                        |
| `language_config.languages`                        | Set explicitly              | Meeting language is almost always known in advance — setting it avoids detection overhead.                                                                    |
| `custom_vocabulary`                                | `true`                      | Add company-specific terms, project names, and participant names for better accuracy.                                                                         |

<Info>
  **Diarization vs. multi-channel:** if each speaker is on a **separate audio channel**, use the `channel` field on each utterance to identify who is speaking; diarization is not needed. See [Multiple channels](/chapters/limits-and-specifications/multiple-channels).

  If all speakers share a **single audio channel**, enable `diarization` to separate the speakers. See [Speaker diarization](/chapters/audio-intelligence/speaker-diarization).
</Info>
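
Put together, the table above might translate into a request body like the one this sketch produces. It is illustrative only: `meeting_request_body` is a hypothetical helper, and selecting the `bullet_points` summary through `summarization_config.type` is an assumption to verify against the Summarization reference.

```python
import json
from typing import Dict, Sequence


def meeting_request_body(
    audio_url: str,
    language: str,
    min_speakers: int = 2,
    max_speakers: int = 10,
    vocabulary: Sequence = (),
) -> Dict:
    """Assemble the meeting-recorder recommendations into one request body."""
    body: Dict = {
        "audio_url": audio_url,
        "language_config": {"languages": [language], "code_switching": False},
        "diarization": True,
        "diarization_config": {
            "min_speakers": min_speakers,
            "max_speakers": max_speakers,
        },
        "summarization": True,
        # Assumption: the summary type is chosen via `summarization_config.type`.
        "summarization_config": {"type": "bullet_points"},
        "named_entity_recognition": True,
        "sentences": True,
    }
    # Only enable custom vocabulary when there is a config to go with the flag.
    if vocabulary:
        body["custom_vocabulary"] = True
        body["custom_vocabulary_config"] = {"vocabulary": list(vocabulary)}
    return body


print(json.dumps(
    meeting_request_body("YOUR_AUDIO_URL", "en", vocabulary=["Project Atlas"]),
    indent=2,
))
```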

***

## Call Centers

For recorded phone calls the priorities are **speaker identification** and **accurate transcription** despite variable audio quality (telephony codecs, background noise, cross-talk).

| Parameter                               | Recommended value              | Why                                                                                                                         |
| --------------------------------------- | ------------------------------ | --------------------------------------------------------------------------------------------------------------------------- |
| `language_config.languages`             | Set explicitly (e.g. `["en"]`) | Call center audio typically has a known language. Setting it avoids detection errors on noisy recordings.                   |
| `diarization`                           | `true`                         | Separates agent and customer speech. See [Speaker diarization](/chapters/audio-intelligence/speaker-diarization).           |
| `diarization_config.number_of_speakers` | `2`                            | Most calls have exactly two participants — giving this hint improves speaker assignment accuracy.                           |
| `custom_vocabulary`                     | `true`                         | Add product names, plan names, and internal terminology.                                                                    |
| `summarization`                         | `true`                         | Automatically generates a summary for agent wrap-up notes. See [Summarization](/chapters/audio-intelligence/summarization). |

<Info>
  **Diarization vs. multi-channel:** if each speaker is on a **separate audio channel**, use the `channel` field on each utterance to identify who is speaking; diarization is not needed. See [Multiple channels](/chapters/limits-and-specifications/multiple-channels).

  If all speakers share a **single audio channel**, enable `diarization` to separate the speakers. See [Speaker diarization](/chapters/audio-intelligence/speaker-diarization).
</Info>
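
The choice between diarization and channel-based attribution can be made when the request body is built. The sketch below assumes you already know whether agent and customer were recorded on separate channels; `call_center_request_body` and its `separate_channels` flag are hypothetical names, not API parameters.

```python
import json
from typing import Dict, Sequence


def call_center_request_body(
    audio_url: str,
    language: str = "en",
    separate_channels: bool = False,
    vocabulary: Sequence = (),
) -> Dict:
    """Assemble the call-center recommendations into one request body."""
    body: Dict = {
        "audio_url": audio_url,
        "language_config": {"languages": [language], "code_switching": False},
        "summarization": True,
    }
    if not separate_channels:
        # Single shared channel: let diarization split agent and customer.
        body["diarization"] = True
        body["diarization_config"] = {"number_of_speakers": 2}
    # With separate channels, rely on each utterance's `channel` field instead.
    if vocabulary:
        body["custom_vocabulary"] = True
        body["custom_vocabulary_config"] = {"vocabulary": list(vocabulary)}
    return body


print(json.dumps(call_center_request_body("YOUR_AUDIO_URL"), indent=2))
print(json.dumps(
    call_center_request_body("YOUR_AUDIO_URL", separate_channels=True), indent=2
))
```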

***

## Podcasts & Interviews

For long-form audio with multiple speakers the focus is on **readability** and **correct speaker attribution**. Transcripts are often repurposed as articles or show notes, so segment quality matters.

| Parameter                                          | Recommended value          | Why                                                                                                                               |
| -------------------------------------------------- | -------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| `diarization`                                      | `true`                     | Essential for multi-speaker content.                                                                                              |
| `diarization_config.min_speakers` / `max_speakers` | Set a range (e.g. `2`-`4`) | Provides a flexible hint when the exact count varies across episodes.                                                             |
| `sentences`                                        | `true`                     | Produces well-segmented, readable output suitable for publishing. See [Sentences](/chapters/pre-recorded-stt/features/sentences). |
| `custom_vocabulary`                                | `true`                     | Add recurring guest names, show-specific terms, and brand names.                                                                  |
| `language_config.languages`                        | Set explicitly             | Podcast language is almost always known in advance.                                                                               |

***

## Subtitles & Captioning

When generating subtitle files from pre-recorded content, tune the formatting parameters for the best viewing experience. Gladia produces SRT and VTT files directly — no post-processing needed. See [Subtitles](/chapters/audio-intelligence/subtitles) for the full parameter reference.

| Parameter                                     | Recommended value  | Why                                                                                                                                        |
| --------------------------------------------- | ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------ |
| `subtitles`                                   | `true`             | Enables subtitle generation.                                                                                                               |
| `subtitles_config.formats`                    | `["srt", "vtt"]`   | Generate both formats to cover different players and platforms.                                                                            |
| `subtitles_config.maximum_characters_per_row` | `42`               | Standard broadcast limit for readability.                                                                                                  |
| `subtitles_config.maximum_rows_per_caption`   | `2`                | Keeps captions compact on screen.                                                                                                          |
| `subtitles_config.style`                      | `"compliance"`     | Uses stricter formatting rules suited for broadcast or accessibility requirements.                                                         |
| `translation`                                 | `true` (if needed) | When enabled, subtitles are automatically generated for each target language. See [Translation](/chapters/audio-intelligence/translation). |

<Tip>
  For live captions streamed in real time, use the [Realtime
  API](/chapters/live-stt/quickstart) with partial transcripts instead — see the
  [Live recommended
  parameters](/chapters/live-stt/recommended-parameters#subtitles--captioning)
  guide.
</Tip>
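
For pre-recorded content, the table above could be turned into a request body along these lines. This is a sketch under assumptions: `subtitles_request_body` is a hypothetical helper, and passing target languages as `translation_config.target_languages` should be checked against the Translation reference.

```python
import json
from typing import Dict, Sequence


def subtitles_request_body(
    audio_url: str, target_languages: Sequence[str] = ()
) -> Dict:
    """Assemble the subtitle recommendations into one request body."""
    body: Dict = {
        "audio_url": audio_url,
        "subtitles": True,
        "subtitles_config": {
            "formats": ["srt", "vtt"],
            "maximum_characters_per_row": 42,
            "maximum_rows_per_caption": 2,
            "style": "compliance",
        },
    }
    if target_languages:
        body["translation"] = True
        # Assumption: target languages live under `translation_config`.
        body["translation_config"] = {"target_languages": list(target_languages)}
    return body


print(json.dumps(subtitles_request_body("YOUR_AUDIO_URL", ["fr", "es"]), indent=2))
```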

***

## Multilingual Content

For content with mixed languages — conferences, multilingual media, interviews with speakers from different countries — combine language detection with code switching.

| Parameter                        | Recommended value                                      | Why                                                                                                 |
| -------------------------------- | ------------------------------------------------------ | --------------------------------------------------------------------------------------------------- |
| `language_config.languages`      | List of expected languages (e.g. `["en", "fr", "de"]`) | Constrain to 3-5 expected languages for best accuracy.                                              |
| `language_config.code_switching` | `true`                                                 | Detects language shifts across utterances. See [Code switching](/chapters/language/code-switching). |
| `custom_vocabulary`              | `true`                                                 | Add terms for each language with appropriate `language` tags on each entry.                         |

<Warning>
  Do not enable `code_switching` with an empty `languages` list. The detector
  would evaluate every utterance against 100+ languages, leading to frequent
  misdetections — especially between similar-sounding languages.
</Warning>
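
The sketch below combines these recommendations and refuses to build a code-switching request without a language list. `multilingual_request_body` and its `vocabulary_by_language` argument are hypothetical; only the field names (`language_config`, `custom_vocabulary_config`, and the per-entry `value` and `language`) come from this guide.

```python
import json
from typing import Dict, Mapping, Optional, Sequence


def multilingual_request_body(
    audio_url: str,
    languages: Sequence[str],
    vocabulary_by_language: Optional[Mapping[str, Sequence[str]]] = None,
) -> Dict:
    """Assemble the multilingual recommendations into one request body."""
    languages = list(languages)
    if not languages:
        raise ValueError("list the languages you expect before enabling code switching")
    body: Dict = {
        "audio_url": audio_url,
        "language_config": {
            "languages": languages,
            # With a single language the flag would be ignored, so leave it off.
            "code_switching": len(languages) > 1,
        },
    }
    if vocabulary_by_language:
        # Tag every term with the language it is pronounced in.
        entries = [
            {"value": term, "language": language}
            for language, terms in vocabulary_by_language.items()
            for term in terms
        ]
        if entries:
            body["custom_vocabulary"] = True
            body["custom_vocabulary_config"] = {"vocabulary": entries}
    return body


print(json.dumps(
    multilingual_request_body(
        "YOUR_AUDIO_URL",
        ["en", "fr", "de"],
        {"fr": ["Bercy"], "de": ["Bundestag"]},
    ),
    indent=2,
))
```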
