# Initiate a transcription

> Initiate a pre-recorded transcription job. Use the returned `id` and the [GET /v2/pre-recorded/:id](/api-reference/v2/pre-recorded/get) endpoint to obtain the results.
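
The two-step flow described above (initiate a job, then fetch its result by `id`) could be sketched as follows. This is a minimal sketch using only the Python standard library, not an official client: the helper names `build_init_request`, `build_result_request`, and `send` are hypothetical, while the base URL, paths, and the `x-gladia-key` header come from the OpenAPI definition on this page.

```python
import json
import urllib.request

API_BASE = "https://api.gladia.io"  # server URL from the OpenAPI definition


def build_init_request(api_key: str, audio_url: str, **options) -> urllib.request.Request:
    """Build (but do not send) the POST /v2/pre-recorded request."""
    # audio_url is the only required field; options are extra body fields
    body = {"audio_url": audio_url, **options}
    return urllib.request.Request(
        f"{API_BASE}/v2/pre-recorded",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-gladia-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def build_result_request(api_key: str, job_id: str) -> urllib.request.Request:
    """Build the GET /v2/pre-recorded/:id request used to obtain the results."""
    return urllib.request.Request(
        f"{API_BASE}/v2/pre-recorded/{job_id}",
        headers={"x-gladia-key": api_key},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (HTTPError is raised on 4xx/5xx)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A caller would pass the result of `build_init_request(...)` to `send`, read `id` from the `201` response, and then call `send(build_result_request(api_key, job_id))` until the results are available. Keeping request construction separate from sending makes the request shape easy to inspect without network access.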

<Tip>
  Pass `model` to choose the transcription model:

  **`"solaria-3"`** — our latest model: highest accuracy on European real-world audio.

  * **Async (pre-recorded) only** — not available for live transcription.
  * **Languages:** English, French, German, Spanish, Italian
  * **Single language only** — pass exactly one language in `language_config.languages` (no code switching).
  * All Audio Intelligence add-ons available.

  **`"solaria-1"`** — our generalist model: maximum language coverage across any domain.

  * Available for async and live.
  * Code switching and multi-language configuration (100+ languages covered)
  * All Audio Intelligence add-ons available.

  If omitted, the API uses the default model (`solaria-1`).
</Tip>
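
The model constraints in the tip above can be checked on the client before a request is sent. The sketch below is illustrative only: `build_payload` and `SOLARIA_3_LANGUAGES` are hypothetical names, the ISO 639-1 codes are an assumed mapping of the five listed languages, and the API's own validation may behave differently.

```python
# Assumed ISO 639-1 codes for English, French, German, Spanish, Italian.
SOLARIA_3_LANGUAGES = {"en", "fr", "de", "es", "it"}


def build_payload(audio_url, model=None, languages=(), code_switching=False):
    """Return a request body, rejecting combinations the tip rules out for solaria-3."""
    languages = list(languages)
    if model == "solaria-3":
        if len(languages) != 1:
            raise ValueError("solaria-3 needs exactly one language")
        if languages[0] not in SOLARIA_3_LANGUAGES:
            raise ValueError(f"solaria-3 does not list language {languages[0]!r}")
        if code_switching:
            raise ValueError("solaria-3 does not support code switching")

    payload = {"audio_url": audio_url}
    if model is not None:
        # When model is omitted, the API falls back to its default model.
        payload["model"] = model
    if languages or code_switching:
        payload["language_config"] = {
            "languages": languages,
            "code_switching": code_switching,
        }
    return payload
```

For example, `build_payload(url, model="solaria-3", languages=["fr"])` yields a body with a single-language `language_config`, while passing two languages with `solaria-3` raises `ValueError` before any request is made.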


## OpenAPI

````yaml POST /v2/pre-recorded
openapi: 3.1.0
info:
  title: Gladia Control API
  description: ''
  version: '1.0'
  contact: {}
servers:
  - url: https://api.gladia.io/
    description: Gladia API production URL
security: []
tags: []
paths:
  /v2/pre-recorded:
    post:
      tags:
        - Pre-recorded V2
      summary: Initiate a new pre-recorded job
      operationId: PreRecordedController_initPreRecordedJob_v2
      parameters: []
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/InitTranscriptionRequest'
      responses:
        '201':
          description: The pre-recorded job has been initiated
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/InitPreRecordedTranscriptionResponse'
        '400':
          description: Something is wrong with the request
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/BadRequestErrorResponse'
        '401':
          description: You don't have permission to initiate a new pre-recorded job
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/UnauthorizedErrorResponse'
        '422':
          description: The parameters you gave are incorrect
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/UnprocessableEntityErrorResponse'
      security:
        - x_gladia_key: []
components:
  schemas:
    InitTranscriptionRequest:
      type: object
      properties:
        custom_vocabulary:
          type: boolean
          description: >-
            **[Beta]** Enable custom vocabulary for this audio. Provide the
            vocabulary list to feed the transcription model with in
            `custom_vocabulary_config`
          default: false
        custom_vocabulary_config:
          description: >-
            **[Beta]** Custom vocabulary configuration, if `custom_vocabulary`
            is enabled
          allOf:
            - $ref: '#/components/schemas/CustomVocabularyConfigDTO'
        callback_url:
          type: string
          description: >-
            **[Deprecated]** Use `callback`/`callback_config` instead. URL that
            receives a `POST` request with the result of the transcription
          example: https://callback.example
          format: uri
          deprecated: true
        callback:
          type: boolean
          description: >-
            Enable callback for this transcription. If true, the
            `callback_config` property will be used to customize the callback
            behaviour
          default: false
        callback_config:
          description: Customize the callback behaviour (URL and HTTP method)
          allOf:
            - $ref: '#/components/schemas/CallbackConfigDto'
        subtitles:
          type: boolean
          description: Enable subtitles generation for this transcription
          default: false
        subtitles_config:
          description: Configuration for subtitles generation if `subtitles` is enabled
          allOf:
            - $ref: '#/components/schemas/SubtitlesConfigDTO'
        diarization:
          type: boolean
          description: Enable speaker recognition (diarization) for this audio
          default: false
        diarization_config:
          description: Speaker recognition configuration, if `diarization` is enabled
          allOf:
            - $ref: '#/components/schemas/DiarizationConfigDTO'
        translation:
          type: boolean
          description: '**[Beta]** Enable translation for this audio'
          default: false
        translation_config:
          description: '**[Beta]** Translation configuration, if `translation` is enabled'
          allOf:
            - $ref: '#/components/schemas/TranslationConfigDTO'
        summarization:
          type: boolean
          description: Enable summarization for this audio
          default: false
        summarization_config:
          description: Summarization configuration, if `summarization` is enabled
          allOf:
            - $ref: '#/components/schemas/SummarizationConfigDTO'
        named_entity_recognition:
          type: boolean
          description: '**[Alpha]** Enable named entity recognition for this audio'
          default: false
        custom_spelling:
          type: boolean
          description: '**[Alpha]** Enable custom spelling for this audio'
          default: false
        custom_spelling_config:
          description: >-
            **[Alpha]** Custom spelling configuration, if `custom_spelling` is
            enabled
          allOf:
            - $ref: '#/components/schemas/CustomSpellingConfigDTO'
        sentiment_analysis:
          type: boolean
          description: Enable sentiment analysis for this audio
          default: false
        audio_to_llm:
          type: boolean
          description: Enable audio to LLM processing for this audio
          default: false
        audio_to_llm_config:
          description: Audio to LLM configuration, if `audio_to_llm` is enabled
          allOf:
            - $ref: '#/components/schemas/AudioToLlmListConfigDTO'
        pii_redaction:
          type: boolean
          description: Enable PII redaction for this audio
          default: false
        pii_redaction_config:
          description: PII redaction configuration, if `pii_redaction` is enabled
          allOf:
            - $ref: '#/components/schemas/PiiRedactionConfigDTO'
        custom_metadata:
          type: object
          description: Custom metadata you can attach to this transcription
          example:
            user: John Doe
          additionalProperties: true
        sentences:
          type: boolean
          description: Enable sentences for this audio
          default: false
        punctuation_enhanced:
          type: boolean
          description: '**[Alpha]** Use enhanced punctuation for this audio'
          default: false
        language_config:
          description: Specify the language configuration
          allOf:
            - $ref: '#/components/schemas/LanguageConfig'
        audio_url:
          type: string
          description: URL to a Gladia file or to an external audio or video file
          example: >-
            https://files.gladia.io/example/audio-transcription/split_infinity.wav
          format: uri
      required:
        - audio_url
    InitPreRecordedTranscriptionResponse:
      type: object
      properties:
        id:
          type: string
          description: ID of the job
          format: uuid
          example: 45463597-20b7-4af7-b3b3-f5fb778203ab
        result_url:
          type: string
          description: Prebuilt URL with your transcription `id` to fetch the result
          example: >-
            https://api.gladia.io/v2/transcription/45463597-20b7-4af7-b3b3-f5fb778203ab
          format: uri
      required:
        - id
        - result_url
    BadRequestErrorResponse:
      type: object
      properties:
        timestamp:
          type: string
          description: Date and time when the error occurred
          example: '2023-12-28T09:04:17.210Z'
        path:
          type: string
          description: Path to the API endpoint
          example: /v2/transcription/45463597-20b7-4af7-b3b3-f5fb778203ab
        request_id:
          type: string
          description: Debug id
          example: G-821fe9df
        statusCode:
          type: number
          description: HTTP status code of the error
          example: 400
        message:
          type: string
          description: Error message
          example: Content-Type is missing Multipart Boundary.
        validation_errors:
          description: List of validation errors, if any
          example:
            - Field "language" must be a string
            - Field "min_speakers" must be a number
          type: array
          items:
            type: string
      required:
        - timestamp
        - path
        - request_id
        - statusCode
        - message
    UnauthorizedErrorResponse:
      type: object
      properties:
        timestamp:
          type: string
          description: Date and time when the error occurred
          example: '2023-12-28T09:04:17.210Z'
        path:
          type: string
          description: Path to the API endpoint
          example: /v2/transcription/45463597-20b7-4af7-b3b3-f5fb778203ab
        request_id:
          type: string
          description: Debug id
          example: G-821fe9df
        statusCode:
          type: number
          description: HTTP status code of the error
          example: 401
        message:
          type: string
          description: Error message
          example: gladia key not found
      required:
        - timestamp
        - path
        - request_id
        - statusCode
        - message
    UnprocessableEntityErrorResponse:
      type: object
      properties:
        timestamp:
          type: string
          description: Date and time when the error occurred
          example: '2023-12-28T09:04:17.210Z'
        path:
          type: string
          description: Path to the API endpoint
          example: /v2/transcription/45463597-20b7-4af7-b3b3-f5fb778203ab
        request_id:
          type: string
          description: Debug id
          example: G-821fe9df
        statusCode:
          type: number
          description: HTTP status code of the error
          example: 422
        message:
          type: string
          description: Error message
          example: Invalid parameter
      required:
        - timestamp
        - path
        - request_id
        - statusCode
        - message
    CustomVocabularyConfigDTO:
      type: object
      properties:
        vocabulary:
          type: array
          description: >-
            Specific vocabulary list to feed the transcription model with. Each
            item can be a string or an object with the following properties:
            value, intensity, pronunciations, language.
          example:
            - Westeros
            - value: Stark
            - value: Night's Watch
              pronunciations:
                - Nightz Watch
              intensity: 0.4
              language: en
          items:
            oneOf:
              - $ref: '#/components/schemas/CustomVocabularyEntryDTO'
              - type: string
        default_intensity:
          type: number
          description: Default intensity for the custom vocabulary
          example: 0.5
          minimum: 0
          maximum: 1
      required:
        - vocabulary
    CallbackConfigDto:
      type: object
      properties:
        url:
          type: string
          description: The URL to be called with the result of the transcription
          example: https://callback.example
          format: uri
        method:
          description: >-
            The HTTP method to be used. Allowed values are `POST` or `PUT`
            (default: `POST`)
          example: POST
          default: POST
          allOf:
            - $ref: '#/components/schemas/CallbackMethodEnum'
      required:
        - url
    SubtitlesConfigDTO:
      type: object
      properties:
        formats:
          type: array
          description: Subtitle formats to generate for this transcription
          default:
            - srt
          minItems: 1
          example:
            - srt
          items:
            $ref: '#/components/schemas/SubtitlesFormatEnum'
        minimum_duration:
          type: number
          description: Minimum duration of a subtitle in seconds
          minimum: 0
        maximum_duration:
          type: number
          description: Maximum duration of a subtitle in seconds
          minimum: 1
          maximum: 30
        maximum_characters_per_row:
          type: integer
          description: Maximum number of characters per row in a subtitle
          minimum: 1
        maximum_rows_per_caption:
          type: integer
          description: Maximum number of rows per caption
          minimum: 1
          maximum: 5
        style:
          description: >-
            Style of the subtitles. Compliance mode refers to:
            https://loc.gov/preservation/digital/formats//fdd/fdd000569.shtml#:~:text=SRT%20files%20are%20basic%20text,alongside%2C%20example%3A%20%22MyVideo123
          default: default
          allOf:
            - $ref: '#/components/schemas/SubtitlesStyleEnum'
    DiarizationConfigDTO:
      type: object
      properties:
        number_of_speakers:
          type: integer
          description: Exact number of speakers in the audio
          example: 3
          minimum: 1
        min_speakers:
          type: integer
          description: Minimum number of speakers in the audio
          example: 1
          minimum: 0
        max_speakers:
          type: integer
          description: Maximum number of speakers in the audio
          example: 2
          minimum: 0
    TranslationConfigDTO:
      type: object
      properties:
        target_languages:
          type: array
          description: >-
            Target languages, in `iso639-1` format, that you want the
            transcription translated to
          example:
            - en
          minItems: 1
          items:
            $ref: '#/components/schemas/TranslationLanguageCodeEnum'
        model:
          description: Translation model to use
          default: base
          allOf:
            - $ref: '#/components/schemas/TranslationModelEnum'
        match_original_utterances:
          type: boolean
          description: Align translated utterances with the original ones
          default: true
        lipsync:
          type: boolean
          description: 'Whether to apply lipsync to the translated transcription.'
          default: true
        context_adaptation:
          type: boolean
          description: >-
            Enables or disables context-aware translation features that allow
            the model to adapt translations based on provided context.
          default: true
        context:
          type: string
          description: Context information to improve translation accuracy
        informal:
          type: boolean
          description: >-
            Forces the translation to use informal language forms when available
            in the target language.
          default: false
      required:
        - target_languages
    SummarizationConfigDTO:
      type: object
      properties:
        type:
          description: The type of summarization to apply
          default: general
          allOf:
            - $ref: '#/components/schemas/SummaryTypesEnum'
    CustomSpellingConfigDTO:
      type: object
      properties:
        spelling_dictionary:
          type: object
          description: >-
            Spelling dictionary applied to the audio transcription. Each key is
            the spelling to output and its value lists the variants to replace
          example:
            Gettleman:
              - gettleman
            SQL:
              - Sequel
          additionalProperties:
            type: array
            items:
              type: string
      required:
        - spelling_dictionary
    AudioToLlmListConfigDTO:
      type: object
      properties:
        prompts:
          description: The list of prompts to apply to the audio transcription
          example:
            - Extract the key points from the transcription
          minItems: 1
          type: array
          items:
            type: string
        model:
          type: string
          description: >-
            The model to use for the prompt execution. You can find the list of
            supported models [here](https://openrouter.ai/models).
          default: openai/gpt-5.4-nano
      required:
        - prompts
    PiiRedactionConfigDTO:
      type: object
      properties:
        entity_types:
          type: array
          description: The entity types to redact
          example:
            - GDPR
            - HEALTH_INFORMATION
            - HIPAA_SAFE_HARBOR
            - QUEBEC_PRIVACY_ACT
            - EMAIL_ADDRESS
            - NAME
            - PHONE_NUMBER
          items:
            $ref: '#/components/schemas/PiiRedactionEntityTypeEnum'
        processed_text_type:
          type: string
          description: The type of processed text to return (marker or mask)
          enum:
            - MARKER
            - MASK
          example: MARKER
    LanguageConfig:
      type: object
      properties:
        languages:
          type: array
          description: >-
            If one language is set, it will be used for the transcription.
            Otherwise, language will be auto-detected by the model.
          default: []
          items:
            $ref: '#/components/schemas/TranscriptionLanguageCodeEnum'
        code_switching:
          type: boolean
          description: >-
            If true, language will be auto-detected on each utterance.
            Otherwise, language will be auto-detected on the first utterance and
            then used for the rest of the transcription. If one language is set,
            this option will be ignored.
          default: false
    CustomVocabularyEntryDTO:
      type: object
      properties:
        value:
          type: string
          description: The text used as the replacement in the transcription.
          example: Gladia
        intensity:
          type: number
          description: The global intensity of the feature.
          example: 0.5
          minimum: 0
          maximum: 1
        pronunciations:
          description: The pronunciations used in the transcription.
          type: array
          items:
            type: string
        language:
          description: >-
            Language in which the entry is pronounced when sound comparison
            occurs. Defaults to the transcription language.
          example: en
          allOf:
            - $ref: '#/components/schemas/TranscriptionLanguageCodeEnum'
      required:
        - value
    CallbackMethodEnum:
      type: string
      enum:
        - POST
        - PUT
      description: >-
        The HTTP method to be used. Allowed values are `POST` or `PUT` (default:
        `POST`)
    SubtitlesFormatEnum:
      type: string
      enum:
        - srt
        - vtt
      description: Subtitle formats to generate for this transcription
    SubtitlesStyleEnum:
      type: string
      enum:
        - default
        - compliance
      description: >-
        Style of the subtitles. Compliance mode refers to:
        https://loc.gov/preservation/digital/formats//fdd/fdd000569.shtml#:~:text=SRT%20files%20are%20basic%20text,alongside%2C%20example%3A%20%22MyVideo123
    TranslationLanguageCodeEnum:
      type: string
      enum:
        - af
        - am
        - ar
        - as
        - az
        - ba
        - be
        - bg
        - bn
        - bo
        - br
        - bs
        - ca
        - cs
        - cy
        - da
        - de
        - el
        - en
        - es
        - et
        - eu
        - fa
        - fi
        - fo
        - fr
        - gl
        - gu
        - ha
        - haw
        - he
        - hi
        - hr
        - ht
        - hu
        - hy
        - id
        - is
        - it
        - ja
        - jw
        - ka
        - kk
        - km
        - kn
        - ko
        - la
        - lb
        - ln
        - lo
        - lt
        - lv
        - mg
        - mi
        - mk
        - ml
        - mn
        - mr
        - ms
        - mt
        - my
        - ne
        - nl
        - nn
        - 'no'
        - oc
        - pa
        - pl
        - ps
        - pt
        - ro
        - ru
        - sa
        - sd
        - si
        - sk
        - sl
        - sn
        - so
        - sq
        - sr
        - su
        - sv
        - sw
        - ta
        - te
        - tg
        - th
        - tk
        - tl
        - tr
        - tt
        - uk
        - ur
        - uz
        - vi
        - wo
        - yi
        - yo
        - zh
      description: >-
        Target languages, in `iso639-1` format, that you want the
        transcription translated to
    TranslationModelEnum:
      type: string
      enum:
        - base
        - enhanced
      description: Translation model to use
    SummaryTypesEnum:
      type: string
      enum:
        - general
        - bullet_points
        - concise
      description: The type of summarization to apply
    PiiRedactionEntityTypeEnum:
      type: string
      enum:
        - APPI
        - APPI_SENSITIVE
        - CCI
        - CORE_ENTITIES
        - CPRA
        - GDPR
        - GDPR_SENSITIVE
        - HEALTH_INFORMATION
        - HIPAA_SAFE_HARBOR
        - LIDI
        - NUMERICAL_EXCL_PCI
        - PCI
        - QUEBEC_PRIVACY_ACT
        - ACCOUNT_NUMBER
        - AGE
        - DATE
        - DATE_INTERVAL
        - DOB
        - DRIVER_LICENSE
        - DURATION
        - EMAIL_ADDRESS
        - EVENT
        - FILENAME
        - GENDER
        - HEALTHCARE_NUMBER
        - IP_ADDRESS
        - LANGUAGE
        - LOCATION
        - LOCATION_ADDRESS
        - LOCATION_ADDRESS_STREET
        - LOCATION_CITY
        - LOCATION_COORDINATE
        - LOCATION_COUNTRY
        - LOCATION_STATE
        - LOCATION_ZIP
        - MARITAL_STATUS
        - MONEY
        - NAME
        - NAME_FAMILY
        - NAME_GIVEN
        - NAME_MEDICAL_PROFESSIONAL
        - NUMERICAL_PII
        - OCCUPATION
        - ORGANIZATION
        - ORGANIZATION_MEDICAL_FACILITY
        - ORIGIN
        - PASSPORT_NUMBER
        - PASSWORD
        - PHONE_NUMBER
        - PHYSICAL_ATTRIBUTE
        - POLITICAL_AFFILIATION
        - RELIGION
        - SEXUALITY
        - SSN
        - TIME
        - URL
        - USERNAME
        - VEHICLE_ID
        - ZODIAC_SIGN
        - BLOOD_TYPE
        - CONDITION
        - DOSE
        - DRUG
        - INJURY
        - MEDICAL_PROCESS
        - STATISTICS
        - BANK_ACCOUNT
        - CREDIT_CARD
        - CREDIT_CARD_EXPIRATION
        - CVV
        - ROUTING_NUMBER
        - CORPORATE_ACTION
        - DAY
        - EFFECT
        - FINANCIAL_METRIC
        - MEDICAL_CODE
        - MONTH
        - ORGANIZATION_ID
        - PRODUCT
        - PROJECT
        - TREND
        - YEAR
      description: The entity types to redact
    TranscriptionLanguageCodeEnum:
      type: string
      enum:
        - af
        - am
        - ar
        - as
        - az
        - ba
        - be
        - bg
        - bn
        - bo
        - br
        - bs
        - ca
        - cs
        - cy
        - da
        - de
        - el
        - en
        - es
        - et
        - eu
        - fa
        - fi
        - fo
        - fr
        - gl
        - gu
        - ha
        - haw
        - he
        - hi
        - hr
        - ht
        - hu
        - hy
        - id
        - is
        - it
        - ja
        - jw
        - ka
        - kk
        - km
        - kn
        - ko
        - la
        - lb
        - ln
        - lo
        - lt
        - lv
        - mg
        - mi
        - mk
        - ml
        - mn
        - mr
        - ms
        - mt
        - my
        - ne
        - nl
        - nn
        - 'no'
        - oc
        - pa
        - pl
        - ps
        - pt
        - ro
        - ru
        - sa
        - sd
        - si
        - sk
        - sl
        - sn
        - so
        - sq
        - sr
        - su
        - sv
        - sw
        - ta
        - te
        - tg
        - th
        - tk
        - tl
        - tr
        - tt
        - uk
        - ur
        - uz
        - vi
        - yi
        - yo
        - zh
      description: >-
        Language in which the entry is pronounced when sound comparison
        occurs. Defaults to the transcription language.
  securitySchemes:
    x_gladia_key:
      type: apiKey
      in: header
      name: x-gladia-key
      description: Your personal Gladia API key

````
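
The `400`, `401`, and `422` responses in the definition above share the same required fields (`timestamp`, `path`, `request_id`, `statusCode`, `message`), and the `400` schema adds an optional `validation_errors` list. A minimal sketch of turning such a body into a readable message might look like this; `describe_error` is a hypothetical helper, not part of any Gladia SDK.

```python
import json


def describe_error(raw_body: str) -> str:
    """Summarize an error body using the fields the error schemas mark as required."""
    error = json.loads(raw_body)
    summary = (
        f"[{error['statusCode']}] {error['message']} "
        f"(request_id={error['request_id']}, path={error['path']})"
    )
    # validation_errors is optional and only declared on the 400 schema.
    for detail in error.get("validation_errors", []):
        summary += f"\n  - {detail}"
    return summary
```

With `urllib`, the raw body of a failed call can be read from the raised `urllib.error.HTTPError` (for example `exc.read().decode("utf-8")`) and passed to `describe_error`. Logging the `request_id` is useful because the schema describes it as a debug id.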