Pre-recorded Audio-to-LLM runs once the transcription is generated. You provide one or more prompts; each prompt is executed against the transcript text from the same job using the configured model, yielding one LLM response per prompt. Use it to extract action items, answer questions about the recording, or run any text analysis you express in natural language. Unlike the built-in Summarization feature — which produces a fixed-format summary — Audio-to-LLM lets you write your own instructions: ask for a summary in the exact format, tone, and level of detail your product needs, or combine a summary with other analyses (action items, compliance checks) in a single request.

Usage

  1. Include audio_to_llm: true and an audio_to_llm_config object (at minimum, a prompts array) in your pre-recorded transcription request.
  2. Gladia transcribes the audio and runs any other audio-intelligence options you enabled on that request.
  3. Each prompt is run on the resulting transcript via the LLM.
  4. The API returns one result object per prompt (same order as prompts), each containing the original prompt and the model response.
Audio-to-LLM sends plain transcript text to the model. Raw audio and other fields from the transcription response are not added to the LLM prompt context.
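The request body from step 1 can be assembled with a small helper; this is a sketch, and the function name `build_audio_to_llm_request` is hypothetical (only the `audio_to_llm`, `audio_to_llm_config`, `prompts`, and `model` keys come from this page):

```python
def build_audio_to_llm_request(audio_url, prompts, model=None):
    """Build the JSON body for a pre-recorded request with Audio-to-LLM enabled.

    `audio_url` as a request field is an assumption for illustration; the
    documented keys are audio_to_llm, audio_to_llm_config, prompts, model.
    """
    config = {"prompts": list(prompts)}
    if model is not None:
        # Optional: override the default model (see "Model selection").
        config["model"] = model
    return {
        "audio_url": audio_url,
        "audio_to_llm": True,
        "audio_to_llm_config": config,
    }
```

Passing no `model` leaves the key out entirely, so the API falls back to its default.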

Model selection

By default, prompts are executed with GPT-5.4 Nano (openai/gpt-5.4-nano), a fast model suited to high-volume summaries and extraction. When you need stronger reasoning, richer analysis, longer outputs, or the behavior of a specific model, set the model key to any model listed on OpenRouter. Prices reflect the public OpenRouter rate plus a platform fee added by Gladia.

Example

A single prompt is enough to get started (you can omit model to use the default):
Pre-recorded
{
  "audio_to_llm": true,
  "audio_to_llm_config": {
    "prompts": [
      "Summarize the transcript in three bullet points."
    ]
  }
}
Example response shape for one prompt:
Pre-recorded
{
  "success": true,
  "is_empty": false,
  "results": [
    {
      "success": true,
      "is_empty": false,
      "results": {
        "prompt": "Summarize the transcript in three bullet points.",
        "response": "- Intro and context\n- Main discussion\n- Conclusion and next steps"
      },
      "exec_time": 1.4122809978485107,
      "error": null
    }
  ],
  "exec_time": 4.521103805541992,
  "error": null
}

Example: post-meeting workflow

For a post-meeting pass, you might ask for bullet takeaways, a short summary, and follow-up actions for the next meeting:
Pre-recorded
{
  "audio_to_llm": true,
  "audio_to_llm_config": {
    "model": "openai/gpt-5.4",
    "prompts": [
      "Summarize the meeting as bullet points: main topics, decisions, and open questions.",
      "Give a concise paragraph summarizing what this meeting was about and the outcome.",
      "List action items and follow-ups to prepare for the next meeting; include owners if they were mentioned."
    ]
  }
}
With this configuration, your output might look like this:
Pre-recorded
{
  "success": true,
  "is_empty": false,
  "results": [
    {
      "success": true,
      "is_empty": false,
      "results": {
        "prompt": "Summarize the meeting as bullet points: main topics, decisions, and open questions.",
        "response": "- **Roadmap Q2**: Team aligned on shipping the billing integration first.\n- **Decision**: Weekly sync moved to Tuesday.\n- **Open question**: Whether to support SSO in v1 is still TBD."
      },
      "exec_time": 1.7726809978485107,
      "error": null
    },
    {
      "success": true,
      "is_empty": false,
      "results": {
        "prompt": "Give a concise paragraph summarizing what this meeting was about and the outcome.",
        "response": "The group reviewed Q2 priorities, agreed to prioritize billing, and rescheduled the standing meeting. SSO scope was left for a follow-up once design signs off."
      },
      "exec_time": 1.5122809978485107,
      "error": null
    },
    {
      "success": true,
      "is_empty": false,
      "results": {
        "prompt": "List action items and follow-ups to prepare for the next meeting; include owners if they were mentioned.",
        "response": "- **Alex**: Finalize SSO requirements doc by Friday.\n- **Jamie**: Share billing API cutover checklist with the team.\n- **Everyone**: Review the updated roadmap draft before next sync."
      },
      "exec_time": 1.8932809978258485,
      "error": null
    }
  ],
  "exec_time": 6.267103805541992,
  "error": null
}

Response shape

  • Top-level results is an array with one entry per prompt, in the same order as audio_to_llm_config.prompts.
  • Each entry includes success, an error field (null on success), an exec_time in seconds, and nested results.prompt / results.response with the LLM output for that prompt.
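Because the documented shape guarantees that results preserve prompt order, pairing prompts with their responses is a simple zip; the helper name `pair_prompts_with_responses` is hypothetical:

```python
def pair_prompts_with_responses(prompts, audio_to_llm_result):
    """Pair each submitted prompt with its LLM response.

    Relies on the documented guarantee that the top-level results array has
    one entry per prompt, in the same order as audio_to_llm_config.prompts.
    Failed prompts (success == False) map to None.
    """
    pairs = []
    for prompt, entry in zip(prompts, audio_to_llm_result["results"]):
        response = entry["results"]["response"] if entry["success"] else None
        pairs.append((prompt, response))
    return pairs
```

Run against the single-prompt example response above, this yields one `(prompt, response)` tuple.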

Pricing

The input provided to the LLM is the full transcription. All prices are per 1M tokens and include platform fees.
Model                            model config                    Context Window   Input   Output
OpenAI: GPT-5.4 Nano             openai/gpt-5.4-nano             400k             $0.26   $1.76
OpenAI: GPT-5.4                  openai/gpt-5.4                  1.1M             $3.25   $19.50
Anthropic: Claude Opus 4.7       anthropic/claude-opus-4.7       1M               $6.50   $32.50
Google: Gemini 3.1 Pro Preview   google/gemini-3.1-pro-preview   1M               $2.60   $15.60
xAI: Grok 4.20                   x-ai/grok-4.20                  2M               $2.60   $7.80
Meta: Llama 4 Maverick           meta-llama/llama-4-maverick     1M               $0.20   $0.78
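Since all prices are per 1M tokens, a per-request cost estimate is a straightforward calculation. This sketch hard-codes the table above; the function name `estimate_cost` is hypothetical, and actual token counts depend on each model's tokenizer:

```python
# (input price, output price) in USD per 1M tokens, keyed by the model config
# string, taken from the pricing table above (platform fees included).
PRICES = {
    "openai/gpt-5.4-nano": (0.26, 1.76),
    "openai/gpt-5.4": (3.25, 19.50),
    "anthropic/claude-opus-4.7": (6.50, 32.50),
    "google/gemini-3.1-pro-preview": (2.60, 15.60),
    "x-ai/grok-4.20": (2.60, 7.80),
    "meta-llama/llama-4-maverick": (0.20, 0.78),
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one Audio-to-LLM prompt run.

    input_tokens covers the full transcript plus your prompt;
    output_tokens is the model's response length.
    """
    input_price, output_price = PRICES[model]
    return (input_tokens / 1_000_000) * input_price \
         + (output_tokens / 1_000_000) * output_price
```

For example, a 500k-token transcript summarized by the default GPT-5.4 Nano costs a fraction of the same job on Claude Opus 4.7, which is why the default favors high-volume workloads.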