Pre-recorded Audio-to-LLM runs once the transcription is generated. You provide one or more prompts; each prompt is executed against the transcript text from the same job using the configured model, yielding one LLM response per prompt. Use it to extract action items, answer questions about the recording, or run any text analysis you can express in natural language. Unlike the built-in Summarization feature, which produces a fixed-format summary, Audio-to-LLM lets you write your own instructions: ask for a summary in the exact format, tone, and level of detail your product needs, or combine a summary with other analyses (action items, compliance checks) in a single request.
## Usage
- Include `audio_to_llm: true` and an `audio_to_llm_config` object (at minimum, a `prompts` array) in your pre-recorded transcription request.
- Gladia transcribes the audio, along with any other audio-intelligence options you enabled on that request.
- Each prompt is run on the resulting transcript via the LLM.
- The API returns one result object per prompt (same order as `prompts`), each containing the original `prompt` and the model `response`.
Audio-to-LLM sends plain transcript text to the model. Raw audio and other fields from the transcription response are not added to the LLM prompt context.
## Model selection
By default, prompts are executed with GPT-5.4 Nano (`openai/gpt-5.4-nano`), a fast option suited to high-volume summaries and extraction. Customize the model when you need stronger reasoning, richer analysis, longer outputs, or behavior specific to a particular model.
You can use any model listed on OpenRouter by setting the `model` key. Prices reflect the public OpenRouter rate plus a platform fee added by Gladia.
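As a minimal sketch, assuming the `audio_to_llm_config` shape from the Usage section, overriding the model is one extra key; the slug below is the Claude Opus row from the pricing table on this page:

```python
# An audio_to_llm_config that overrides the default model.
# The "model" value must be a valid OpenRouter model slug.
audio_to_llm_config = {
    "prompts": ["Summarize the call in three sentences."],
    "model": "anthropic/claude-opus-4.7",  # omit to fall back to openai/gpt-5.4-nano
}
```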
## Example
A single prompt is enough to get started (you can omit `model` to use the default):
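A hedged sketch of the request body in Python: the `audio_url` value is a placeholder, and only the `audio_to_llm` fields are taken from this page; the actual HTTP call is left to your client and follows the same flow as any pre-recorded transcription request.

```python
import json

# Request body for a pre-recorded job with a single Audio-to-LLM prompt.
# "model" is omitted from audio_to_llm_config, so the default
# (openai/gpt-5.4-nano) applies.
body = {
    "audio_url": "https://example.com/call.mp3",  # placeholder audio file
    "audio_to_llm": True,
    "audio_to_llm_config": {
        "prompts": ["List every action item mentioned in this call."]
    },
}

# POST this body to the pre-recorded endpoint with your API key header,
# then fetch the job result as for any other transcription request.
print(json.dumps(body, indent=2))
```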
## Example: post-meeting workflow
For a post-meeting pass, you might ask for bullet takeaways, a short summary, and follow-up actions for the next meeting:
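That workflow might look like the following config sketch; the prompt wording is illustrative. Each prompt runs independently against the same transcript, and the API returns three result objects in this exact order.

```python
# Three independent prompts over one transcript: takeaways, summary,
# and follow-up actions, answered in the same order they are listed.
audio_to_llm_config = {
    "prompts": [
        "Give the key takeaways as short bullet points.",
        "Write a two-sentence summary of the meeting.",
        "List follow-up actions for the next meeting, with owners if mentioned.",
    ]
}
```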
## Response shape
- Top-level `results` is an array with one entry per prompt, in the same order as `audio_to_llm_config.prompts`.
- Each entry includes `success`, an optional `error`, timing fields, and nested `results.prompt` / `results.response` with the LLM output for that prompt.
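The shape can be illustrated with a small parsing sketch. The `response_json` fragment below is hypothetical, built from the field names described above rather than a captured API payload, and `exec_time` is an assumed stand-in for the timing fields.

```python
# Hypothetical response fragment shaped per the description above.
response_json = {
    "results": [
        {
            "success": True,
            "exec_time": 0.8,  # assumed stand-in for the timing fields
            "results": {
                "prompt": "List every action item mentioned in this call.",
                "response": "- Send the revised quote to the client by Friday",
            },
        }
    ]
}

# Pair each prompt with its LLM output, skipping failed entries.
answers = []
for entry in response_json["results"]:
    if entry["success"]:
        answers.append((entry["results"]["prompt"], entry["results"]["response"]))
    else:
        print("prompt failed:", entry.get("error"))
```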
## Pricing
The input provided to the LLM is the full transcription. All prices are per 1M tokens and include platform fees.

| Model | `model` config | Context Window | Input | Output |
|---|---|---|---|---|
| OpenAI: GPT-5.4 Nano | openai/gpt-5.4-nano | 400k | $0.26 | $1.76 |
| OpenAI: GPT-5.4 | openai/gpt-5.4 | 1.1M | $3.25 | $19.50 |
| Anthropic: Claude Opus 4.7 | anthropic/claude-opus-4.7 | 1M | $6.50 | $32.50 |
| Google: Gemini 3.1 Pro Preview | google/gemini-3.1-pro-preview | 1M | $2.60 | $15.60 |
| xAI: Grok 4.20 | x-ai/grok-4.20 | 2M | $2.60 | $7.80 |
| Meta: Llama 4 Maverick | meta-llama/llama-4-maverick | 1M | $0.20 | $0.78 |
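As a back-of-the-envelope check against the table, cost per prompt is input tokens times the input rate plus output tokens times the output rate, both per 1M tokens. The token counts below are illustrative assumptions; real counts depend on the model's tokenizer.

```python
# Rates from the GPT-5.4 Nano row of the table above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.26
OUTPUT_PRICE_PER_M = 1.76

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Rough USD cost of one prompt run, given token counts."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Assume a one-hour meeting transcript of roughly 10,000 tokens
# and a 500-token response.
cost = estimate_cost(input_tokens=10_000, output_tokens=500)
print(f"${cost:.4f}")  # → $0.0035
```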