This page outlines the known limitations of the Audio Transcription API and offers best practices for ensuring a seamless transcription experience.
- Audio Length: The maximum length of audio that can be transcribed in a single request is currently 135 minutes. Attempts to transcribe longer audio files may result in errors.
- File Size: Audio files must not exceed 500 MB in size. Larger files will not be accepted by the API.
- API Call Limits: To ensure the quality of service and fairness to all users, API call limits have been implemented. For the free tier, users can make a maximum of 20 calls per hour, with up to 3 concurrent requests. Users subscribed to the Pro tier can make up to 200 calls per minute and up to 15 concurrent requests.
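The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API or any official SDK: `check_limits`, `CallBudget`, and `TIER_LIMITS` are hypothetical names, and treating 1 MB as 10**6 bytes is an assumption, since the page does not say which definition the API uses.

```python
import threading
import time
from collections import deque

# Limits quoted from the list above.
# Assumption: "MB" means 10**6 bytes; the API may count differently.
MAX_DURATION_SECONDS = 135 * 60
MAX_FILE_BYTES = 500 * 10**6

# Per tier: (max calls, window length in seconds, max concurrent requests).
TIER_LIMITS = {
    "free": (20, 3600, 3),
    "pro": (200, 60, 15),
}


def check_limits(size_bytes, duration_seconds):
    """Return a list of limit violations (empty if the file is acceptable)."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 500 MB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio exceeds 135 minutes")
    return problems


class CallBudget:
    """Client-side sliding-window call counter plus a concurrency cap."""

    def __init__(self, tier, clock=time.monotonic):
        self.max_calls, self.window, max_concurrent = TIER_LIMITS[tier]
        self._clock = clock
        self._stamps = deque()  # start times of calls still inside the window
        self._lock = threading.Lock()
        # Hold one slot for the duration of each in-flight request.
        self.slots = threading.BoundedSemaphore(max_concurrent)

    def try_acquire(self):
        """Record a call and return True if the window still has room."""
        with self._lock:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.window:
                self._stamps.popleft()
            if len(self._stamps) >= self.max_calls:
                return False
            self._stamps.append(now)
            return True
```

A caller might create `budget = CallBudget("free")`, call `budget.try_acquire()` before each request, and wrap the request itself in `with budget.slots:` so that no more than three run at once. The server remains the source of truth for rate limiting; a local counter like this only reduces the number of rejected calls.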
For audio files that approach or exceed the length or size limits, split them into smaller chunks of roughly 60 minutes each. This keeps each request within the API constraints and generally yields better transcription results. The following tools can be used to split audio files:
- FFmpeg: FFmpeg is a versatile command-line tool for manipulating audio and video files. It is a popular choice for splitting long audio files.
- ffmpeg-python: For Python users, ffmpeg-python is a wrapper around FFmpeg that provides a more Pythonic interface to it.
- prism-media for Node.js: Node.js users can use prism-media for manipulating media files, including splitting audio files.
- fluent-ffmpeg for Node.js: Another option for Node.js users is fluent-ffmpeg, which offers a simpler and more fluent API for handling media files.
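As an illustration of the splitting approach, the sketch below calls the FFmpeg command-line tool from Python using its segment muxer (`-f segment -segment_time`). It assumes `ffmpeg` is installed and on the `PATH`; the function names and the output naming pattern are hypothetical choices for this example, not something the API requires.

```python
import subprocess
from pathlib import Path


def plan_chunks(duration_seconds, chunk_seconds=3600):
    """Return (start, length) pairs, in seconds, covering the whole recording."""
    chunks = []
    start = 0
    while start < duration_seconds:
        length = min(chunk_seconds, duration_seconds - start)
        chunks.append((start, length))
        start += chunk_seconds
    return chunks


def build_split_command(src, out_dir, chunk_seconds=3600):
    """Build an ffmpeg command that cuts src into fixed-length segments."""
    src = Path(src)
    # %03d is expanded by ffmpeg into 000, 001, 002, ...
    pattern = Path(out_dir) / f"{src.stem}_%03d{src.suffix}"
    return [
        "ffmpeg",
        "-i", str(src),
        "-f", "segment",                     # use the segment muxer
        "-segment_time", str(chunk_seconds), # target length of each segment
        "-c", "copy",                        # copy the stream, no re-encoding
        str(pattern),
    ]


def split_audio(src, out_dir, chunk_seconds=3600):
    """Run ffmpeg to write the segments into out_dir (created if missing)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_split_command(src, out_dir, chunk_seconds), check=True)
```

For example, `plan_chunks(135 * 60)` yields two 60-minute chunks followed by one 15-minute chunk, and `split_audio("talk.mp3", "chunks")` would write `chunks/talk_000.mp3`, `chunks/talk_001.mp3`, and so on. Because `-c copy` avoids re-encoding, FFmpeg cuts at packet boundaries, so segment lengths are approximate rather than exact.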
Following these best practices will help you avoid issues due to limitations and maximize the quality of the transcriptions you obtain from the Audio Transcription API.