Get Transcript

This API returns the transcript or subtitles of a video hosted on YouTube, TikTok, Instagram, or X (Twitter), or of a media file at a public URL.

Quick Start

Request

curl -X GET 'https://api.supadata.ai/v1/transcript?url=https://youtu.be/dQw4w9WgXcQ' \
  -H 'x-api-key: YOUR_API_KEY'

Response

{
  "content": "Never gonna give you up, never gonna let you down...",
  "lang": "en",
  "availableLangs": ["en", "es", "zh-TW"]
}

Specification

Endpoint

GET https://api.supadata.ai/v1/transcript

Each request requires an x-api-key header containing your API key, which is available after signing up. Find out more about Authentication.

Query Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | URL of the video to get the transcript from. Must be a YouTube, TikTok, Instagram, or X (Twitter) URL, or a public file URL. It is recommended to encode the URL before sending it as a query parameter. |
| lang | string | No | Preferred language code of the transcript (ISO 639-1). See Languages. |
| text | boolean | No | When true, returns a plain text transcript. Default: false. |
| chunkSize | number | No | Maximum characters per transcript chunk (only when text=false). |
| mode | string | No | Transcript mode: native (only fetch an existing transcript), generate (always generate the transcript using AI), or auto (try native, fall back to generate if unavailable). If url is a file URL, mode is always generate. Default: auto. |

To fetch only existing transcripts and avoid costs tied to AI generation, use mode=native.
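Since the page recommends encoding the url value, a request URL can be assembled with Python's standard library as in the sketch below. The helper name build_transcript_url is hypothetical and not part of any official SDK; only the endpoint and the parameter names come from this page.

```python
from urllib.parse import urlencode

API_BASE = "https://api.supadata.ai/v1/transcript"  # endpoint documented above


def build_transcript_url(video_url, lang=None, text=None, chunk_size=None, mode=None):
    """Return the request URL with every query parameter percent-encoded.

    Hypothetical helper: parameters left as None are omitted so the API
    applies its documented defaults (text=false, mode=auto).
    """
    params = {"url": video_url}
    if lang is not None:
        params["lang"] = lang
    if text is not None:
        params["text"] = "true" if text else "false"
    if chunk_size is not None:
        params["chunkSize"] = str(chunk_size)
    if mode is not None:
        params["mode"] = mode
    return f"{API_BASE}?{urlencode(params)}"


print(build_transcript_url("https://youtu.be/dQw4w9WgXcQ", text=True, mode="native"))
# https://api.supadata.ai/v1/transcript?url=https%3A%2F%2Fyoutu.be%2FdQw4w9WgXcQ&text=true&mode=native
```

Encoding matters most when the video URL carries its own query string, because an unencoded & would otherwise be read as a separator between this endpoint's parameters.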

Response Format

The API can return either a transcript result directly (HTTP 200) or a job ID for asynchronous processing (HTTP 202).

For large videos that require processing time, the API returns HTTP 202 with a job ID. Use the /transcript/{jobId} endpoint to poll for results.

Immediate transcript response (HTTP 200):

When text=true:

{
  "content": string,
  "lang": string, // ISO 639-1 language code
  "availableLangs": string[] // List of available languages
}

When text=false:

{
  "content": [
    {
      "text": string, // Transcript segment
      "offset": number, // Start time in milliseconds
      "duration": number, // Duration in milliseconds
      "lang": string // ISO 639-1 language code of chunk
    }
  ],
  "lang": string, // ISO 639-1 language code of transcript
  "availableLangs": string[] // List of available languages
}
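As an illustration of working with the chunked (text=false) shape, the sketch below turns each chunk's millisecond offset into a timestamp prefix. The function names and the sample chunks are invented for this example; only the field names (text, offset, duration, lang) come from the schema above.

```python
def format_offset(ms):
    """Convert a millisecond offset to MM:SS (hours roll into the minutes field)."""
    total_seconds = ms // 1000
    return f"{total_seconds // 60:02d}:{total_seconds % 60:02d}"


def chunks_to_lines(content):
    """Render each transcript chunk as '[MM:SS] text'."""
    return [f"[{format_offset(chunk['offset'])}] {chunk['text']}" for chunk in content]


# Illustrative data only, not real API output.
sample = [
    {"text": "Never gonna give you up", "offset": 18000, "duration": 3500, "lang": "en"},
    {"text": "Never gonna let you down", "offset": 61500, "duration": 3200, "lang": "en"},
]

for line in chunks_to_lines(sample):
    print(line)
# [00:18] Never gonna give you up
# [01:01] Never gonna let you down
```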

Asynchronous job response (HTTP 202):

{
  "jobId": string // Job ID for checking results
}
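Because one request can yield either shape, client code typically branches on the HTTP status before reading the body. The sketch below shows one way to do that; interpret_response is a hypothetical helper, and treating any other status as an error is an assumption of this example, not documented behavior.

```python
import json


def interpret_response(status_code, body):
    """Classify a /transcript response by HTTP status.

    Returns ("transcript", parsed_body) for HTTP 200 and ("job", job_id)
    for HTTP 202. Any other status raises ValueError (an assumption of
    this sketch; see the Error Codes section for the real error format).
    """
    if status_code == 200:
        return ("transcript", json.loads(body))
    if status_code == 202:
        return ("job", json.loads(body)["jobId"])
    raise ValueError(f"unexpected HTTP status {status_code}")


print(interpret_response(202, '{"jobId": "123e4567-e89b-12d3-a456-426614174000"}'))
# ('job', '123e4567-e89b-12d3-a456-426614174000')
```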

Getting Job Results

When the API returns a job ID, you can poll for results using the job ID endpoint:

Check Job Status

GET https://api.supadata.ai/v1/transcript/{jobId}

Example Request

curl -X GET 'https://api.supadata.ai/v1/transcript/123e4567-e89b-12d3-a456-426614174000' \
  -H 'x-api-key: YOUR_API_KEY'

Response

{
  "status": "completed",
  "result": {
    "content": "Never gonna give you up, never gonna let you down...",
    "lang": "en",
    "availableLangs": ["en", "es", "zh-TW"]
  }
}

Job Status Values

| Status | Description |
| --- | --- |
| queued | The job is in the queue waiting to be processed |
| active | The job is currently being processed |
| completed | The job has finished and results are available |
| failed | The job failed due to an error |

Poll the job status endpoint until the status is either “completed” or “failed”. The result field will contain the transcript data when status is “completed”, or the error field will contain error details when status is “failed”.
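The polling loop described above might look like the following sketch. The page does not state a recommended polling interval or a maximum wait, so interval_seconds and max_attempts are assumptions of this example, and fetch_job and wait_for_job are hypothetical names. The fetch step is passed in as a callable so the loop can be exercised without network access.

```python
import json
import time
import urllib.request

JOB_URL = "https://api.supadata.ai/v1/transcript/{job_id}"  # endpoint documented above


def fetch_job(job_id, api_key):
    """GET the job status document for one job ID."""
    request = urllib.request.Request(
        JOB_URL.format(job_id=job_id), headers={"x-api-key": api_key}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(fetch, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch() until the job is completed or failed.

    interval_seconds and max_attempts are assumed values, not documented ones.
    """
    for _ in range(max_attempts):
        job = fetch()
        status = job.get("status")
        if status == "completed":
            return job["result"]
        if status == "failed":
            raise RuntimeError(f"transcript job failed: {job.get('error')}")
        sleep(interval_seconds)  # status is queued or active: wait and retry
    raise TimeoutError("job did not finish within the polling budget")
```

A call such as wait_for_job(lambda: fetch_job(job_id, api_key)) returns the same transcript object that an immediate HTTP 200 response would carry.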

Error Codes

The API returns HTTP status codes and error codes. See this page for more details.

Supported URL Formats

The url parameter supports the following:

  • YouTube video URL, e.g. https://www.youtube.com/watch?v=1234567890
  • TikTok video URL, e.g. https://www.tiktok.com/@username/video/1234567890
  • X (Twitter) video URL, e.g. https://x.com/username/status/1234567890
  • Instagram Reel video URL, e.g. https://www.instagram.com/reel/1234567890
  • Publicly accessible file URL, e.g. https://bucket.s3.eu-north-1.amazonaws.com/file.mp4

File Transcripts

When url is a file URL, the endpoint supports the following file formats:

  • MP4
  • WEBM
  • MP3
  • FLAC
  • MPEG
  • M4A
  • OGG
  • WAV

The maximum file size is 1 GB. There is no limit on the video duration.
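A client can screen file URLs before spending a request on them. The sketch below checks the path extension against the formats listed above and, when the size is already known, against the 1 GB limit. Two assumptions are baked in: that the file format can be inferred from the URL's extension (the page does not say how the API detects it), and that 1 GB means 1,000,000,000 bytes, the more conservative reading. The function names are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {"mp4", "webm", "mp3", "flac", "mpeg", "m4a", "ogg", "wav"}
MAX_FILE_BYTES = 1_000_000_000  # assumption: 1 GB read as 10**9 bytes


def file_extension(url):
    """Return the lowercase extension of the URL path, ignoring any query string."""
    return PurePosixPath(urlparse(url).path).suffix.lstrip(".").lower()


def looks_supported(url, size_bytes=None):
    """Best-effort local check; the API itself remains the authority."""
    if file_extension(url) not in SUPPORTED_EXTENSIONS:
        return False
    return size_bytes is None or size_bytes <= MAX_FILE_BYTES


print(looks_supported("https://bucket.s3.eu-north-1.amazonaws.com/file.mp4"))
# True
```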

Languages

The endpoint supports multiple languages. The lang parameter is used to specify the preferred language of the transcript. If the video does not have a transcript in the preferred language, the endpoint will return a transcript in the first available language and a list of other available languages. It is then possible to make another request to get the transcript in your chosen fallback language.

When mode=generate, the lang parameter is ignored and the transcript is generated in the language of the video.
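The fallback flow for languages can be sketched as a small decision function: compare the returned lang with the preferred one and, if they differ, pick the best entry from availableLangs for a second request. pick_retry_lang and the priority-list idea are hypothetical; the page only states that a follow-up request is possible.

```python
def pick_retry_lang(response, preferred, fallbacks):
    """Return the language to request next, or None if no retry is useful.

    response  -- parsed transcript response with "lang" and "availableLangs"
    preferred -- the language code originally requested
    fallbacks -- acceptable alternatives, best first (hypothetical input)
    """
    returned = response["lang"]
    if returned == preferred:
        return None  # the preferred language was served
    available = set(response.get("availableLangs", []))
    for candidate in fallbacks:
        if candidate == returned:
            return None  # already holding the best acceptable fallback
        if candidate in available:
            return candidate
    return None


response = {"content": "...", "lang": "en", "availableLangs": ["en", "es", "zh-TW"]}
print(pick_retry_lang(response, preferred="de", fallbacks=["es", "en"]))
# es
```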

Pricing

  • 1 native transcript = 1 credit
  • 1 generated transcript minute = 2 credits
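For budgeting, the two rates above combine into a simple estimate. The page does not say how partial minutes are rounded, so this sketch takes the number of generated minutes as given and applies no rounding; estimate_credits is a hypothetical helper.

```python
NATIVE_CREDITS_PER_TRANSCRIPT = 1  # from the pricing list above
GENERATED_CREDITS_PER_MINUTE = 2   # from the pricing list above


def estimate_credits(native_transcripts=0, generated_minutes=0):
    """Estimate credit usage; rounding of partial minutes is not specified on this page."""
    return (
        native_transcripts * NATIVE_CREDITS_PER_TRANSCRIPT
        + generated_minutes * GENERATED_CREDITS_PER_MINUTE
    )


print(estimate_credits(native_transcripts=3, generated_minutes=10))
# 23
```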