How to scrape YouTube video metadata without yt-dlp
Scrape YouTube metadata without yt-dlp by parsing ytInitialPlayerResponse. Extract title, views, duration, captions via HTTP at 1 credit per request.
yt-dlp breaks roughly once a month when YouTube rotates its signature cipher, and rate limits start producing 429s after [TODO: insert production stat from logs] requests per hour per IP. For metadata-only scraping — title, views, duration, upload date, captions list — you don't need yt-dlp at all. YouTube embeds all of that as inline JSON in every public watch page under a variable called ytInitialPlayerResponse. Parse that and you're done: 1 credit per request, ~400ms, no auth, no signature decryption, no player JS execution.
This post shows exactly how to extract it, which fields survive, where the approach breaks, and when you should go back to yt-dlp anyway.
Why yt-dlp breaks and when HTTP scraping works instead
yt-dlp's core job is format extraction and download. To build a working video URL, it has to decrypt a per-request signature that YouTube's player JavaScript generates on load. YouTube rewrites that player JS frequently — roughly every [TODO: insert production stat from logs] weeks — and each rewrite can change the cipher's variable names, operation order, or obfuscation layer. When that happens, yt-dlp's regex-based extractor fails with a signature extraction error or silently returns unplayable URLs until the maintainers ship a patch.
That cost is worth paying if you actually need the video file. If all you want is the title, view count, duration, upload date, and captions list, you're paying the signature-cipher tax for nothing.
The second issue is rate limits. yt-dlp hits 429s after [TODO: insert production stat from logs] requests per hour against a single IP because it pulls the watch page, the player JS, and the innertube API in sequence for each video. That's 3x the request volume per video compared to a single HTML fetch. Scraping the watch page directly through a rotating IP pool (DreamScrape's HTTP tier does this automatically) keeps the request volume 1:1 with video count and the per-IP rate low enough to avoid 429s at [TODO: insert production stat from logs] videos per hour.
Authentication is not required for any of this. Public watch pages render ytInitialPlayerResponse server-side for SEO. Google wants crawlers to see it. We're taking them up on the offer.
Where yt-dlp is still the right tool:
- Private or unlisted videos (you need a session cookie anyway)
- Member-only content and age-gated videos with login
- Full format/codec enumeration for downloads
- Comment scraping and engagement metrics
- Playlist ordering where member-only items affect position
If your job is any of those, stop reading and stick with yt-dlp. Everything below is for bulk metadata on public videos.
Finding ytInitialPlayerResponse in the HTML and parsing it
The watch page HTML is roughly [TODO: insert production stat from logs] KB. Somewhere in the middle of the body (not in <head>) YouTube injects a <script> tag that looks like this:
```html
<script nonce="...">var ytInitialPlayerResponse = {"responseContext":{...},"playabilityStatus":{...},"videoDetails":{"videoId":"...","title":"...","lengthSeconds":"543","viewCount":"1234567",...},"captions":{...}};var meta = ...</script>
```

Everything you care about lives inside that JSON object:
- `videoDetails.title` — the video title as a string
- `videoDetails.videoId` — the 11-char ID
- `videoDetails.lengthSeconds` — duration, as a string of seconds (yes, a string)
- `videoDetails.viewCount` — view count, also a string
- `videoDetails.author` — the channel name
- `videoDetails.keywords` — array of tag strings, sometimes absent
- `videoDetails.shortDescription` — the description text
- `videoDetails.thumbnail.thumbnails` — array of `{url, width, height}` at multiple resolutions
- `microformat.playerMicroformatRenderer.publishDate` — ISO upload date (`videoDetails` itself doesn't carry the date; look here)
- `captions.playerCaptionsTracklistRenderer.captionTracks` — array of caption tracks, each with `name.simpleText` and `languageCode`
The extraction is a regex followed by a JSON parse. The one rule that trips up most implementations: use a non-greedy match and anchor on the closing }; followed by var, not just };. The naive pattern var ytInitialPlayerResponse = (\{.*\}); will swallow the rest of the document. Better:
```python
PATTERN = r'var ytInitialPlayerResponse\s*=\s*(\{.+?\});(?:var|</script>)'
```

This still fails on edge cases where the description contains the exact substring `};var`. For bulletproof extraction, walk the string character by character and track brace depth starting from the `{` after the `=`. That costs ~2ms per page and eliminates the class of errors entirely.
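The brace-depth walk can be implemented in a few lines. This is a minimal sketch (the function name and error handling are mine); it tracks string literals and escapes so a `};var` inside the description can't terminate the match early:

```python
import json
from typing import Optional


def extract_player_response(html: str) -> Optional[dict]:
    """Extract ytInitialPlayerResponse by walking brace depth from the
    first '{' after the assignment, immune to '};var' inside strings."""
    start = html.find("ytInitialPlayerResponse")
    if start == -1:
        return None
    i = html.find("{", start)
    if i == -1:
        return None
    depth = 0
    in_string = False
    escaped = False
    for j in range(i, len(html)):
        ch = html[j]
        if in_string:
            # Inside a JSON string: only an unescaped quote ends it.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(html[i : j + 1])
    return None  # unbalanced braces: not a valid object
```

The quote/escape tracking is what the regex can't do: once the walker knows it's inside a string literal, braces and semicolons there are ignored entirely.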
If the inline JSON isn't present — YouTube occasionally serves a simplified page for unusual requests — fall back to the standard Open Graph tags: og:title, og:image, og:video:duration. That gets you the three most important fields without the full JSON.
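Pulling those Open Graph tags needs only a one-pass regex over the meta tags. A sketch (the function name is mine, and the pattern assumes the `property` attribute precedes `content`, which matches current YouTube markup but isn't guaranteed by the OG spec):

```python
import re
from html import unescape


def parse_og_tags(html: str) -> dict:
    """Collect Open Graph meta tags into a {property: content} dict,
    e.g. og:title, og:image, og:video:duration when present."""
    tags = {}
    for prop, content in re.findall(
        r'<meta\s+property="(og:[^"]+)"\s+content="([^"]*)"', html
    ):
        # Meta content is HTML-escaped; unescape entities like &amp;
        tags[prop] = unescape(content)
    return tags
```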
Working code example: scrape title, views, duration, and captions
Here's a complete working example using DreamScrape's HTTP tier (1 credit, ~400ms):
```python
import json
import re
import requests
from typing import Optional

DREAMSCRAPE_API_KEY = "your_key_here"

PATTERN = re.compile(
    r'var ytInitialPlayerResponse\s*=\s*(\{.+?\});(?:var|</script>)', re.DOTALL
)


def scrape_youtube_metadata(video_id: str) -> Optional[dict]:
    url = f"https://www.youtube.com/watch?v={video_id}"
    response = requests.post(
        "https://dreamscrape.app/scrape",
        headers={"Authorization": f"Bearer {DREAMSCRAPE_API_KEY}"},
        json={
            "url": url,
            "engine": "http",
            "headers": {
                "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
                "Accept-Language": "en-US,en;q=0.9",
            },
        },
        timeout=10,
    )
    if response.status_code != 200:
        return None

    html = response.json().get("content", "")
    match = PATTERN.search(html)
    if not match:
        return oembed_fallback(video_id)

    try:
        data = json.loads(match.group(1))
    except json.JSONDecodeError:
        return oembed_fallback(video_id)

    video_details = data.get("videoDetails", {})
    microformat = data.get("microformat", {}).get("playerMicroformatRenderer", {})
    captions_root = data.get("captions", {}).get("playerCaptionsTracklistRenderer", {})
    caption_tracks = captions_root.get("captionTracks", [])

    return {
        "video_id": video_details.get("videoId"),
        "title": video_details.get("title"),
        "view_count": video_details.get("viewCount"),
        "duration_seconds": video_details.get("lengthSeconds"),
        "author": video_details.get("author"),
        "upload_date": microformat.get("publishDate"),
        "thumbnails": video_details.get("thumbnail", {}).get("thumbnails", []),
        "captions": [
            {
                "name": t.get("name", {}).get("simpleText"),
                "language_code": t.get("languageCode"),
                "url": t.get("baseUrl"),
            }
            for t in caption_tracks
        ],
    }


def oembed_fallback(video_id: str) -> Optional[dict]:
    url = f"https://www.youtube.com/oembed?url=https://www.youtube.com/watch?v={video_id}&format=json"
    try:
        r = requests.get(url, timeout=5)
        if r.status_code != 200:
            return None
        data = r.json()
        return {
            "video_id": video_id,
            "title": data.get("title"),
            "author": data.get("author_name"),
            "thumbnails": [{"url": data.get("thumbnail_url")}],
            "view_count": None,
            "duration_seconds": None,
            "upload_date": None,
            "captions": [],
        }
    except (requests.RequestException, json.JSONDecodeError):
        return None
```

Example output for a public video:
```json
{
  "video_id": "dQw4w9WgXcQ",
  "title": "Example Video",
  "view_count": "1234567",
  "duration_seconds": "543",
  "author": "Example Channel",
  "upload_date": "2009-10-25",
  "thumbnails": [{"url": "https://i.ytimg.com/...", "width": 168, "height": 94}, ...],
  "captions": [
    {"name": "English", "language_code": "en", "url": "https://..."},
    {"name": "Spanish", "language_code": "es", "url": "https://..."}
  ]
}
```

For batch jobs, fire requests in parallel with a thread pool capped at 10. Ten videos in parallel finish in roughly [TODO: insert production stat from logs] seconds for 10 credits total. See the DreamScrape YouTube intel page for current routing stats and the per-field extraction success rate.
The /oembed public endpoint as a lightweight fallback
YouTube exposes a public oEmbed endpoint at https://www.youtube.com/oembed?url=https://www.youtube.com/watch?v={video_id}&format=json. It responds in under [TODO: insert production stat from logs] ms, has no observable rate limit at moderate volume, and returns valid JSON without any HTML parsing.
What you get:
```json
{
  "title": "Example Video",
  "author_name": "Example Channel",
  "author_url": "https://www.youtube.com/@examplechannel",
  "thumbnail_url": "https://i.ytimg.com/vi/.../hqdefault.jpg",
  "width": 200,
  "height": 113,
  "html": "<iframe...></iframe>"
}
```

What you don't get: view count, duration, upload date, captions list, tags, description. It's a strict subset. Use it when the inline JSON extraction fails — unlisted-turned-public videos, geo-serving oddities, or the occasional DOM change where the regex misses but the video is still public.
Common errors: why extraction fails and how to fix each one
json.JSONDecodeError after regex match. The regex captured trailing JavaScript code past the real object boundary. Switch from a greedy .* to the non-greedy pattern above, or walk brace depth manually. Log the first 200 chars of the captured string when this fires — you'll usually see the issue immediately.
Empty viewCount or lengthSeconds field. Age-restricted videos strip videoDetails.viewCount even on the public HTML. Check for key presence before reading: video_details.get("viewCount") returns None rather than raising. For age-restricted content, fall back to oembed for title/author and leave numeric fields null.
captionTracks array is empty or missing. The video has no captions, or the uploader disabled them. data.get("captions", {}).get("playerCaptionsTracklistRenderer", {}).get("captionTracks", []) gives you an empty list on every layer of missing nesting. Never index into it without a length check.
HTTP 429 Too Many Requests. You hit the same IP too fast. DreamScrape's HTTP tier rotates IPs automatically, so this is rare — but it happens if you hammer the same video ID. Add a 1-2 second delay between requests to the same video and vary the User-Agent across a pool of 3-5 real Chrome values. If you're seeing 429s at under [TODO: insert production stat from logs] requests per minute across different videos, open a support ticket because something's wrong with the rotation.
Timeout during page load. Set timeout=10 on the request. On timeout, retry with exponential backoff: 1s, 2s, 4s, then give up and mark the video for reprocessing on the next batch run. Never retry more than 3 times — persistent timeouts mean the video is gone.
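The 1s/2s/4s schedule factors into a small helper. A sketch (names and parameters are mine; `base_delay` is added so the schedule is tunable and testable, not something the post specifies):

```python
import time


def with_backoff(fn, retryable=(TimeoutError,), max_retries=3, base_delay=1.0):
    """Run fn(); on a retryable error sleep base_delay * 2**attempt
    (1s, 2s, 4s by default), retrying at most max_retries times.
    Returns None after the final failure so the caller can mark the
    item for reprocessing."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retryable:
            if attempt == max_retries:
                return None  # persistent failure: give up, reprocess later
            time.sleep(base_delay * 2 ** attempt)
```

Usage against the request layer would pass the real exception type, e.g. `with_backoff(lambda: requests.get(url, timeout=10), retryable=(requests.Timeout,))`.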
Regex returns None. Either YouTube changed the variable name (rare, but it happened once in 2024) or the response isn't a standard watch page (login wall, captcha, or 404 disguised as 200). Before running the regex, check that the response body contains the literal string ytInitialPlayerResponse. If not, log the first 500 bytes of the response and fall through to oembed.
When this approach fails: scope limitations
Be honest about what the HTML parser can't do.
Private and unlisted videos return a 404 or a login interstitial. The inline JSON is not present. oembed also fails. You need an authenticated session (yt-dlp with cookies) or the official API.
Age-restricted videos serve a partially-stripped ytInitialPlayerResponse. You'll get title and author, but viewCount, lengthSeconds, and often captions are missing or null. Detect this by checking playabilityStatus.status — if it's LOGIN_REQUIRED or AGE_VERIFICATION_REQUIRED, skip.
Removed or privated videos return a 200 with an error page. Check playabilityStatus.status == "ERROR" to detect and skip cleanly.
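The restriction and removal checks above collapse into one classifier on `playabilityStatus`. A sketch (the status strings are the ones named above; the reason-code strings are mine):

```python
from typing import Optional


def playability_reason(data: dict) -> Optional[str]:
    """Return a skip-reason code from ytInitialPlayerResponse's
    playabilityStatus, or None if the video is fully scrapable."""
    status = data.get("playabilityStatus", {}).get("status", "")
    if status == "ERROR":
        return "removed"  # removed/privated: 200 with an error page
    if status in ("LOGIN_REQUIRED", "AGE_VERIFICATION_REQUIRED"):
        return "age_or_login_restricted"  # partial metadata only
    return None
```

Run this right after the JSON parse, before reading any `videoDetails` fields, so restricted videos are skipped with a reason code instead of producing rows full of nulls.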
View count freshness is not real-time. The static HTML viewCount lags the live count by 5-60 minutes depending on the video's popularity. If you need minute-accuracy view counts, the official API's videos.list endpoint is the only source.
Likes, dislikes, comment counts, and community posts are fetched client-side via separate innertube API calls that require a valid visitor cookie and signed request token. Not available in the static HTML. yt-dlp handles these; this approach doesn't.
Geographic and age restrictions can cause the HTTP tier to return 403 for specific videos regardless of IP rotation. Accept the 403, log the video ID, and move on.
Future risk: if YouTube ever moves metadata to pure client-side rendering, the regex approach dies. This is unlikely because SEO crawlers need server-rendered metadata, but it's a real dependency you should monitor.
For video downloads, format extraction, and comment scraping, yt-dlp is still the right tool. This approach wins specifically for bulk metadata on public videos at scale.
Production checklist: reliability, rate limits, scaling
Rate limits. YouTube tolerates roughly 1 request per 2-3 seconds per IP before 429s start appearing. DreamScrape's HTTP tier rotates through a pool of [TODO: insert production stat from logs] IPs, so the per-IP rate stays well under that threshold even at 10 requests per second aggregate.
Batch strategy. Queue video IDs in a Redis list or equivalent. Pull 10 at a time, fire them in parallel with a ThreadPoolExecutor(max_workers=10), add 500-1500ms jitter between batches. Cap total concurrency at 10 parallel tasks regardless of available CPU.
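The batch loop above can be sketched like this (names are mine; `scrape_fn` stands in for `scrape_youtube_metadata`, and the `jitter` bounds are exposed as a parameter):

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def run_batches(video_ids, scrape_fn, batch_size=10, jitter=(0.5, 1.5)):
    """Process IDs in batches of batch_size with batch_size-way
    parallelism, sleeping a random jitter between batches."""
    results = {}
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for i in range(0, len(video_ids), batch_size):
            batch = video_ids[i : i + batch_size]
            # pool.map preserves input order, so zip pairs IDs correctly
            for vid, result in zip(batch, pool.map(scrape_fn, batch)):
                results[vid] = result
            if i + batch_size < len(video_ids):
                time.sleep(random.uniform(*jitter))  # 500-1500ms by default
    return results
```

In production the IDs would come off the Redis queue instead of a list, but the concurrency cap and jitter shape are the same.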
Caching. Store results in Postgres or Redis with a 24-hour TTL for view counts and upload dates, and a 30-day TTL for captions (they change rarely). Title and author can cache indefinitely unless you're tracking renames.
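The per-field TTL policy can be prototyped in memory before wiring up Redis or Postgres. A sketch (class, method names, and the `None`-means-indefinite convention are mine):

```python
import time


class TTLCache:
    """Minimal in-memory model of the per-field TTL policy; swap the
    dict for Redis/Postgres in production. A TTL of None caches
    indefinitely (title, author)."""

    TTLS = {
        "view_count": 24 * 3600,        # 24h
        "upload_date": 24 * 3600,       # 24h
        "captions": 30 * 24 * 3600,     # 30 days
        "title": None,
        "author": None,
    }

    def __init__(self):
        self._store = {}

    def put(self, video_id, field, value):
        ttl = self.TTLS.get(field)
        expires = time.time() + ttl if ttl is not None else None
        self._store[(video_id, field)] = (value, expires)

    def get(self, video_id, field):
        entry = self._store.get((video_id, field))
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and time.time() > expires:
            del self._store[(video_id, field)]  # expired: force re-scrape
            return None
        return value
```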
Monitoring. Log every json.JSONDecodeError and regex miss to a separate stream. Alert when the parse failure rate exceeds [TODO: insert production stat from logs]% over a 1-hour window — that's the early warning for a DOM change.
Cost at 1,000 videos. 1,000 credits at HTTP tier, roughly [TODO: insert production stat from logs] at the Pro plan's effective rate. Runtime is ~[TODO: insert production stat from logs] minutes with 10-way parallelism.
Fallback chain. Try ytInitialPlayerResponse first. On any parse error, fall through to oembed. If oembed also fails, mark the video as unscrapable with a reason code (private, removed, geo_blocked, parse_failed) and skip for 24 hours before retrying.
Comparison: HTTP parsing vs yt-dlp vs official API
| Tool | Metadata fields | Auth | Rate limit | Cost | Latency | 6-month uptime |
|---|---|---|---|---|---|---|
| HTTP parsing (this post) | title, views, duration, captions list, upload date, thumbnails | None | ~1 req/2s per IP, rotated | $0.001/req (HTTP tier) | ~400ms | [TODO: insert production stat from logs]% |
| yt-dlp | all of the above + formats, download URLs, comments | None (cookies for private) | ~1 req/5s effective | Free (self-hosted compute) | 2-5s | [TODO: insert production stat from logs]% |
| YouTube Data API v3 | all of the above + channel info, engagement, playlist order | API key | 10,000 units/day default quota | Free within quota | ~300ms | 99.95% |
Decision rule. Use HTTP parsing for bulk metadata on public videos when you need more than 10,000 requests per day (the API quota) and don't need video files or comments. Use yt-dlp when you need to download the actual video or extract full format information. Use the official YouTube Data API for production applications with under 10,000 daily requests where you want Google's SLA and don't want to own the parsing fragility.
If you're scraping metadata across more than a few hundred videos per day, start with HTTP parsing. Check the DreamScrape YouTube intel page for the current routing recommendation and recent parse success rate before you commit to a pipeline.