
How to bypass DataDome in 2026 (what they're checking now)

Bypass DataDome in 2026 with Camoufox + residential proxies. ~400 behavioral signals, Poisson delays, working code, and where this stops working.

Curtis Vaughan · 12 min read

DataDome shipped a detection update in February 2026 that flags uniform throttling as bot-like. If your scraper sleeps for exactly 2 seconds between requests, you're now more suspicious than a scraper with no delay at all. That's the 2026 shift in one sentence, and it breaks most of the DataDome-bypass tutorials still indexed on Google.

This post covers what DataDome's roughly 400 behavioral signals actually check, why residential proxies alone aren't enough, and the Camoufox + residential setup that hits roughly 88-92% success in our production logs across DataDome-protected targets — with a 15-25 point drop after each fingerprint update before we re-pin. Working code, named failure modes, and an honest section on where this approach stops working.

What DataDome Actually Checks in 2026

DataDome's challenge script runs for roughly 200-400ms after page load on Etsy specifically (we measured against the 18-request Etsy sample in our last-60-day routing data; the broader DataDome distribution is similar). In that window it captures around 400 signals across three categories.

Timing signals — the highest-sensitivity bucket as of Feb 2026. DataDome now fits a distribution to your inter-request delays, your time-on-page, and the gaps between JS event firings (mousemove, scroll, focus). If the distribution has zero variance, or if it's uniform across a fixed range, you score as a bot. Real humans produce Poisson-shaped distributions: most actions cluster near a mean, with a long tail of slow ones. A scraper sleeping time.sleep(2) produces a delta function. A scraper using random.uniform(1, 3) produces a flat box. Both fail the new check.
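
The distribution check described above can be sketched in plain numpy. This is a toy detector of our own construction, not DataDome's actual code: it flags zero variance (a fixed sleep) and near-zero skewness (a uniform box), since human-like exponential delays are strongly right-skewed with skewness around 2:

```python
import numpy as np

def looks_botlike(delays, min_var=1e-6, min_skew=0.5):
    """Toy timing check: flag zero-variance (fixed sleep) and
    symmetric (uniform-box) delay distributions. Human-like
    exponential delays have skewness around 2."""
    d = np.asarray(delays, dtype=float)
    var = d.var()
    if var < min_var:
        return True                       # delta function: time.sleep(2)
    skew = ((d - d.mean()) ** 3).mean() / var ** 1.5
    return skew < min_skew                # flat box: random.uniform(1, 3)

rng = np.random.default_rng(0)
print(looks_botlike(np.full(1000, 2.0)))          # fixed sleep: flagged
print(looks_botlike(rng.uniform(1, 3, 1000)))     # uniform box: flagged
print(looks_botlike(rng.exponential(1.5, 1000)))  # long tail: passes
```

The thresholds are illustrative; the point is that both failure modes in the paragraph above reduce to one statistic each.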

Input signals — mouse movement entropy, scroll physics (acceleration curves, not constant velocity), keystroke dwell times, and on mobile, touch event presence and pressure. A headless Chrome with no input simulation produces zero entropy across all of these. Camoufox produces randomized but plausible patterns at the C++ level, before any JS can observe them.

Network signals — TLS JA4 fingerprint, HTTP/2 frame ordering, and IP reputation. IP reputation alone is roughly a quarter of the detection stack, which is why "I rotated to fresh residential proxies and still got blocked" is the most common bug report we see.

| Signal category | Detection sensitivity | What humans actually do |
|---|---|---|
| Inter-request timing distribution | High | Poisson-ish, mean ~1-3s, long tail |
| Mouse entropy | High | High variance, curved paths, jitter |
| Scroll physics | High | Acceleration + deceleration, not constant |
| Touch events (mobile) | High | Present, with pressure variance |
| TLS JA4 | Medium | Matches a real browser version |
| Keystroke dwell time | Medium | 80-200ms with variance |
| IP reputation | Medium-low | Residential, but not pool-burned |
| User agent string | Low | Matches the rest of the fingerprint |

The named failure modes that follow from this: constant 2-second sleep calls, uniform random delays in a fixed range, missing touch events when claiming to be a mobile browser, and zero-variance click timing. Every one of these is a single bit DataDome reads off your traffic before your scraper has rendered the first byte of HTML.

Why Standard Residential Proxies Aren't Enough

The pattern we see at least once a week in support: a customer buys a residential proxy plan, points requests or Scrapy at a DataDome site, and gets blocked on request two. They assume the proxy is bad. The proxy is fine. The problem is that DataDome flagged them on signals that fired before the IP was even relevant.

requests and Scrapy with default settings produce a TLS JA4 that doesn't match any real browser. They send headers in an order no browser sends them in. They execute zero JavaScript, so they generate zero behavioral signals — which is itself a signal. By the time DataDome's IP-reputation check runs, you've already failed three other layers.

In our routing data, residential-proxy-only attempts (no browser, no behavior simulation) get caught on DataDome sites at a rate close to 100%. The IP rotation is doing nothing because the IP wasn't the problem.

This is the same threat-model split we covered in Anti-bot detection in 2026. Cloudflare's bot score is mostly TLS + JS challenge. DataDome's is mostly behavioral distribution + browser fingerprint. They look similar from the outside (a 403 is a 403) but they reward completely different bypasses. Throwing residential proxies at a behavioral check is like fixing a memory leak with a faster CPU.

If you're scraping a DataDome target, the order of operations is: real browser first, residential second, delay distribution third. Skip any of the three and you're well under 50% on the targets we've measured.

Camoufox + Residential Setup: The 88% Production Baseline

Camoufox is a patched Firefox build that randomizes browser fingerprints (canvas, WebGL, fonts, audio context) and input event patterns at the C++ layer. The patches don't show up in JS-level introspection because they're not implemented in JS. This is the key difference between Camoufox and playwright-extra with stealth plugins, which patch from inside the JS runtime and leave detectable traces.

Production numbers from our last-60-day routing data across DataDome-protected pages (Etsy is our primary anchor at 18 requests; broader pattern matches our wider customer routing):

  • Camoufox + residential proxy + Poisson delays: roughly 88-92% success (94.4% on the Etsy sample, early signal)
  • Stealth Playwright + residential proxy: roughly 30-40% success — passes the JA4 check but fails Layer 2
  • Plain HTTP + residential proxy: near 0% success — fails at the TLS handshake

Camoufox costs 22 credits per page on DreamScrape (10 for the Camoufox engine, 10 for residential proxy, 2 for the optional challenge solver). On our Starter plan ($19 / 50,000 credits), that's roughly $0.0084 per successful scrape. Before you commit, do the ROI calculation: at 22 credits/page, 10,000 successful pages costs roughly $84 on Starter or $52 on Pro. If the data you're collecting is worth less than that, use a different target or buy the data from a provider.
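
The arithmetic is worth scripting before you commit. A small helper (function names ours; the 22-credit and $19/50,000-credit figures are the Starter numbers quoted above):

```python
def cost_per_page(credits_per_page: int, plan_price: float, plan_credits: int) -> float:
    """Dollars per successful scrape on a given plan."""
    return credits_per_page * plan_price / plan_credits

def batch_cost(pages: int, credits_per_page: int, plan_price: float, plan_credits: int) -> float:
    """Total dollars for a batch of successful pages."""
    return pages * cost_per_page(credits_per_page, plan_price, plan_credits)

# Starter plan: $19 / 50,000 credits, Camoufox stack at 22 credits/page
print(round(cost_per_page(22, 19, 50_000), 4))    # ~$0.0084 per page
print(round(batch_cost(10_000, 22, 19, 50_000)))  # ~$84 for 10,000 pages
```

If that number is bigger than the value of the data, stop here.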

The three non-negotiables:

  1. Camoufox browser instance. Not stealth Playwright, not Puppeteer with extra plugins. The behavioral check fingerprints those.
  2. Residential proxy rotation. Datacenter IPs fail at roughly 100%. Residential succeeds because residential IPs have human traffic mixed in, raising the baseline reputation score.
  3. Poisson-distributed inter-request delays. Not time.sleep(2). Not random.uniform(1, 3). A Poisson distribution with lambda ≈ 1.5 means most delays cluster around 1.5 seconds, but you get occasional 4-5 second gaps and occasional 0.3 second bursts. That shape matches human browsing.

The delay math: in numpy, np.random.poisson(1.5) returns an integer count, but for delays you want continuous time. Use np.random.exponential(1.5) (the inter-arrival time of a Poisson process) instead. The mean is 1.5s, the variance is 2.25s², and the distribution has the long tail DataDome's check expects.
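
The difference is easy to see by sampling both: np.random.poisson gives integer event counts (0, 1, 2, ...), useless as sleep durations, while np.random.exponential gives the continuous inter-arrival times you actually want, with the stated mean and variance:

```python
import numpy as np

rng = np.random.default_rng(42)

counts = rng.poisson(1.5, 5)             # integer event counts, not delays
delays = rng.exponential(1.5, 100_000)   # continuous sleep durations

print(counts.dtype.kind)          # 'i': every sample is an integer
print(round(delays.mean(), 2))    # ~1.5  (mean = lambda)
print(round(delays.var(), 2))     # ~2.25 (variance = lambda**2)
```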

Working Code Example: Scraping a DataDome Site with Camoufox

```python
import logging
import time

import numpy as np
from camoufox.sync_api import Camoufox
from playwright.sync_api import TimeoutError as PlaywrightTimeout

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("scraper")

PROXY_POOL = [
    "http://user:pass@residential-1.proxy.example:8000",
    "http://user:pass@residential-2.proxy.example:8000",
    # ...rotate across 20+ residential endpoints (we use ~50 in production)
]

def poisson_delay(lam=1.5):
    """Exponential inter-arrival time. Mean=lam seconds, long tail."""
    return float(np.random.exponential(lam))

def scrape_page(url, proxy):
    with Camoufox(
        proxy={"server": proxy},
        humanize=True,        # randomized mouse paths + scroll physics
        os=["windows", "macos"],
        block_webrtc=True,
    ) as browser:
        page = browser.new_page()
        try:
            response = page.goto(url, wait_until="networkidle", timeout=30000)
            status = response.status if response else 0

            if status == 403:
                body = page.content()
                if "datadome" in body.lower() or "captcha-delivery" in body.lower():
                    raise RuntimeError("DataDome hard block (403)")
                raise RuntimeError(f"Generic 403 on {url}")

            if status == 429:
                raise RuntimeError("Rate limit (429) — increase lambda")

            html = page.content()
            return {"url": url, "status": status, "html": html}

        except PlaywrightTimeout:
            raise RuntimeError("Timeout — likely slow residential path")

def run(urls):
    proxy_idx = 0
    for url in urls:
        proxy = PROXY_POOL[proxy_idx % len(PROXY_POOL)]
        try:
            result = scrape_page(url, proxy)
            log.info(f"OK  {url}  status={result['status']}  proxy={proxy_idx}")
        except RuntimeError as e:
            log.warning(f"FAIL {url}  reason={e}  proxy={proxy_idx}")
            proxy_idx += 1  # skip an extra endpoint past the flagged IP
        delay = poisson_delay(1.5)
        log.info(f"sleep {delay:.2f}s")
        time.sleep(delay)
        proxy_idx += 1  # rotate every request

if __name__ == "__main__":
    targets = ["https://example-protected-site.com/listing/123"]
    run(targets)
```

A few details worth flagging. humanize=True is the Camoufox flag that turns on randomized mouse paths and scroll-physics simulation — without it you ship a Camoufox browser with zero input entropy, which trips the behavioral check. The block_webrtc=True flag prevents WebRTC IP leak (your residential proxy hides your real IP at the TCP layer; WebRTC can leak it through STUN). The exception parsing on 403 lets you distinguish DataDome blocks from generic origin errors so your retry logic doesn't waste credits.

Common Failures and How to Fix Them

Failure #1: "I used Camoufox but still got blocked." Diagnostic checklist: are you on residential (not datacenter)? Is humanize=True set? Is your inter-request delay non-zero? If all three are correct and you're still blocked on the first request, your residential pool is burned — try a different provider.

Failure #2: "Random delays work once, then fail." You're rotating delays but not the proxy. Same IP + same TLS fingerprint across N requests = session-velocity flag, even with varying delays. Fix: rotate the proxy every request (proxy_idx += 1 in the example above) and let Camoufox spawn a fresh browser context per request so the TLS session ID rotates too.
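
That rotation policy fits in a small helper. This is our sketch, not a DreamScrape API: round-robin every request, plus an extra skip after a block so the next attempt never lands on a just-flagged IP:

```python
class ProxyRotator:
    """Round-robin residential pool: rotates every request and
    skips an extra endpoint after a block."""

    def __init__(self, pool):
        self.pool = list(pool)
        self.idx = 0

    def next(self) -> str:
        proxy = self.pool[self.idx % len(self.pool)]
        self.idx += 1          # rotate every request
        return proxy

    def mark_blocked(self):
        self.idx += 1          # skip past the endpoint that just got a 403

rot = ProxyRotator(["p1", "p2", "p3"])
print(rot.next())     # p1
rot.mark_blocked()    # p1 got blocked: p2 is skipped entirely
print(rot.next())     # p3
```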

Failure #3: "My delays are random but I still get caught." Check the distribution shape, not just the variance. random.uniform(1, 3) is random but uniform — flat between 1 and 3, zero density elsewhere. A Poisson/exponential distribution has density everywhere with a peak near the mean. Plot a histogram of 1000 samples from each: uniform looks like a box, exponential looks like a slide. DataDome's check sees the box and flags it.

```python
# Wrong — uniform box
import random
delay = random.uniform(1, 3)

# Right — exponential, long-tailed
import numpy as np
delay = np.random.exponential(1.5)
```

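One quick numeric check on the shape: count the fraction of delays longer than 3 seconds. uniform(1, 3) produces exactly zero of them, while exponential(1.5) produces e^(-3/1.5) ≈ 13.5%, which is the long tail the check expects:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

uniform = rng.uniform(1, 3, n)      # the box: no density past 3s
expo = rng.exponential(1.5, n)      # the slide: long right tail

print((uniform > 3).mean())         # 0.0
print(round((expo > 3).mean(), 3))  # ~0.135, i.e. exp(-2)
```
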
Failure #4: "Mobile version works, desktop blocked." Camoufox needs explicit touch event simulation when claiming to be mobile. Add os=["android"] and humanize=True together, and verify navigator.maxTouchPoints returns nonzero in the page context. If it returns 0, your mobile spoof is incomplete and DataDome catches the mismatch.

Where Camoufox + Residential Fails (and What Doesn't Work)

The 88-92% success rate is across a broad set of DataDome-protected sites. The remaining 8-12% concentrates in specific categories where this approach is fundamentally outmatched.

High-volume scraping against velocity rules. DataDome tracks request rate per residential subnet, not just per IP. If you push 1,000 requests/hour through a residential pool, you'll get caught even with perfect behavior because the aggregate velocity is non-human. Camoufox is behavioral; it doesn't fix account-level or subnet-level velocity flags. For volume above roughly 1,000 requests/hour against a single DataDome target, no setup we've tested works reliably.
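
The subnet-velocity ceiling reduces to simple arithmetic: with W concurrent workers sharing one target, each worker's mean delay must be at least W × 3600 / ceiling seconds. A helper (function name ours; the 1,000 requests/hour ceiling is the figure from our data above, not a DataDome-published number):

```python
def min_mean_delay(workers: int, ceiling_per_hour: int = 1000) -> float:
    """Minimum mean inter-request delay (seconds) per worker so the
    aggregate rate across all workers stays under the ceiling."""
    return workers * 3600 / ceiling_per_hour

print(min_mean_delay(1))   # 3.6s per request for a single worker
print(min_mean_delay(4))   # 14.4s per worker once four share the target
```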

Financial sites with live browser attestation. Some banking and brokerage sites layer DataDome with Akamai Bot Manager and a hardware attestation check (TPM-backed, in some cases). Camoufox doesn't ship with a real TPM, and attestation fails. Success rate here is near zero regardless of configuration.

Sites with canvas fingerprinting + entropy correlation. A few sites (we've seen this on two ticketing platforms) cross-reference your canvas fingerprint against your TLS JA4. Camoufox randomizes canvas per session, but if the randomized canvas doesn't match the JA4's claimed browser version, the correlation check fires. Fix is non-trivial and we don't have a clean solution yet.

The cost-ceiling check. At 22 credits/page, scraping 10,000 pages costs roughly $84 on Starter or $52 on Pro. If the target blocks after 500 pages because of velocity rules, you've spent roughly $4 for partial data — small in absolute terms, but you also have a flagged residential session and no path forward without re-engineering. If the data is available via the site's official API, a partner data provider, or a public dataset, that's almost always cheaper than engineering around DataDome at scale. We tell customers this directly when we see their target — losing the sale on this scrape is better than churning them in two months.

Tuning Delay Distributions and Proxy Rotation for Your Target

The 88% baseline assumes default knobs. Three are worth tuning per-target.

Lambda (mean delay). Higher lambda = longer mean delays = higher success but lower throughput. Production data:

| Lambda (s) | Success rate | Throughput (pages/hour) |
|---|---|---|
| 0.8 | ~70% | ~2,400 |
| 1.5 (default) | ~88% | ~1,600 |
| 2.0 | ~91% | ~1,300 |
| 3.5 | ~92% | ~800 |

These are directional from our calibration runs against DataDome targets, not a controlled benchmark — the success rate plateau between 2.0 and 3.5 is the part to remember, not the exact percentages.
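
The throughput column roughly follows 3600 / (lambda + overhead), where overhead is per-page fetch-and-render time. Assuming ~0.75s of overhead (our fit to the table above, not a measured constant):

```python
def pages_per_hour(lam: float, overhead: float = 0.75) -> float:
    """Expected throughput given mean delay `lam` plus fixed per-page overhead."""
    return 3600 / (lam + overhead)

for lam in (0.8, 1.5, 2.0, 3.5):
    print(lam, round(pages_per_hour(lam)))
```

The fit is loose at the extremes (real overhead varies with proxy latency), but it is good enough to budget throughput before a run.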

Proxy rotation frequency. Per-request rotation is the safe default. For sites with weak velocity rules you can reuse a proxy for 5-10 requests and save on session-setup overhead. For sites with strict rules (e-commerce checkout flows, ticketing) rotate every request and spawn a new Camoufox context too.

Camoufox browser reuse. Spawning a fresh Camoufox instance per page is expensive (roughly 1,500-2,500ms startup in our routing data). Reusing across 10-20 pages is fine if you also rotate proxy and clear cookies between pages. Beyond ~50 pages per instance, fingerprint staleness shows up and success rate drops.

| Target type | Lambda | Rotation | Camoufox reuse | Empirical success |
|---|---|---|---|---|
| News / content sites | 1.0 | Every 5 requests | 20 pages | ~95% |
| E-commerce listings | 1.5 | Every request | 10 pages | ~88-92% |
| Ticketing / drops | 2.5 | Every request | 1 page | ~75% |
| SaaS marketing pages | 1.2 | Every 3 requests | 15 pages | ~93% |

Parameterize so you can tune without redeploying:

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class ScrapeConfig:
    lam: float = 1.5
    rotate_every: int = 1
    camoufox_reuse: int = 10

def delay(cfg: ScrapeConfig) -> float:
    return float(np.random.exponential(cfg.lam))
```

What works on Site A may fail on Site B. Run 100-request calibration batches against new targets before committing to a config.
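
A calibration batch is just the scrape loop with counters. A sketch with the fetch stubbed out and the inter-request delay omitted (scrape_fn stands in for whatever scrape_page-style callable you use; in a real run you would sleep an exponential delay between calls):

```python
def calibrate(urls, scrape_fn, n=100):
    """Run up to n requests against a target and report the success rate.
    scrape_fn(url) should return True on a clean 200, False on a block."""
    results = [bool(scrape_fn(url)) for url in urls[:n]]
    return {"attempts": len(results),
            "success_rate": sum(results) / len(results)}

# Stub fetch: pretend one URL in ten gets blocked
report = calibrate([f"u{i}" for i in range(10)], lambda u: u != "u3")
print(report["success_rate"])   # 0.9
```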

Monitoring and Debugging: Did You Get Blocked or Throttled?

The status code tells you which retry strategy to use.

  • 403 Forbidden with captcha-delivery.com or datadome in the body: hard block. Don't retry the same proxy. Rotate, and if multiple proxies all 403, your delay distribution is the problem, not the IP.
  • 429 Too Many Requests: rate limit, not a behavioral block. Increase lambda by 50% and back off for 60 seconds.
  • 200 OK with a JS redirect or empty body: soft block. Check X-DataDome-Request-ID and X-DataDome-CID headers; if present, you're being shadow-served. The page renders but the data XHRs return empty.
```python
def classify_response(response, html):
    headers = response.headers  # Playwright lowercases header names
    if response.status == 403 and "captcha-delivery" in html.lower():
        return "datadome_hard_block"
    if response.status == 429:
        return "rate_limit"
    if response.status == 200 and "x-datadome-cid" in headers and len(html) < 5000:
        return "datadome_soft_block"
    if response.status == 200:
        return "ok"
    return f"unknown_{response.status}"
```

Log the X-DataDome-Request-ID on every response. When you open a support ticket with us (or with DataDome, if you're on their allow-list), that request ID is the only thing that lets anyone trace what fired. Without it, debugging is guesswork.

Metrics to track per 100 requests: success rate, mean delay, proxy rotation rate, credits per successful page. If success rate drops below your baseline by more than 10 points, the target updated their detection. If credits-per-success climbs without success rate dropping, your retry logic is wasting budget on transient errors.
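
A per-batch counter for those metrics can be this small (class and field names ours):

```python
class BatchMetrics:
    """Tracks success rate and credits-per-success over a window of requests."""

    def __init__(self):
        self.requests = 0
        self.successes = 0
        self.credits_spent = 0

    def record(self, ok: bool, credits: int):
        self.requests += 1
        self.successes += ok
        self.credits_spent += credits

    def summary(self):
        return {
            "success_rate": self.successes / self.requests,
            "credits_per_success": self.credits_spent / max(self.successes, 1),
        }

m = BatchMetrics()
for i in range(100):
    m.record(ok=(i % 10 != 0), credits=22)   # 90/100 succeed, 22 credits each
print(m.summary())
```

Reset it every 100 requests and alert on the two drift conditions described above.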

Next steps

If you're scraping a single DataDome target at under 1,000 requests/hour, the Camoufox + residential + Poisson(1.5) setup above is the cheapest configuration that works at roughly 88-92%. Copy the code, swap your proxy pool, run a 100-request calibration, and tune lambda based on the table above.

If you're above that volume, or hitting financial / attestation-protected targets, do the cost math first. At 22 credits/page our pricing page tells you exactly what 10,000 pages costs. If the data is available through a partner API for less, take the API.

If you'd rather not run this stack yourself, DreamScrape's router does the engine selection, proxy rotation, and Poisson delay generation automatically. Free tier is 2,000 scrapes/month and DataDome targets route to Camoufox automatically — try a target on the playground before signing up.