From 2758526cdbe61a5ca2cee79b4f2f546f5ff8f642 Mon Sep 17 00:00:00 2001
From: ulziibay-kernel <253135130+ulziibay-kernel@users.noreply.github.com>
Date: Mon, 11 May 2026 17:25:21 +0000
Subject: [PATCH 1/4] Add tiered site difficulty index to FAQ
Replaces flat unsupported-websites list with a five-tier index covering very-hard through very-easy targets, with framing on how to interpret the tiers and a pointer to manual baselining.
---
browsers/faq.mdx | 62 +++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 59 insertions(+), 3 deletions(-)
diff --git a/browsers/faq.mdx b/browsers/faq.mdx
index 47651e5..cd97986 100644
--- a/browsers/faq.mdx
+++ b/browsers/faq.mdx
@@ -19,13 +19,69 @@ If you're experiencing slower-than-expected browser creation times, review your
- Browsers persist independently of CDP. Depending on your timeout configuration, a browser will continue running even if its CDP connection closes. You can reconnect to the same `cdp_ws_url` if you're unexpectedly disconnected.
- We recommend implementing reconnect logic, as network interruptions or lifecycle events can cause CDP sessions to close. Detect disconnects and automatically re-establish a CDP connection when this occurs.
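The reconnect recommendation above can be sketched as a simple retry loop. This is an illustrative sketch, not Kernel's API: `connect` stands in for whatever re-establishes your CDP session (e.g. a `connect_over_cdp` call against your `cdp_ws_url`), and `run` is your own automation logic.

```python
import asyncio
import random

async def with_cdp_reconnect(connect, run, max_retries=5, base_delay=1.0):
    """Re-establish a CDP session and retry `run` when the connection drops.

    `connect` and `run` are placeholders for your own code: `connect`
    returns a live session (e.g. via connect_over_cdp on your cdp_ws_url),
    and `run` raises ConnectionError when the socket closes under it.
    """
    attempt = 0
    while True:
        try:
            session = await connect()
            return await run(session)
        except ConnectionError:
            attempt += 1
            if attempt > max_retries:
                raise
            # Exponential backoff with a little jitter before reconnecting.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)
```

Note that `run` restarts from the beginning on each reconnect; real automation should checkpoint progress so a reconnect resumes work rather than redoing it.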
-## Unsupported Websites
+## Site difficulty index
-There are some websites that are not supported by Kernel browsers due to their restrictions around automation and associated bot detection. These include:
+Not all websites are equally hard to automate. The tiers below reflect how much friction we typically see when running Kernel browsers against each site — from "works out of the box" to "expect blocks even with stealth mode and a clean residential IP."
+
+This list is incomplete and will grow as we test more targets. Difficulty also shifts over time as sites change their defenses, so treat these as a starting point — always [run a manual baseline](/browsers/bot-detection/overview#getting-started) before automating.
+
+### Tier 5 — Very hard
+
+Aggressive anti-automation. Login and at-scale scraping are routinely blocked even with stealth mode, residential proxies, and warmed profiles. Expect hard blocks, account locks, or shadow bans.
- LinkedIn
- Facebook
- Instagram
+- TikTok
+- Zillow
+- Facebook Marketplace
+
+### Tier 4 — Hard
+
+Sophisticated fingerprinting and behavioral analysis. Reachable with stealth mode + careful pacing, but high CAPTCHA pressure and frequent challenges. Persistent [profiles](/auth/profiles) and stable IPs materially improve pass rates.
+
- X (Twitter)
-- Amazon
+- Google Search
+- Google Maps
+- Amazon (logged-in flows)
+- Booking.com
+- Airbnb
+- Glassdoor
+- Walmart
+
+### Tier 3 — Medium
+
+Real detection in place, but workable with Kernel defaults. Watch request frequency and avoid headless mode.
+
- Reddit
+- YouTube
+- Indeed
+- Yelp
+- Pinterest
+- Target
+- TripAdvisor
+- Crunchbase
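For these mid-tier sites, "watch request frequency" can be made concrete with a small pacing helper. This is an illustrative sketch under our own assumptions (the `Pacer` class, its gap values, and the injectable `sleep`/`clock` hooks are not a Kernel API):

```python
import random
import time

class Pacer:
    """Enforce a minimum, jittered gap between successive actions.

    sleep/clock are injectable so the pacing logic can be tested
    without real delays; defaults use the standard library.
    """
    def __init__(self, min_gap=2.0, jitter=1.0, sleep=time.sleep, clock=time.monotonic):
        self.min_gap = min_gap
        self.jitter = jitter
        self._sleep = sleep
        self._clock = clock
        self._last = None

    def wait(self):
        # Block until at least min_gap (+ random jitter) has elapsed
        # since the previous call; the first call never waits.
        if self._last is not None:
            gap = self.min_gap + random.uniform(0, self.jitter)
            remaining = self._last + gap - self._clock()
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()
```

Call `pacer.wait()` before each navigation; the jitter keeps request timing from looking mechanical.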
+
+### Tier 2 — Easy
+
+Light protections. Most automations succeed with default Kernel settings; occasional rate limiting at scale.
+
+- eBay
+- Etsy
+- Medium
+- IMDb
+- Cars.com
+- Shopify storefronts
+
+### Tier 1 — Very easy
+
+Minimal or no anti-bot measures. Suitable for unrestricted automation under each site's terms.
+
+- Wikipedia
+- GitHub
+- Yahoo Finance
+- Yellow Pages
+
+
+ Hitting friction on a site that should be easier than this list suggests? Check your [proxy type](/browsers/bot-detection/overview#choosing-a-proxy-type) and confirm you're not running headless — those are the two most common causes of unexpected detection.
+
From 6c3d853af8a8e423123f1f31c944daba1941b406 Mon Sep 17 00:00:00 2001
From: ulziibay-kernel <253135130+ulziibay-kernel@users.noreply.github.com>
Date: Mon, 11 May 2026 17:44:08 +0000
Subject: [PATCH 2/4] Replace tier guesses with measured block rates
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Reframes the site difficulty index around an N=5 stealth + US residential proxy test against each site's public homepage. Three groups (Hard / Light / Clear) ranked by observed block + challenge rate, with detection vendor noted per site. Adds a methodology section and explicit caveats that this is a floor, not a ceiling — login flows and at-scale behavior are out of scope for this benchmark.
---
browsers/faq.mdx | 75 +++++++++++++++++-------------------------------
1 file changed, 27 insertions(+), 48 deletions(-)
diff --git a/browsers/faq.mdx b/browsers/faq.mdx
index cd97986..a2eb726 100644
--- a/browsers/faq.mdx
+++ b/browsers/faq.mdx
@@ -21,67 +21,46 @@ If you're experiencing slower-than-expected browser creation times, review your
## Site difficulty index
-Not all websites are equally hard to automate. The tiers below reflect how much friction we typically see when running Kernel browsers against each site — from "works out of the box" to "expect blocks even with stealth mode and a clean residential IP."
+Block rates for unauthenticated homepage visits from a stealth Kernel browser through a US residential proxy. Sites are sorted by observed difficulty. See [methodology](#methodology) for the test protocol and important caveats — in particular, these numbers reflect a single landing-page request, not login flows or at-scale scraping.
-This list is incomplete and will grow as we test more targets. Difficulty also shifts over time as sites change their defenses, so treat these as a starting point — always [run a manual baseline](/browsers/bot-detection/overview#getting-started) before automating.
+This list is incomplete and will grow as we run more tests. Last measured 2026-05-11.
-### Tier 5 — Very hard
+### Hard — significant friction observed
-Aggressive anti-automation. Login and at-scale scraping are routinely blocked even with stealth mode, residential proxies, and warmed profiles. Expect hard blocks, account locks, or shadow bans.
+| Site | Block rate | Detection vendor |
+|------|-----------:|------------------|
+| Yelp | 100% (5/5 blocked) | DataDome |
+| Glassdoor | 100% (5/5 challenged) | Cloudflare |
+| Indeed | 40% (2/5 challenged) | Cloudflare + Imperva |
+| TripAdvisor | 40% (2/5 blocked) | DataDome |
-- LinkedIn
-- Facebook
-- Instagram
-- TikTok
-- Zillow
-- Facebook Marketplace
+### Light — partial friction observed
-### Tier 4 — Hard
+| Site | Block rate | Detection vendor |
+|------|-----------:|------------------|
+| Yellow Pages | 20% (1/5 blocked) | Cloudflare |
+| Zillow | 20% (1/5 challenged) | PerimeterX |
-Sophisticated fingerprinting and behavioral analysis. Reachable with stealth mode + careful pacing, but high CAPTCHA pressure and frequent challenges. Persistent [profiles](/auth/profiles) and stable IPs materially improve pass rates.
+### Clear — no blocks observed at this layer
-- X (Twitter)
-- Google Search
-- Google Maps
-- Amazon (logged-in flows)
-- Booking.com
-- Airbnb
-- Glassdoor
-- Walmart
+All five sessions returned a usable page. These sites still deploy bot detection — login flows, deep navigation, and high-volume scraping behave very differently — but the public landing page renders cleanly.
-### Tier 3 — Medium
+Airbnb, Amazon, Booking.com, Cars.com, Crunchbase, eBay, Etsy, Facebook, Facebook Marketplace, GitHub, Google Maps, Google Search, IMDb, Instagram, LinkedIn, Medium, Pinterest, Reddit, Shopify storefronts (Gymshark), Target, TikTok, Walmart, Wikipedia, X (Twitter), Yahoo Finance, YouTube.
-Real detection in place, but workable with Kernel defaults. Watch request frequency and avoid headless mode.
+### Methodology
-- Reddit
-- YouTube
-- Indeed
-- Yelp
-- Pinterest
-- Target
-- TripAdvisor
-- Crunchbase
+For each site, we open 5 concurrent stealth Kernel browser sessions through a US residential proxy and navigate to the public landing URL (e.g. `https://www.linkedin.com`). Each session uses a different exit IP. We then classify the response:
-### Tier 2 — Easy
+- **Success** — the expected page rendered, no detection signals tripped.
+- **Challenged** — a visible CAPTCHA or "checking your browser" interstitial that requires action to proceed (e.g. Cloudflare Turnstile, hCaptcha, DataDome captcha).
+- **Blocked** — a hard block page, 403/429 status, or vendor-branded "Access Denied" response.
-Light protections. Most automations succeed with default Kernel settings; occasional rate limiting at scale.
+Block rate combines blocked + challenged. Vendor labels reflect the bot-detection product whose signatures we matched.
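The classification and aggregation rules above can be expressed as a small heuristic. This is a sketch, not the actual signature set we match against: the marker strings are illustrative examples of common challenge/block pages, and the bucket cutoffs are inferred from the published groups rather than an official definition.

```python
# Illustrative markers only -- real vendor detection uses richer signatures.
CHALLENGE_MARKERS = ("checking your browser", "cf-turnstile", "hcaptcha", "captcha-delivery")
BLOCK_MARKERS = ("access denied",)

def classify(status: int, body: str) -> str:
    """Bucket a response as success / challenged / blocked, per the rules above."""
    text = body.lower()
    if any(m in text for m in CHALLENGE_MARKERS):
        return "challenged"
    if status in (403, 429) or any(m in text for m in BLOCK_MARKERS):
        return "blocked"
    return "success"

def block_rate(outcomes: list[str]) -> float:
    """Blocked + challenged, as a fraction of all sessions."""
    flagged = sum(1 for o in outcomes if o != "success")
    return flagged / len(outcomes)

def bucket(rate: float) -> str:
    # Cutoffs inferred from the observed groups (our assumption).
    if rate == 0:
        return "Clear"
    if rate >= 0.4:
        return "Hard"
    return "Light"
```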
-- eBay
-- Etsy
-- Medium
-- IMDb
-- Cars.com
-- Shopify storefronts
-
-### Tier 1 — Very easy
-
-Minimal or no anti-bot measures. Suitable for unrestricted automation under each site's terms.
-
-- Wikipedia
-- GitHub
-- Yahoo Finance
-- Yellow Pages
+
+ These results are a floor, not a ceiling. They tell you what the *easiest* automation case — one anonymous homepage visit — looks like. A site that scores 0% here can still be very hard once you add login, repeated requests from the same IP, deep navigation, or large concurrency. We plan to publish login-flow and at-scale benchmarks separately.
+
- Hitting friction on a site that should be easier than this list suggests? Check your [proxy type](/browsers/bot-detection/overview#choosing-a-proxy-type) and confirm you're not running headless — those are the two most common causes of unexpected detection.
+ Hitting friction on a site that scored clean here? Check your [proxy type](/browsers/bot-detection/overview#choosing-a-proxy-type) and confirm you're not running headless — those are the two most common causes of unexpected detection.
From d1565a9aaef70a5efac5de3fb2ac0cc7079826d5 Mon Sep 17 00:00:00 2001
From: ulziibay-kernel <253135130+ulziibay-kernel@users.noreply.github.com>
Date: Mon, 11 May 2026 17:46:18 +0000
Subject: [PATCH 3/4] Simplify difficulty index: drop per-site block-rate
numbers
Keeps Hard / Light / Clear grouping but drops the percentage tables in favor of plain lists. The methodology section still describes how sites get bucketed.
---
browsers/faq.mdx | 51 ++++++++++++++++++++++++++++++++++++------------
1 file changed, 38 insertions(+), 13 deletions(-)
diff --git a/browsers/faq.mdx b/browsers/faq.mdx
index a2eb726..cf5ab10 100644
--- a/browsers/faq.mdx
+++ b/browsers/faq.mdx
@@ -27,25 +27,50 @@ This list is incomplete and will grow as we run more tests. Last measured 2026-0
### Hard — significant friction observed
-| Site | Block rate | Detection vendor |
-|------|-----------:|------------------|
-| Yelp | 100% (5/5 blocked) | DataDome |
-| Glassdoor | 100% (5/5 challenged) | Cloudflare |
-| Indeed | 40% (2/5 challenged) | Cloudflare + Imperva |
-| TripAdvisor | 40% (2/5 blocked) | DataDome |
+Most or all sessions were blocked or challenged.
+
+- Yelp
+- Glassdoor
+- Indeed
+- TripAdvisor
### Light — partial friction observed
-| Site | Block rate | Detection vendor |
-|------|-----------:|------------------|
-| Yellow Pages | 20% (1/5 blocked) | Cloudflare |
-| Zillow | 20% (1/5 challenged) | PerimeterX |
+Some sessions were blocked or challenged.
-### Clear — no blocks observed at this layer
+- Yellow Pages
+- Zillow
-All five sessions returned a usable page. These sites still deploy bot detection — login flows, deep navigation, and high-volume scraping behave very differently — but the public landing page renders cleanly.
+### Clear — no blocks observed at this layer
-Airbnb, Amazon, Booking.com, Cars.com, Crunchbase, eBay, Etsy, Facebook, Facebook Marketplace, GitHub, Google Maps, Google Search, IMDb, Instagram, LinkedIn, Medium, Pinterest, Reddit, Shopify storefronts (Gymshark), Target, TikTok, Walmart, Wikipedia, X (Twitter), Yahoo Finance, YouTube.
+All sessions returned a usable page. These sites still deploy bot detection — login flows, deep navigation, and high-volume scraping behave very differently — but the public landing page renders cleanly.
+
+- Airbnb
+- Amazon
+- Booking.com
+- Cars.com
+- Crunchbase
+- eBay
+- Etsy
+- Facebook
+- Facebook Marketplace
+- GitHub
+- Google Maps
+- Google Search
+- IMDb
+- Instagram
+- LinkedIn
+- Medium
+- Pinterest
+- Reddit
+- Shopify storefronts (Gymshark)
+- Target
+- TikTok
+- Walmart
+- Wikipedia
+- X (Twitter)
+- Yahoo Finance
+- YouTube
### Methodology
From c55e4a66c38722e6e55c0a794f9f5c97fe445275 Mon Sep 17 00:00:00 2001
From: ulziibay-kernel <253135130+ulziibay-kernel@users.noreply.github.com>
Date: Mon, 11 May 2026 17:46:49 +0000
Subject: [PATCH 4/4] Drop methodology section from difficulty index
---
browsers/faq.mdx | 20 ++------------------
1 file changed, 2 insertions(+), 18 deletions(-)
diff --git a/browsers/faq.mdx b/browsers/faq.mdx
index cf5ab10..6836d97 100644
--- a/browsers/faq.mdx
+++ b/browsers/faq.mdx
@@ -21,9 +21,7 @@ If you're experiencing slower-than-expected browser creation times, review your
## Site difficulty index
-Block rates for unauthenticated homepage visits from a stealth Kernel browser through a US residential proxy. Sites are sorted by observed difficulty. See [methodology](#methodology) for the test protocol and important caveats — in particular, these numbers reflect a single landing-page request, not login flows or at-scale scraping.
-
-This list is incomplete and will grow as we run more tests. Last measured 2026-05-11.
+A rough grouping of sites by how much friction we see when running stealth Kernel browsers against the public landing page. This list is incomplete and will grow over time.
### Hard — significant friction observed
@@ -72,20 +70,6 @@ All sessions returned a usable page. These sites still deploy bot detection —
- Yahoo Finance
- YouTube
-### Methodology
-
-For each site, we open 5 concurrent stealth Kernel browser sessions through a US residential proxy and navigate to the public landing URL (e.g. `https://www.linkedin.com`). Each session uses a different exit IP. We then classify the response:
-
-- **Success** — the expected page rendered, no detection signals tripped.
-- **Challenged** — a visible CAPTCHA or "checking your browser" interstitial that requires action to proceed (e.g. Cloudflare Turnstile, hCaptcha, DataDome captcha).
-- **Blocked** — a hard block page, 403/429 status, or vendor-branded "Access Denied" response.
-
-Block rate combines blocked + challenged. Vendor labels reflect the bot-detection product whose signatures we matched.
-
-
- These results are a floor, not a ceiling. They tell you what the *easiest* automation case — one anonymous homepage visit — looks like. A site that scores 0% here can still be very hard once you add login, repeated requests from the same IP, deep navigation, or large concurrency. We plan to publish login-flow and at-scale benchmarks separately.
-
-
- Hitting friction on a site that scored clean here? Check your [proxy type](/browsers/bot-detection/overview#choosing-a-proxy-type) and confirm you're not running headless — those are the two most common causes of unexpected detection.
+ Hitting friction on a site that's listed under Clear? Check your [proxy type](/browsers/bot-detection/overview#choosing-a-proxy-type) and confirm you're not running headless — those are the two most common causes of unexpected detection.