Conversation

Deploying windmilldocs with Cloudflare Pages

|  |  |
| --- | --- |
| Latest commit: | 2487ca0 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://d9d75878.windmilldocs.pages.dev |
| Branch Preview URL: | https://di-etl-bench.windmilldocs.pages.dev |
Claude finished @diegoimbert's task in 5m 27s. View job

Review summary

The PR adds a TPC-DS benchmark section (a chart component plus a shared MDX partial) and embeds it into both the Ducklake docs page and the Day 2 launch-week blog post. It also drops 14 raw benchmark JSON result files into `src/data/tpcds_bench_results/`.

The chart math I sampled checks out (SF10 Snowflake Small 85.26s and Windmill 2× m6a.8xlarge 67.44s give the quoted 21% delta and the $0.04 vs $0.10 numbers; SF100 Windmill 3× m6a.16xlarge at 157s/$0.36 vs Snowflake Large 194s/$0.86 give the quoted 19%-faster / 2.4×-cheaper). The two looser claims are flagged inline.

Findings

P1
P2 (see inline comments for the actionable ones)
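The sampled figures are easy to re-derive. A quick sketch using only the numbers quoted above (nothing here comes from the raw result files):

```typescript
// Re-deriving the spot-checked deltas from the figures quoted in the review.
const sf10Delta = (85.26 - 67.44) / 85.26;   // Snowflake Small vs Windmill 2x m6a.8xlarge
const sf100Delta = (194 - 157) / 194;        // Snowflake Large vs Windmill 3x m6a.16xlarge
const sf100CostRatio = 0.86 / 0.36;          // Snowflake $ vs Windmill $ at SF100

console.log(sf10Delta.toFixed(2), sf100Delta.toFixed(2), sf100CostRatio.toFixed(1));
// 0.21 0.19 2.4
```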

---

```tsx
  AIRFLOW_PANDAS_SF10_CONFIGS
} from '@site/src/components/TpcDsBenchmark';
```
> We ran the [TPC-DS](https://www.tpc.org/tpcds/) benchmark at three scale factors (10 GB, 100 GB, 1 TB) to compare Windmill + Ducklake against Airflow + Snowflake — the most common open-source orchestrator paired with a managed data warehouse.
P2 — em dash usage (AGENTS.md violation). AGENTS.md says: "Avoid using symbol '—' as it looks like obvious AI. Use sentences not using it or use '-' instead." This file introduces multiple em dashes (this line, plus L12, L24, L51 ×2, L83, L94, L101, L103). Suggest replacing with a period, parenthesis, or hyphen. Example for this line:
```diff
- We ran the [TPC-DS](https://www.tpc.org/tpcds/) benchmark at three scale factors (10 GB, 100 GB, 1 TB) to compare Windmill + Ducklake against Airflow + Snowflake — the most common open-source orchestrator paired with a managed data warehouse.
+ We ran the [TPC-DS](https://www.tpc.org/tpcds/) benchmark at three scale factors (10 GB, 100 GB, 1 TB) to compare Windmill + Ducklake against Airflow + Snowflake, the most common open-source orchestrator paired with a managed data warehouse.
```
> - **Windmill 1× m6a.16xlarge** (391s, $0.30) provides 2× better performance than Snowflake Small (711s) at similar cost
> - At equal cost (~$0.30), Windmill is **2.7× faster** than Snowflake Small
P2 — looseness in benchmark claims.
- L68: "Windmill 1× m6a.16xlarge (391s, $0.30) provides 2× better performance than Snowflake Small (711s) at similar cost". The actual ratio is 711 / 391 ≈ 1.82×, not 2×; and Snowflake Small at 711.14s costs ~$0.79 (711.14 s × $4/hr ÷ 3600, with 2 credits/hr × $2/credit), which isn't "similar cost" to Windmill's $0.30. Either reframe (e.g. "nearly 2× faster" and remove "at similar cost") or use a config where the cost actually lines up.
- L69: "At equal cost (~$0.30), Windmill is 2.7× faster than Snowflake Small" — this only works under a "throughput per dollar at fixed budget" interpretation (Windmill completes 100% of the benchmark for $0.30 vs Snowflake Small completing ~38% of it for the same budget). Worth spelling that out so a reader can reproduce the math.
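A sketch of the arithmetic behind both flagged claims, using only the numbers quoted in this comment (the 2 credits/hr × $2/credit rate is the review's assumption, not a verified Snowflake price):

```typescript
// Numbers quoted from the review; the Snowflake rate is the review's assumption.
const windmillSeconds = 391;      // 1x m6a.16xlarge
const windmillCost = 0.30;
const snowflakeSeconds = 711.14;  // Snowflake Small
const snowflakeDollarsPerHour = 2 * 2; // 2 credits/hr x $2/credit

const speedRatio = snowflakeSeconds / windmillSeconds;                     // ~1.82, not 2x
const snowflakeCost = (snowflakeSeconds / 3600) * snowflakeDollarsPerHour; // ~$0.79

// "Equal budget" reading: fraction of the run Snowflake Small finishes for $0.30.
const fractionDoneOnBudget = windmillCost / snowflakeCost;                 // ~0.38
```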
```ts
export const AIRFLOW_PANDAS_SF10_CONFIGS: BenchmarkConfig[] = [
  {
    label: 'Airflow + Pandas (64 GB local)',
    platform: 'windmill', // No cloud cost - local run
    totalTimeSeconds: 2813.82,
    stages: [
      { stage: 'ingest', wallClockSeconds: 600 },
      { stage: 'validate', wallClockSeconds: 168 },
      { stage: 'denormalize', wallClockSeconds: 323 },
      { stage: 'aggregate', wallClockSeconds: 433 },
      { stage: 'query', wallClockSeconds: 824 },
      { stage: 'verify', wallClockSeconds: 131 }
    ]
  },
```
P2 — misleading platform tag. The Airflow + Pandas config sets `platform: 'windmill'`. It works only because `instanceType` is undefined, so `AWS_PRICING[undefined] = 0` short-circuits `calculateCost` and returns 0, and the host MDX hides the cost table with `showCostTable={false}`. If anyone later passes `showCostTable={true}` or refactors `calculateCost`, this row will silently lie about cost.

Cleaner options:
- Add a `'local'` (or `'pandas'`) value to the `platform` union, branch in `calculateCost` to return 0 explicitly, and label tooltips/legend appropriately.
- Or attach an explicit `cost?: number` override on `BenchmarkConfig` and prefer it in `calculateCost`.

Either way, the comment `// No cloud cost - local run` should not need to live next to a misleading `platform: 'windmill'`.
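A minimal sketch of the first option. The type and function are reconstructed from the review's description, so names, fields, and the pricing value are illustrative and may not match `TpcDsBenchmark.tsx` exactly:

```typescript
// Reconstructed sketch, not the component's actual code.
type Platform = 'windmill' | 'snowflake' | 'local';

interface BenchmarkConfig {
  label: string;
  platform: Platform;
  instanceType?: string; // only set for cloud runs
  totalTimeSeconds: number;
}

// Illustrative pricing table; real values live in the component.
const AWS_PRICING: Record<string, number> = { 'm6a.8xlarge': 1.38 };

function calculateCost(config: BenchmarkConfig): number {
  // Explicit branch, instead of relying on AWS_PRICING[undefined] falling to 0.
  if (config.platform === 'local') return 0;
  const hourly = AWS_PRICING[config.instanceType ?? ''] ?? 0;
  return (config.totalTimeSeconds / 3600) * hourly;
}
```

With this shape, flipping `showCostTable` or refactoring `calculateCost` can no longer make the Pandas row report a fabricated cloud cost.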
```diff
@@ -0,0 +1,475 @@
{
```
P2 — unused data files. All 14 JSON files added under `src/data/tpcds_bench_results/` (~6.5k lines) are not referenced anywhere in the repo: `TpcDsBenchmark.tsx` ships hardcoded `SF10_CONFIGS` / `SF100_CONFIGS` / `SF1000_CONFIGS` / `AIRFLOW_PANDAS_SF10_CONFIGS`, and there is no loader that imports these JSONs.

Two reasonable resolutions:
- Wire them up. Replace the hardcoded configs with a small loader (e.g. import the JSONs and reduce them into `BenchmarkConfig[]`). That makes future re-runs as easy as dropping in a new JSON.
- Move out of `src/`. If they're meant to be raw reference data only, put them under something like `static/data/tpcds/` or `benchmarks/` so they aren't dragged into the bundler graph and don't bloat `src/data/` with content the site never reads.

Shipping ~9k lines of unreferenced data inside `src/` makes future grep noisy and gives the impression the page is data-driven when it isn't.
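The "wire them up" option could look roughly like this. The raw-result JSON shape below is assumed for illustration and almost certainly differs from the real files:

```typescript
// Assumed raw-result shape; the actual JSON files were not inspected here.
interface RawRun {
  label: string;
  scaleFactor: number;
  stages: { stage: string; wallClockSeconds: number }[];
}

interface BenchmarkConfig {
  label: string;
  totalTimeSeconds: number;
  stages: { stage: string; wallClockSeconds: number }[];
}

// Reduce parsed JSON runs into the chart's config arrays, one call per scale factor.
function toConfigs(runs: RawRun[], scaleFactor: number): BenchmarkConfig[] {
  return runs
    .filter((r) => r.scaleFactor === scaleFactor)
    .map((r) => ({
      label: r.label,
      totalTimeSeconds: r.stages.reduce((sum, s) => sum + s.wallClockSeconds, 0),
      stages: r.stages,
    }));
}
```

The files themselves could then be pulled in with explicit `import run from './...json'` statements (or a webpack-style glob), so a benchmark re-run only means dropping a new JSON into the folder.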
```mdx
import TpcDsBenchmarkSection from '@site/src/components/TpcDsBenchmarkSection.mdx';

<br />
```
> Most startups don't need a terabyte-scale data warehouse. If your data fits in 10 GB – 1 TB, you can run analytical workloads on Ducklake at a fraction of the cost of Airflow + Snowflake — while keeping full control over your data.
P2 — em dash (AGENTS.md rule). "of the cost of Airflow + Snowflake — while keeping full control" — replace the — with a comma or a period.
```diff
- Most startups don't need a terabyte-scale data warehouse. If your data fits in 10 GB – 1 TB, you can run analytical workloads on Ducklake at a fraction of the cost of Airflow + Snowflake — while keeping full control over your data.
+ Most startups don't need a terabyte-scale data warehouse. If your data fits in 10 GB to 1 TB, you can run analytical workloads on Ducklake at a fraction of the cost of Airflow + Snowflake, while keeping full control over your data.
```

---

Claude finished @diegoimbert's task in 55s. View job

Re-review (no new commits since prior review)

This PR is still at the same commit. Recap of the still-open P2 findings from the prior pass (none are P0/P1 blockers):

The chart math I sampled before still checks out (SF10 Snowflake Small 85.26s vs Windmill 2× m6a.8xlarge 67.44s → quoted 21% / $0.04 vs $0.10; SF100 Windmill 3× m6a.16xlarge 157s/$0.36 vs Snowflake Large 194s/$0.86 → quoted 19%-faster / 2.4×-cheaper). If you'd like me to actually apply these P2 fixes (em dashes, bold restraint, rewording of the loose benchmark claims, relabelling of the Pandas platform), reply with

---

fixed ai feedback |

---

Claude finished @diegoimbert's task in 1m 42s. View job

Re-review (commit …)
| # | Finding | Status |
|---|---------|--------|
| 1 | Em dashes in `TpcDsBenchmarkSection.mdx` and the blog post | ✅ Fixed — `grep '—'` returns nothing in the changed files |
| 2 | Bold misuse in the new partial (`**52 tasks**`, `**Ingest**`, `**67 seconds**`, …) | ✅ Fixed — `grep '**'` returns nothing in `TpcDsBenchmarkSection.mdx` |
| 3 | Loose benchmark wording at L68-69 ("2× … at similar cost", "2.7× faster at equal cost") | ✅ Fixed — now reads "1.8x faster than Snowflake Small (711s, $0.79)" and "Per dollar spent, Windmill delivers 2.7x more throughput than Snowflake Small". The math checks out: 711/391 = 1.82, and (1 / $0.30) ÷ (1 / $0.79) = 2.63 ≈ 2.7×. The honest $0.79 disclosure for Snowflake Small replaces the earlier "similar cost" framing. |
| 4 | Misleading `platform: 'windmill'` on the Pandas row (`TpcDsBenchmark.tsx:452-465`) | ✅ Fixed — platform union extended to `'windmill' \| 'snowflake' \| 'local'`, the Pandas config relabelled to `platform: 'local'`, the bar-color branch now paints `'local'` red, and the $/hour table cell explicitly returns 0 for `'local'`. `calculateCost` falls through to return 0 for the new value, which is correct (no `instanceType`, no `warehouseSize`). |
| 5 | ~9k lines of unreferenced JSON under `src/data/tpcds_bench_results/` | Still open — `grep -rn "tpcds_bench_results" src blog docs` still returns nothing. Not a blocker; just flagging that #5 from the prior pass is the only one still open. |
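The ratios in finding 3 can be reproduced from the quoted figures (taken as exact here only for the check):

```typescript
// Reproducing the review's arithmetic for finding 3.
const speedRatio = 711 / 391;            // quoted as "1.8x faster"
const throughputPerDollar = 0.79 / 0.30; // (1/$0.30) / (1/$0.79), quoted as ~2.7x

console.log(speedRatio.toFixed(2), throughputPerDollar.toFixed(2));
// 1.82 2.63
```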
New findings introduced by this commit
- None. The commit is a pure cleanup pass; no new prose, no new chart configs, no new pages or sidebar entries to validate.
AGENTS.md / writing_style_guide.md compliance check (delta)
- ✅ Em-dash rule now satisfied across the changed files.
- ✅ Bold-restraint rule now satisfied in the new partial.
- ✅ Sentence-case heading rule satisfied (the previously-flagged `title="TPC-DS SF10 (10 GB) — Pandas vs Ducklake"` is now `"TPC-DS SF10 (10 GB): Pandas vs Ducklake"`, no em dash, no title case).
- ✅ All previously-verified rules from the original pass still hold (code-block languages, no HTML colour styling, relative internal links, frontmatter `description` present on host pages, no new EE references introduced).
Verification

- `npm run build` should still pass — diff is content-only inside an already-rendering MDX partial plus a small TS component change that compiles (the `'local'` value is added consistently to the union, the bar-color callback, and the `$/hour` branch).
- Worth a manual visual check after merge: confirm the Pandas bar in the SF10 "Pandas vs Ducklake" chart now renders red and the rest stay blue/light-blue, in both light and dark mode.
LGTM from a docs-review standpoint. Optional: tidy up the unused `src/data/tpcds_bench_results/` JSON files (wire them up or move them to `static/` or `benchmarks/`) in a follow-up.
- Diff `origin/main...HEAD` (and against prior reviewed commit `47fc4e5`)
- Verify P2 fixes from prior review (em dashes, bold restraint, loose benchmark wording, platform tag, unused JSON)
- Spot-check chart math + frontmatter
- Post updated review

Branch: `di/etl-bench`

---

JSON files are for us |

---

/ai can you break down the complexity of the additions in this PR (e.g. database/backend roundtrips, O(n), etc.)

---

Claude finished @diegoimbert's task. View job

Complexity Analysis of PR Additions

Todo List

Complexity Breakdown

Database/Backend Operations

No database or backend roundtrips. The component is entirely client-side with:

Time Complexity: O(n × m)

Where:

Key operations:

Space Complexity: O(n × m)

Runtime Performance

Bundle Impact

Summary

This is a purely presentational component with linear complexity relative to the number of benchmarks displayed. No async operations, no backend calls, no data fetching. The heaviest operation is Chart.js rendering the stacked bar chart, which is O(n × m) but with small constants (n ≤ 6, m ≤ 7 in practice).
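The O(n × m) claim can be illustrated with a toy version of the dataset-building pass (this is not the component's actual code; the shape is assumed from the config excerpt shown earlier in the thread):

```typescript
// Toy illustration of why building the stacked-bar data is O(n x m):
// one visit per (config, stage) pair, for n configs with m stages each.
interface Config {
  label: string;
  stages: { stage: string; wallClockSeconds: number }[];
}

function buildStackedData(configs: Config[]): number[][] {
  return configs.map((c) => c.stages.map((s) => s.wallClockSeconds));
}
```

With n ≤ 6 and m ≤ 7, that is at most a few dozen cells per chart, which is why runtime cost is negligible next to Chart.js's own rendering.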
See preview at /blog/launch-week-data-tables-ducklake
Also in the Ducklake docs