Add nonclustered indexes for query/procedure/query store lookups#859
Add nonclustered indexes for query/procedure/query store lookups#859erikdarlingdata merged 1 commit intodevfrom
Conversation
Phase 3 OUTER APPLY hydration of compressed query_text/plan_text was forcing an Eager Index Spool over the full collect.query_stats table (and similar for procedure_stats / query_store_data), which took 104 seconds on a 742K-row table in #835. Changes: - Remove CONVERT(binary(8), nvarchar-hash, 1) anti-pattern from OUTER APPLY WHERE clauses by keeping query_hash as native binary(8) in temp tables. query_hash is only converted to nvarchar(20) in the final output projection. - Add three nonclustered indexes (install script and upgrade script): IX_query_stats_hash_lookup (query_hash, database_name, collection_time DESC) IX_procedure_stats_name_lookup (database_name, schema_name, object_name, collection_time DESC) IX_query_store_data_id_lookup (database_name, query_id, collection_time DESC) - Indexes use SORT_IN_TEMPDB = ON and DATA_COMPRESSION = PAGE. - ONLINE = ON is applied conditionally via dynamic SQL based on SERVERPROPERTY('EngineEdition') — Enterprise/Developer/Azure only, since Standard/Web/Express don't support online index operations. Tested against CADelete's 742K-row table: Phase 3 went from 104s to well under 1s (5s total for the full three-phase query). Fixes #835 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
erikdarlingdata
left a comment
There was a problem hiding this comment.
Review summary
What the PR does: Fixes #835 (Phase 3 OUTER APPLY builds a 104s Eager Index Spool). Two parts: (1) remove the CONVERT(binary(8), tr.query_hash, 1) from the OUTER APPLY WHERE clauses by keeping query_hash native binary(8) through Phases 2–3 and only converting to nvarchar(20) in the final SELECT; (2) add three supporting nonclustered indexes on collect.query_stats, collect.procedure_stats, and collect.query_store_data, with ONLINE = ON gated on EngineEdition.
What's good
- Targets
dev, notmain. - Schema change lands in both
install/02_create_tables.sql(new installs) andupgrades/2.7.0-to-2.8.0/01_add_query_lookup_indexes.sql(existing installs), withupgrade.txtlisting the one file. Correct pattern per the upgrade folder rules — no install-script ALTERs hiding a missing upgrade script. - Upgrade script follows
_template.sql(SET options,USE PerformanceMonitor,IF NOT EXISTSwithSELECT 1/0, idempotent). - SARGability fix is tight: the
query_hashcolumn staysbinary(8)on both sides of the OUTER APPLY predicate, index key order(query_hash, database_name, collection_time DESC)matches the lookup + ORDER BY exactly, and the fix is applied to both the grid query (GetQueryStatsAsync) and the MCP equivalent in the same file. EngineEdition IN (3, 5, 8)correctly covers Enterprise/Developer/Azure SQL DB/MI forONLINE = ON.- No PlanAnalyzer changes — no Dashboard/Lite sync concern. Not a Lite-first violation: these are SQL Server DDL for the server backend, which Lite (DuckDB) doesn't share.
What to look at
- See inline nits on batch separation in the install script,
DATA_COMPRESSION = PAGEimplicitly requiring SQL 2016 SP1+ on Standard, and missing trailingGOin the upgrade script. - PR body flags two unchecked test items: upgrade path from 2.7.0 against a populated DB, and plan verification at scale without an index hint. Worth closing both before merge — the second item is the whole point of the fix.
upgrades/README.mdVersion History stops at 2.0.0 and doesn't mention 2.8.0; pre-existing staleness, not introduced here, but adjacent to this change.- No test coverage added. Acceptable for pure DDL/SARGability work — nothing clean to unit-test — flagging per routine.
Generated by Claude Code
| ONLINE = ON is only supported on Enterprise/Developer/Azure editions. The | ||
| options string is built dynamically based on SERVERPROPERTY('EngineEdition'). | ||
| */ | ||
| DECLARE @online_option nvarchar(20) = |
There was a problem hiding this comment.
Nit: consider a GO before this DECLARE to place it in its own batch. The preceding tables are wrapped in IF NOT EXISTS … BEGIN … END; blocks without intervening GOs, so @online_option/@index_sql land in whatever batch the install runner is composing. Not a correctness issue today (no name collisions), but isolating DDL-generation variables keeps future edits safer.
Generated by Claude Code
| THEN N', ONLINE = ON' | ||
| ELSE N'' | ||
| END; | ||
|
|
There was a problem hiding this comment.
EngineEdition 3 covers Enterprise and Developer (they share the value), 5 = Azure SQL Database, 8 = Managed Instance. That matches where ONLINE = ON is available. DATA_COMPRESSION = PAGE is unconditional here — fine for the supported matrix (SQL 2016 SP1+ made row/page compression a Standard feature), just flagging that this raises the minimum to SP1 for Standard installs.
Generated by Claude Code
| (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE' + @online_option + N');'; | ||
| EXEC sys.sp_executesql @index_sql; | ||
| PRINT 'Created collect.query_store_data.IX_query_store_data_id_lookup index'; | ||
| END; |
There was a problem hiding this comment.
Script is missing a trailing newline and a final GO. Every other upgrade script in this repo ends the last batch with a GO before EOF; the installer's batch splitter should still work without it, but match the house style from _template.sql.
Generated by Claude Code
| query_text = CAST(DECOMPRESS(qs2.query_text) AS nvarchar(max)) | ||
| FROM collect.query_stats AS qs2 | ||
| WHERE qs2.query_hash = CONVERT(binary(8), tr.query_hash, 1) | ||
| WHERE qs2.query_hash = tr.query_hash |
There was a problem hiding this comment.
Fix looks right: tr.query_hash now stays binary(8) through Phase 2/3 and only converts to nvarchar(20) in the final projection (line 1183), so qs2.query_hash = tr.query_hash is a native binary = binary comparison and sargable against the new IX_query_stats_hash_lookup. The symmetric fix is applied in the MCP copy of the query at lines 4441/4484. No other OUTER APPLYs in this file regressed (the remaining CONVERT(binary(8), @param, 1) forms at 2754/2923 are parameter→constant conversions, still sargable).
Generated by Claude Code
Summary
CONVERT(binary(8), nvarchar, 1)from OUTER APPLY WHERE clauses in the Phase 3 hydration queries (GetQueryStatsAsync+ MCP equivalent).query_hashnow stays as nativebinary(8)through the temp tables and is only converted tonvarchar(20)in the final output.IX_query_stats_hash_lookup (query_hash, database_name, collection_time DESC)IX_procedure_stats_name_lookup (database_name, schema_name, object_name, collection_time DESC)IX_query_store_data_id_lookup (database_name, query_id, collection_time DESC)SORT_IN_TEMPDB = ONandDATA_COMPRESSION = PAGE.ONLINE = ONis applied conditionally via dynamic SQL based onSERVERPROPERTY('EngineEdition')— Enterprise/Developer/Azure only.install/02_create_tables.sql(new installs) andupgrades/2.7.0-to-2.8.0/01_add_query_lookup_indexes.sql(existing installs).Background
CADelete's execution plan (#835) showed a 104-second Eager Index Spool on the Phase 3 OUTER APPLY. No nonclustered index existed to service the lookup by
(query_hash, database_name), so SQL Server built a temporary index over the entire 781K-row table on every query.After testing locally and on CADelete's server: Phase 3 dropped from 104s to effectively 0 (key lookups for 500 rows). Total query went from 1m 37s to 5s on their 742K-row table.
Fixes #835
Test plan
IF NOT EXISTS)🤖 Generated with Claude Code