15 commits
9f8dc72
fix: use synchronous DuckDB constructor to avoid bun runtime timeout
suryaiyer95 Apr 2, 2026
eaa10b7
revert: restore async DuckDB constructor — sync change was bogus
suryaiyer95 Apr 2, 2026
d110d6e
feat: add MSSQL/Fabric dialect mapping and data-parity support
suryaiyer95 Apr 6, 2026
3e6b3e0
feat: add Azure AD authentication to SQL Server driver (7 flows)
suryaiyer95 Apr 6, 2026
54aceed
docs: add MSSQL and Microsoft Fabric documentation to data-parity SKI…
suryaiyer95 Apr 6, 2026
1056c64
fix: delegate Azure AD credential creation to tedious and remove unde…
suryaiyer95 Apr 7, 2026
bfb1295
fix: upgrade `mssql` to v12 with `ConnectionPool` isolation and row f…
suryaiyer95 Apr 13, 2026
32d4afc
fix: resolve TypeScript spread-type errors in Azure AD conditional op…
suryaiyer95 Apr 13, 2026
fda536d
fix: resolve cubic review findings on MSSQL/Fabric PR
suryaiyer95 Apr 14, 2026
d004e1b
test: add fabric connection path and flattenRow coverage
suryaiyer95 Apr 14, 2026
b69a3d2
docs: document minimum versions and make @azure/identity optional
suryaiyer95 Apr 14, 2026
d1cdd1b
fix: acquire Azure AD tokens directly to bypass Bun browser-bundle re…
suryaiyer95 Apr 16, 2026
173d32f
fix: auto-acquire Azure AD token for `azure-active-directory-access-t…
suryaiyer95 Apr 16, 2026
63769f4
fix: side-aware CTE injection for cross-warehouse `data_diff` SQL-que…
suryaiyer95 Apr 17, 2026
1977232
chore: regenerate `bun.lock` to match drivers `peerDependencies` layout
suryaiyer95 Apr 17, 2026
66 changes: 66 additions & 0 deletions .opencode/skills/data-parity/SKILL.md
@@ -71,6 +71,19 @@ WHERE table_schema = 'mydb' AND table_name = 'orders'
ORDER BY ordinal_position
```

```sql
-- SQL Server / Fabric
SELECT c.name AS column_name, tp.name AS data_type, c.is_nullable,
dc.definition AS column_default
FROM sys.columns c
INNER JOIN sys.types tp ON c.user_type_id = tp.user_type_id
INNER JOIN sys.objects o ON c.object_id = o.object_id
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
LEFT JOIN sys.default_constraints dc ON c.default_object_id = dc.object_id
WHERE s.name = 'dbo' AND o.name = 'orders'
ORDER BY c.column_id
```

```sql
-- ClickHouse
DESCRIBE TABLE source_db.events
@@ -409,3 +422,56 @@ Even when tables match perfectly, state what was checked:

**Silently excluding auto-timestamp columns without asking the user**
→ Always present detected auto-timestamp columns (Step 4) and get explicit confirmation. In migration scenarios, `created_at` should be *identical* — excluding it silently hides real bugs.

---

## SQL Server and Microsoft Fabric

### Minimum Version Requirements

| Component | Minimum Version | Why |
|---|---|---|
| **SQL Server** | 2022 (16.x) | `DATETRUNC()` used for date partitioning; `LEAST()`/`GREATEST()` used by Rust engine |
| **Azure SQL Database** | Any current version | Always has `DATETRUNC()` and `LEAST()` |
| **Microsoft Fabric** | Any current version | T-SQL surface includes all required functions |
| **mssql** (npm) | 12.0.0 | `ConnectionPool` isolation for concurrent connections, tedious 19 |
| **@azure/identity** (npm) | 4.0.0 | Required only for Azure AD authentication; tedious imports it internally |

> **Note:** Date partitioning (`partition_column` + `partition_granularity`) uses `DATETRUNC()`, which is **not available on SQL Server 2019 or earlier**. Basic diff operations (joindiff, hashdiff, profile) work on older versions. If you need partitioned diffs on SQL Server < 2022, use numeric or categorical partitioning instead.
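
The version rule in the note can be sketched as a small guard. This is only an illustration, not code from this PR: `supportsDateTrunc` and `pickPartitionStrategy` are hypothetical names, and the sketch assumes the version string was read with `SELECT SERVERPROPERTY('ProductVersion')` on a self-hosted SQL Server. Azure SQL Database and Fabric are versioned differently, so the check should not be applied to them.

```typescript
// SQL Server 2022 reports major version 16 (see the table above).
const MIN_MAJOR_FOR_DATETRUNC = 16

// Hypothetical helper: true when the server is new enough for DATETRUNC().
// Assumes a dotted version string such as "16.0.1000.6".
function supportsDateTrunc(productVersion: string): boolean {
  const major = Number.parseInt(productVersion.split(".")[0] ?? "", 10)
  // Unparseable input counts as unsupported so callers fall back safely.
  return Number.isInteger(major) && major >= MIN_MAJOR_FOR_DATETRUNC
}

// Hypothetical helper: choose a partitioning approach the server can run.
function pickPartitionStrategy(productVersion: string): "date" | "numeric-or-categorical" {
  return supportsDateTrunc(productVersion) ? "date" : "numeric-or-categorical"
}
```

Treating an unreadable version as unsupported keeps the fallback on the safe side: numeric or categorical partitioning works on every version listed above.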

### Supported Configurations

| Warehouse Type | Authentication | Notes |
|---|---|---|
| `sqlserver` / `mssql` | User/password or Azure AD | On-prem or Azure SQL. SQL Server 2022+ required for date partitioning. |
| `fabric` | Azure AD only | Microsoft Fabric SQL endpoint. Always uses TLS encryption. |

### Connecting to Microsoft Fabric

Fabric uses the same TDS protocol as SQL Server — no separate driver needed. Configuration:

```
type: "fabric"
host: "<workspace-id>-<item-id>.datawarehouse.fabric.microsoft.com"
database: "<warehouse-name>"
authentication: "azure-active-directory-default" # recommended
```
Comment on lines +453 to +458

⚠️ Potential issue | 🟡 Minor

Add a `yaml` (or `text`) language tag to the fenced block.

markdownlint MD040 flags the missing language. Since the block is a pseudo-YAML config example, `yaml` reads best:

💚 Fix
-```
+```yaml
 type: "fabric"
 host: "<workspace-id>-<item-id>.datawarehouse.fabric.microsoft.com"
 database: "<warehouse-name>"
 authentication: "azure-active-directory-default"   # recommended

🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 453-453: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/data-parity/SKILL.md around lines 453 - 458, The fenced
code block showing the pseudo-YAML config (the block containing type: "fabric",
host: "<workspace-id>-<item-id>.datawarehouse.fabric.microsoft.com", database:
"<warehouse-name>", authentication: "azure-active-directory-default") is missing
a language tag; change the opening fence from ``` to ```yaml (or ```text) so
markdownlint MD040 is satisfied and the block is treated as YAML.


Auth shorthands (mapped to full tedious type names):
- `CLI` or `default` → `azure-active-directory-default`
- `password` → `azure-active-directory-password`
- `service-principal` → `azure-active-directory-service-principal-secret`
- `msi` or `managed-identity` → `azure-active-directory-msi-vm`

Full Azure AD authentication types:
- `azure-active-directory-default` — auto-discovers credentials via `DefaultAzureCredential` (recommended; works with `az login`)
- `azure-active-directory-password` — username/password with `azure_client_id` and `azure_tenant_id`
- `azure-active-directory-access-token` — pre-obtained token (does **not** auto-refresh)
- `azure-active-directory-service-principal-secret` — service principal with `azure_client_id`, `azure_client_secret`, `azure_tenant_id`
- `azure-active-directory-msi-vm` / `azure-active-directory-msi-app-service` — managed identity
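
The shorthand table above amounts to a lookup with pass-through for full type names. The sketch below is an assumption about how such a lookup could look, not the driver's actual code: `resolveAuthType` is a hypothetical name, and the case-insensitive matching is a guess based on the list showing `CLI` in upper case.

```typescript
// Shorthand -> full tedious authentication type, copied from the list above.
const AUTH_SHORTHANDS = new Map<string, string>([
  ["cli", "azure-active-directory-default"],
  ["default", "azure-active-directory-default"],
  ["password", "azure-active-directory-password"],
  ["service-principal", "azure-active-directory-service-principal-secret"],
  ["msi", "azure-active-directory-msi-vm"],
  ["managed-identity", "azure-active-directory-msi-vm"],
])

// Hypothetical helper: expand a shorthand, leave anything else untouched.
// Case-insensitive matching is an assumption, not confirmed by the diff.
function resolveAuthType(value: string): string {
  const key = value.trim().toLowerCase()
  return AUTH_SHORTHANDS.get(key) ?? value
}
```

A `Map` is used instead of a plain object so that inputs such as `constructor` cannot accidentally match an inherited property.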

### Algorithm Behavior

- **Same-warehouse** MSSQL or Fabric → `joindiff` (single FULL OUTER JOIN, most efficient)
- **Cross-warehouse** MSSQL/Fabric ↔ other database → `hashdiff` (automatic when using `auto`)
- The Rust engine maps `sqlserver`/`mssql` to `tsql` dialect and `fabric` to `fabric` dialect — both generate valid T-SQL syntax with bracket quoting (`[schema].[table]`).
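
The three rules above can be sketched in a few lines. The real selection and dialect mapping live in the Rust engine and are not shown in this diff, so every name here (`pickAlgorithm`, `DIALECT_BY_TYPE`, `quoteTsql`) is hypothetical, and identifying a warehouse by its connection name is an assumption.

```typescript
type Algorithm = "joindiff" | "hashdiff"

// Warehouse type -> engine dialect, per the bullets above.
const DIALECT_BY_TYPE = new Map<string, string>([
  ["sqlserver", "tsql"],
  ["mssql", "tsql"],
  ["fabric", "fabric"],
])

// Hypothetical: same connection name means same warehouse, so one
// FULL OUTER JOIN is possible; otherwise fall back to hashdiff.
function pickAlgorithm(sourceName: string, targetName: string): Algorithm {
  return sourceName === targetName ? "joindiff" : "hashdiff"
}

// Bracket quoting as in [schema].[table]; a literal "]" is doubled,
// which is how T-SQL escapes it inside a bracketed identifier.
function quoteTsql(...parts: string[]): string {
  return parts.map((p) => `[${p.replace(/\]/g, "]]")}]`).join(".")
}
```

For example, `quoteTsql("dbo", "orders")` yields `[dbo].[orders]`, the form shown in the last bullet.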
30 changes: 24 additions & 6 deletions bun.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 9 additions & 1 deletion packages/drivers/package.json
@@ -17,10 +17,18 @@
"@google-cloud/bigquery": "^8.0.0",
"@databricks/sql": "^1.0.0",
"mysql2": "^3.0.0",
- "mssql": "^11.0.0",
+ "mssql": "^12.0.0",
"oracledb": "^6.0.0",
"duckdb": "^1.0.0",
"mongodb": "^6.0.0",
"@clickhouse/client": "^1.0.0"
},
"peerDependencies": {
"@azure/identity": ">=4.0.0"
},
"peerDependenciesMeta": {
"@azure/identity": {
"optional": true
}
}
Comment on lines +20 to 33

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Does `mssql` 12.x on npm bundle or peer-depend on `@azure/identity`, and what is the current minimum `@azure/identity` version for its `azure-active-directory-*` auth types?

💡 Result:

No, the `mssql` 12.x npm package does not bundle or peer-depend on `@azure/identity`. It depends on `tedious` ^19.0.0, which bundles `@azure/identity` ^4.2.1 as a regular dependency. For `azure-active-directory-*` auth types (such as `azure-active-directory-default`, `azure-active-directory-password`, `azure-active-directory-access-token`, `azure-active-directory-msi-vm`, `azure-active-directory-msi-app-service`, `azure-active-directory-service-principal-secret`), `tedious` handles the authentication internally using its bundled `@azure/identity`, so no separate installation or specific minimum version of `@azure/identity` is required by the user. The effective minimum version is whatever `tedious` 19.x requires, which is ^4.2.1.

Remove `@azure/identity` from `peerDependencies`: it is redundant and already bundled by `tedious`.

`mssql` v12 depends on `tedious` ^19.0.0, which bundles `@azure/identity` ^4.2.1 as a regular dependency. Adding `@azure/identity` as an optional peer dependency in `drivers/package.json` is unnecessary; `tedious` handles Azure AD authentication internally without user intervention. Users installing `mssql` automatically get the correct `@azure/identity` version through the `tedious` dependency chain.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/drivers/package.json` around lines 20 - 33, Remove the redundant
optional peer dependency entry for "@azure/identity" from package.json: delete
the "@azure/identity" key inside the "peerDependencies" object and its
corresponding entry in "peerDependenciesMeta" since "mssql"/"tedious" already
bundles it; update the package.json so only the other dependencies (e.g.,
"mssql", "oracledb", "duckdb", "mongodb", "@clickhouse/client") remain and no
"@azure/identity" references exist under "peerDependencies" or
"peerDependenciesMeta".

}
6 changes: 6 additions & 0 deletions packages/drivers/src/normalize.ts
@@ -65,6 +65,11 @@ const SQLSERVER_ALIASES: AliasMap = {
...COMMON_ALIASES,
host: ["server", "serverName", "server_name"],
trust_server_certificate: ["trustServerCertificate"],
authentication: ["authenticationType", "auth_type", "authentication_type"],
azure_tenant_id: ["tenantId", "tenant_id", "azureTenantId"],
azure_client_id: ["clientId", "client_id", "azureClientId"],
azure_client_secret: ["clientSecret", "client_secret", "azureClientSecret"],
access_token: ["token", "accessToken"],
}

const ORACLE_ALIASES: AliasMap = {
@@ -104,6 +109,7 @@ const DRIVER_ALIASES: Record<string, AliasMap> = {
mariadb: MYSQL_ALIASES,
sqlserver: SQLSERVER_ALIASES,
mssql: SQLSERVER_ALIASES,
fabric: SQLSERVER_ALIASES,
oracle: ORACLE_ALIASES,
mongodb: MONGODB_ALIASES,
mongo: MONGODB_ALIASES,
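
To illustrate what the new alias entries enable, here is a minimal sketch of alias resolution. The body of the real normalizer in `normalize.ts` is not part of this diff, so `normalizeConfig` and its precedence rules (canonical key wins, then the first alias found) are assumptions; the `AliasMap` shape is inferred from the entries above, and the map below copies only the keys added in this PR plus `host`.

```typescript
// Shape inferred from the diff: canonical key -> accepted alternate spellings.
type AliasMap = Record<string, string[]>

// Subset of SQLSERVER_ALIASES taken from the hunk above.
const SQLSERVER_ALIASES: AliasMap = {
  host: ["server", "serverName", "server_name"],
  authentication: ["authenticationType", "auth_type", "authentication_type"],
  azure_tenant_id: ["tenantId", "tenant_id", "azureTenantId"],
  azure_client_id: ["clientId", "client_id", "azureClientId"],
  azure_client_secret: ["clientSecret", "client_secret", "azureClientSecret"],
  access_token: ["token", "accessToken"],
}

// Hypothetical normalizer: rewrites aliased keys to their canonical names.
// An explicit canonical key is never overwritten by an alias.
function normalizeConfig(
  config: Record<string, unknown>,
  aliases: AliasMap,
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...config }
  for (const [canonical, alternates] of Object.entries(aliases)) {
    for (const alt of alternates) {
      if (!Object.prototype.hasOwnProperty.call(out, alt)) continue
      if (out[canonical] === undefined) out[canonical] = out[alt]
      delete out[alt] // drop the alias so only canonical keys remain
    }
  }
  return out
}
```

Because `fabric` now points at `SQLSERVER_ALIASES` in `DRIVER_ALIASES`, a Fabric config written with `server` or `tenantId` would be normalized the same way as a SQL Server one under this sketch.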