Merged
29 changes: 27 additions & 2 deletions README.md
@@ -66,8 +66,13 @@ azd env set POSTGRES_ENTRA_ADMIN_OBJECT_ID "<your-entra-object-id>"
azd env set POSTGRES_ENTRA_ADMIN_PRINCIPAL_NAME "<your-entra-upn>"
azd env set POSTGRES_ENTRA_ADMIN_PRINCIPAL_TYPE "User"
azd env set DUCKLAKE_DATA_PATH "az://lakehouse/data/"
azd env set LAKEHOUSE_SECRET_KEY "$(openssl rand -base64 32)"
```

`openssl rand -base64 32` generates 32 random bytes and prints their base64-encoded form, which is always a 44-character string ending in a single `=`. Use that encoded string as-is for `LAKEHOUSE_SECRET_KEY`; the server treats it as a UTF-8 HMAC/JWT signing key and does not base64-decode it.
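
The difference between using the encoded string directly and decoding it first can be sketched in Python. This is an illustration only: it mirrors `openssl rand -base64 32` with the standard library, and it assumes HMAC-SHA256 purely for demonstration, since the README states only that the key is used for HMAC/JWT signing.

```python
import base64
import hashlib
import hmac
import secrets

# Standard-library equivalent of `openssl rand -base64 32`:
# 32 random bytes, then base64-encode them.
raw = secrets.token_bytes(32)
secret_key = base64.b64encode(raw).decode("ascii")

# What the server uses as key material: the UTF-8 bytes of the string itself.
key_as_used = secret_key.encode("utf-8")

# What you would get by decoding first. The server does NOT do this.
key_if_decoded = base64.b64decode(secret_key)

# Assumption for illustration: HMAC-SHA256 over a hypothetical payload.
message = b"example-token-payload"
sig_as_used = hmac.new(key_as_used, message, hashlib.sha256).hexdigest()
sig_if_decoded = hmac.new(key_if_decoded, message, hashlib.sha256).hexdigest()

print(len(secret_key), len(key_as_used), len(key_if_decoded))
print(sig_as_used == sig_if_decoded)
```

The two keys differ in length (44 bytes versus 32 bytes) and produce different signatures, so any other tool that signs or verifies tokens with this key needs to treat it the same way the server does.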

Azure deployments require `LAKEHOUSE_SECRET_KEY` to be set before `azd up`, so that the deployed Container App keeps a stable token-signing key across restarts and revisions.
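
Why a stable key matters can be shown with a small sketch. The `sign` and `verify` helpers below are hypothetical stand-ins for the server's token logic, and HMAC-SHA256 is an assumption made for the example; the point is only that a signature made with one key does not verify under another.

```python
import hashlib
import hmac
import secrets


def sign(key: str, payload: bytes) -> str:
    """Sign payload with the UTF-8 bytes of key (hypothetical helper)."""
    return hmac.new(key.encode("utf-8"), payload, hashlib.sha256).hexdigest()


def verify(key: str, payload: bytes, signature: str) -> bool:
    """Constant-time check that signature matches payload under key."""
    return hmac.compare_digest(sign(key, payload), signature)


payload = b"user=lakehouse"

# Stable key: the same value is supplied to every replica and revision.
stable_key = "placeholder-for-LAKEHOUSE_SECRET_KEY"
issued = sign(stable_key, payload)
valid_after_restart_stable = verify(stable_key, payload, issued)

# Auto-generated key: each process start produces a fresh random value.
key_before_restart = secrets.token_urlsafe(32)
key_after_restart = secrets.token_urlsafe(32)
issued = sign(key_before_restart, payload)
valid_after_restart_random = verify(key_after_restart, payload, issued)

print(valid_after_restart_stable, valid_after_restart_random)
```

With an auto-generated key, a token issued before a restart or by a different revision fails verification afterwards, which is the situation the required setting avoids.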

Find the Entra values if you need them:

```bash
@@ -93,6 +98,7 @@ azd env set POSTGRES_ENTRA_ADMIN_OBJECT_ID "<your-entra-object-id>"
azd env set POSTGRES_ENTRA_ADMIN_PRINCIPAL_NAME "<your-entra-upn>"
azd env set POSTGRES_ENTRA_ADMIN_PRINCIPAL_TYPE "User"
azd env set DUCKLAKE_DATA_PATH "az://lakehouse/data/"
azd env set LAKEHOUSE_SECRET_KEY "$(openssl rand -base64 32)"

azd up
```
@@ -134,6 +140,25 @@ mvn -q -Dexec.mainClass=lakehouse.AzureDemo test-compile exec:java

The `MAVEN_OPTS` flag is required for Apache Arrow on Java 17+.

### Run the live backend tests

The live backend test is opt-in because it queries the deployed Azure Container App and reads the `lakehouse-password` secret from Key Vault:

```bash
LAKEHOUSE_LIVE_BACKEND=1 uv run pytest -q tests/test_live_azure_backend.py
```

By default, the test uses PyArrow to perform the Basic-token handshake, then passes the returned Bearer token to ADBC for the query. This verifies the deployed endpoint, TLS, the Key Vault password, and the Bearer auth path.
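
The header values exchanged in that handshake can be sketched with the standard library alone. This is not the test's code: in the real test, PyArrow (for example via `FlightClient.authenticate_basic_token`) and ADBC handle the headers, and the helper names below are hypothetical. The sketch shows only the standard `Basic` and `Bearer` authorization formats and does not contact any server.

```python
import base64


def basic_authorization(username: str, password: str) -> str:
    """Build the `Basic` authorization value sent in the handshake."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def bearer_token_from(header_value: str) -> str:
    """Extract the token from a `Bearer <token>` value returned by the server."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "bearer" or not token:
        raise ValueError(f"expected a Bearer authorization value, got {header_value!r}")
    return token


def bearer_authorization(token: str) -> str:
    """Build the authorization value the query client sends on later calls."""
    return f"Bearer {token}"


# Step 1: the client sends Basic credentials (placeholder password).
request_header = basic_authorization("lakehouse", "example-password")

# Step 2: the server answers with a Bearer token (placeholder value here).
token = bearer_token_from("Bearer example.signed.token")

# Step 3: the query client reuses that token for the actual query.
query_header = bearer_authorization(token)
print(request_header.split(" ")[0], query_header)
```

Splitting the flow this way is what lets the default test exercise the Bearer path through ADBC without relying on ADBC's own Basic-auth handling.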

There is also a separate opt-in check for ADBC's direct Basic-auth path:

```bash
LAKEHOUSE_LIVE_BACKEND=1 LAKEHOUSE_LIVE_BACKEND_ADBC_BASIC=1 \
uv run pytest -q tests/test_live_azure_backend.py
```

The ADBC Basic check is marked `xfail` because it is a known client path that currently fails against the deployed Container App. A result such as `1 passed, 1 xfailed` means the supported Bearer smoke test passed and the tracked ADBC Basic issue reproduced as expected. If that changes to `1 passed, 1 xpassed`, the ADBC Basic path has started working and the `xfail` marker should be removed.
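
For CI scripts or quick triage, the reading of the summary line described above can be sketched as a small helper. The function names and return labels are hypothetical and are not part of the test suite; the sketch only parses the counts that pytest prints in its final summary.

```python
import re


def parse_summary(summary: str) -> dict[str, int]:
    """Turn a summary such as '1 passed, 1 xfailed' into outcome counts."""
    return {name: int(count) for count, name in re.findall(r"(\d+) (\w+)", summary)}


def adbc_basic_status(summary: str) -> str:
    """Classify what the summary says about the ADBC Basic check."""
    counts = parse_summary(summary)
    if counts.get("xpassed", 0):
        # The expected failure no longer occurs: the marker is stale.
        return "remove-xfail"
    if counts.get("xfailed", 0):
        # The tracked issue reproduced, which is the expected outcome today.
        return "known-issue-reproduced"
    # Neither outcome appears, e.g. the opt-in variable was not set.
    return "adbc-basic-check-not-run"


print(adbc_basic_status("1 passed, 1 xfailed"))
print(adbc_basic_status("1 passed, 1 xpassed"))
```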

If you want one copy/paste block for the demo itself:

```bash
@@ -378,7 +403,7 @@ A few settings are environment-only (`.env` also works).
| Database | `--database` | `LAKEHOUSE_DATABASE` | `:memory:` | CLI + Env | DuckDB database path |
| Username | `--username` | `LAKEHOUSE_USERNAME` | `lakehouse` | CLI + Env | Auth username |
| Password | `--password` | `LAKEHOUSE_PASSWORD` | *(empty)* | CLI + Env | Auth password (empty disables auth) |
-| Secret Key | `--secret-key` | `LAKEHOUSE_SECRET_KEY` | *(auto-generated)* | CLI + Env | HMAC / JWT signing key |
+| Secret Key | `--secret-key` | `LAKEHOUSE_SECRET_KEY` | *(auto-generated locally; required for Azure deploy)* | CLI + Env | HMAC / JWT signing key |
| Health Port | `--health-check-port` | `LAKEHOUSE_HEALTH_CHECK_PORT` | `8081` | CLI + Env | gRPC health service port |
| Health Enabled | `--health-check-enabled` | `LAKEHOUSE_HEALTH_CHECK_ENABLED` | `true` | CLI + Env | Enable health check server |
| Log Level | `--log-level` | `LAKEHOUSE_LOG_LEVEL` | `INFO` | CLI + Env | Python log level |
@@ -495,7 +520,7 @@ Lakehouse implements all standard Flight SQL RPCs:
- **Azure Container Apps** — runs the Lakehouse Docker image
- **User-assigned managed identity** — attached to the Container App, with `Storage Blob Data Contributor` RBAC
- **PostgreSQL Entra admin principal** — granted `Storage Blob Data Contributor` RBAC for local DuckLake validation
-- **Azure Key Vault** — stores the Lakehouse password
+- **Azure Key Vault** — stores the Lakehouse password and stable HMAC/JWT signing key

A `postprovision` hook runs automatically to configure PostgreSQL Entra auth grants for the managed identity.

7 changes: 7 additions & 0 deletions infra/main.bicep
@@ -76,6 +76,11 @@ param ducklakeDataPath string = 'az://lakehouse/data/'
@description('Auto-generated Flight SQL password. Random on each fresh provision.')
param lakehousePassword string = newGuid()

@secure()
@minLength(1)
@description('Stable non-empty UTF-8 HMAC/JWT signing key for the Flight SQL server.')
param lakehouseSecretKey string

var uniqueSuffix = toLower(uniqueString(subscription().id, resourceGroup().id, environmentName))
var storageAccountName = 'st${uniqueSuffix}'
var postgresServerName = toLower('psql-${environmentName}-${substring(uniqueSuffix, 0, 6)}')
@@ -126,6 +131,7 @@ module keyvault './modules/keyvault.bicep' = {
containerAppPrincipalId: containerAppIdentity.properties.principalId
deployerPrincipalId: postgresEntraAdminObjectId
lakehousePassword: lakehousePassword
lakehouseSecretKey: lakehouseSecretKey
}
}

@@ -156,6 +162,7 @@ module containerApp './modules/container-app.bicep' = {
ducklakeDataPath: ducklakeDataPath
acrLoginServer: acr.outputs.acrLoginServer
lakehousePasswordSecretUri: keyvault.outputs.lakehousePasswordSecretUri
lakehouseSecretKeySecretUri: keyvault.outputs.lakehouseSecretKeySecretUri
}
}

59 changes: 56 additions & 3 deletions infra/main.json
@@ -5,7 +5,7 @@
"_generator": {
"name": "bicep",
"version": "0.40.2.10011",
-"templateHash": "7370216803261364591"
+"templateHash": "8435078362355386948"
}
},
"parameters": {
@@ -152,6 +152,13 @@
"metadata": {
"description": "Auto-generated Flight SQL password. Random on each fresh provision."
}
},
"lakehouseSecretKey": {
"type": "securestring",
"minLength": 1,
"metadata": {
"description": "Stable non-empty UTF-8 HMAC/JWT signing key for the Flight SQL server."
}
}
},
"variables": {
@@ -511,6 +518,9 @@
},
"lakehousePassword": {
"value": "[parameters('lakehousePassword')]"
},
"lakehouseSecretKey": {
"value": "[parameters('lakehouseSecretKey')]"
}
},
"template": {
@@ -520,7 +530,7 @@
"_generator": {
"name": "bicep",
"version": "0.40.2.10011",
-"templateHash": "15964644140239373440"
+"templateHash": "1804434252725837723"
}
},
"parameters": {
@@ -554,6 +564,12 @@
"metadata": {
"description": "Flight SQL password to store in the vault."
}
},
"lakehouseSecretKey": {
"type": "securestring",
"metadata": {
"description": "Flight SQL HMAC/JWT signing key to store in the vault."
}
}
},
"resources": [
@@ -622,6 +638,21 @@
"dependsOn": [
"[resourceId('Microsoft.KeyVault/vaults', parameters('keyVaultName'))]"
]
},
{
"type": "Microsoft.KeyVault/vaults/secrets",
"apiVersion": "2023-07-01",
"name": "[format('{0}/{1}', parameters('keyVaultName'), 'lakehouse-secret-key')]",
"properties": {
"value": "[parameters('lakehouseSecretKey')]",
"contentType": "text/plain",
"attributes": {
"enabled": true
}
},
"dependsOn": [
"[resourceId('Microsoft.KeyVault/vaults', parameters('keyVaultName'))]"
]
}
],
"outputs": {
@@ -636,6 +667,10 @@
"lakehousePasswordSecretUri": {
"type": "string",
"value": "[reference(resourceId('Microsoft.KeyVault/vaults/secrets', parameters('keyVaultName'), 'lakehouse-password'), '2023-07-01').secretUri]"
},
"lakehouseSecretKeySecretUri": {
"type": "string",
"value": "[reference(resourceId('Microsoft.KeyVault/vaults/secrets', parameters('keyVaultName'), 'lakehouse-secret-key'), '2023-07-01').secretUri]"
}
}
}
@@ -809,6 +844,9 @@
},
"lakehousePasswordSecretUri": {
"value": "[reference(resourceId('Microsoft.Resources/deployments', 'keyvault'), '2025-04-01').outputs.lakehousePasswordSecretUri.value]"
},
"lakehouseSecretKeySecretUri": {
"value": "[reference(resourceId('Microsoft.Resources/deployments', 'keyvault'), '2025-04-01').outputs.lakehouseSecretKeySecretUri.value]"
}
},
"template": {
@@ -818,7 +856,7 @@
"_generator": {
"name": "bicep",
"version": "0.40.2.10011",
-"templateHash": "6413779264902107673"
+"templateHash": "7408886526487755529"
}
},
"parameters": {
@@ -869,6 +907,12 @@
"metadata": {
"description": "Key Vault secret URI for the Flight SQL password."
}
},
"lakehouseSecretKeySecretUri": {
"type": "string",
"metadata": {
"description": "Key Vault secret URI for the Flight SQL HMAC/JWT signing key."
}
}
},
"resources": [
@@ -930,6 +974,11 @@
"name": "lakehouse-password",
"keyVaultUrl": "[parameters('lakehousePasswordSecretUri')]",
"identity": "[parameters('containerAppIdentityId')]"
},
{
"name": "lakehouse-secret-key",
"keyVaultUrl": "[parameters('lakehouseSecretKeySecretUri')]",
"identity": "[parameters('containerAppIdentityId')]"
}
],
"registries": [
@@ -981,6 +1030,10 @@
{
"name": "LAKEHOUSE_PASSWORD",
"secretRef": "lakehouse-password"
},
{
"name": "LAKEHOUSE_SECRET_KEY",
"secretRef": "lakehouse-secret-key"
}
],
"probes": [
3 changes: 3 additions & 0 deletions infra/main.parameters.json
@@ -31,6 +31,9 @@
},
"ducklakeDataPath": {
"value": "${DUCKLAKE_DATA_PATH}"
},
"lakehouseSecretKey": {
"value": "${LAKEHOUSE_SECRET_KEY}"
}
}
}
12 changes: 12 additions & 0 deletions infra/modules/container-app.bicep
@@ -16,6 +16,9 @@ param acrLoginServer string
@description('Key Vault secret URI for the Flight SQL password.')
param lakehousePasswordSecretUri string

@description('Key Vault secret URI for the Flight SQL HMAC/JWT signing key.')
param lakehouseSecretKeySecretUri string

resource logAnalyticsWorkspace 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
name: 'law-${containerAppsEnvironmentName}'
location: location
@@ -68,6 +71,11 @@ resource containerApp 'Microsoft.App/containerApps@2024-03-01' = {
keyVaultUrl: lakehousePasswordSecretUri
identity: containerAppIdentityId
}
{
name: 'lakehouse-secret-key'
keyVaultUrl: lakehouseSecretKeySecretUri
identity: containerAppIdentityId
}
]
registries: [
{
@@ -119,6 +127,10 @@ resource containerApp 'Microsoft.App/containerApps@2024-03-01' = {
name: 'LAKEHOUSE_PASSWORD'
secretRef: 'lakehouse-password'
}
{
name: 'LAKEHOUSE_SECRET_KEY'
secretRef: 'lakehouse-secret-key'
}
]
probes: [
{
21 changes: 21 additions & 0 deletions infra/modules/keyvault.bicep
@@ -14,6 +14,10 @@ param deployerPrincipalId string = ''
@description('Flight SQL password to store in the vault.')
param lakehousePassword string

@secure()
@description('Flight SQL HMAC/JWT signing key to store in the vault.')
param lakehouseSecretKey string

resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' = {
name: keyVaultName
location: location
@@ -76,9 +80,26 @@ resource lakehousePasswordSecret 'Microsoft.KeyVault/vaults/secrets@2023-07-01'
}
}

// ── Secret: lakehouse-secret-key ────────────────────────────────────────
resource lakehouseSecretKeySecret 'Microsoft.KeyVault/vaults/secrets@2023-07-01' = {
parent: keyVault
name: 'lakehouse-secret-key'
properties: {
value: lakehouseSecretKey
contentType: 'text/plain'
attributes: {
enabled: true
}
}
}

output keyVaultName string = keyVault.name
output keyVaultUri string = keyVault.properties.vaultUri

// URI only (no secret value) — safe to expose for Container App Key Vault references.
#disable-next-line outputs-should-not-contain-secrets
output lakehousePasswordSecretUri string = lakehousePasswordSecret.properties.secretUri

// URI only (no secret value) — safe to expose for Container App Key Vault references.
#disable-next-line outputs-should-not-contain-secrets
output lakehouseSecretKeySecretUri string = lakehouseSecretKeySecret.properties.secretUri