
feat: Allow adding unique constraints to existing tables #4465

Open
Ludv1gL wants to merge 1 commit into clockworklabs:master from Ludv1gL:feat/allow-add-unique-constraint

Conversation

@Ludv1gL (Contributor) commented Feb 26, 2026

Summary

Adding #[unique] or #[primary_key] to an existing column currently triggers AutoMigrateError::AddUniqueConstraint, forcing a full database clear to apply the schema change. This PR makes it a non-breaking migration by validating existing data first:

  • If all values are unique: constraint is added seamlessly (non-breaking migration)
  • If duplicates exist: migration fails with a detailed error listing up to 10 duplicate groups
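
The validation behavior described above can be sketched as follows. This is a standalone illustration, not the PR's code: `find_duplicate_groups` and its signature are hypothetical, showing only the count-then-report-up-to-10-groups logic.

```rust
use std::collections::HashMap;

/// Hypothetical sketch of the precheck: count each projected value,
/// then report up to `limit` groups that appear more than once.
fn find_duplicate_groups(values: &[&str], limit: usize) -> Vec<(String, usize)> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for &v in values {
        *counts.entry(v).or_insert(0) += 1;
    }
    let mut dups: Vec<(String, usize)> = counts
        .into_iter()
        .filter(|&(_, n)| n > 1)
        .map(|(v, n)| (v.to_string(), n))
        .collect();
    dups.sort(); // deterministic ordering for error messages
    dups.truncate(limit);
    dups
}

fn main() {
    let emails = ["alice@example.com", "bob@example.com", "alice@example.com"];
    // One duplicate group: alice@example.com appears 2 times.
    println!("{:?}", find_duplicate_groups(&emails, 10));
}
```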

Changes

  • auto_migrate.rs: Replace hard AddUniqueConstraint error with CheckAddUniqueConstraintValid precheck + AddConstraint migration step
  • update.rs: Implement precheck (full table scan, project constrained columns, count duplicates) and AddConstraint step execution
  • relational_db.rs: Expose create_constraint() (counterpart to existing drop_constraint())
  • traits.rs / datastore.rs: Add create_constraint_mut_tx to MutTxDatastore trait
  • mut_tx.rs: Make create_constraint public
  • formatter.rs: Format the new AddConstraint step

Safety

  • Transaction safety: Precheck and constraint creation run in the same MutTx — no window for concurrent duplicate inserts
  • Index creation: auto_migrate_indexes() already handles adding the backing btree index (with is_unique=true from the new schema). The constraint step only adds metadata.
  • Rollback: If the precheck finds duplicates, the entire migration aborts before any changes are applied
  • Error quality: Duplicate error shows table name, column names, and up to 10 example duplicate values with counts

Example error output

Precheck failed: cannot add unique constraint 'Users_email_key' on table 'Users' column(s) [email]:
3 duplicate group(s) found.
  - String("alice@example.com") appears 2 times
  - String("bob@example.com") appears 3 times
  - String("charlie@example.com") appears 2 times

Test plan

  • All 12 auto_migrate tests pass
  • cargo check passes for spacetimedb-schema and spacetimedb-core
  • Updated the test that previously expected the AddUniqueConstraint error
  • Manual test: add #[unique] to existing column with clean data → succeeds
  • Manual test: add #[unique] to existing column with duplicates → fails with detailed error

🤖 Generated with Claude Code

@Centril Centril self-requested a review February 26, 2026 13:34
@bfops (Collaborator) commented Feb 26, 2026

Interesting, thank you for filing this! We'll work on getting it reviewed.

@Centril (Contributor) left a comment:

This needs some work to actually implement the feature.

) => {
let table_def = plan.new.stored_in_table_def(constraint_name).unwrap();
let constraint_def = &table_def.constraints[constraint_name];
let cols = constraint_def.data.unique_columns().unwrap();
@Centril (Contributor):

Replace with an expect here please

@Ludv1gL (Contributor Author):

This entire block will be removed along with the HashMap-based duplicate detection. #4542 detects duplicates directly in create_constraint() via iter_duplicates().

Comment on lines +146 to +152
// Scan all rows, count occurrences of each projected value
let mut value_counts: std::collections::HashMap<AlgebraicValue, usize> =
std::collections::HashMap::new();
for row in stdb.iter_mut(tx, table_id)? {
let val = row.project(&col_list)?;
*value_counts.entry(val).or_insert(0) += 1;
}
@Centril (Contributor):

This seems rather inefficient. Every TableIndex already knows which keys have duplicates through SameKeyEntry.

I'd like to see the implementation do this instead:

  1. Find the index_ids affected.
  2. Ask each affected index for the list of duplicates via a new method TableIndex::iter_duplicates(&self) -> impl Iterator<Item = (AlgebraicValue, usize)>, yielding only entries where count > 1. This should forward to Index::iter_duplicates for each index impl. Unique indices can return an empty iterator.
  3. To SameKeyEntry add a method fn count(&self) -> usize which is used for iter_duplicates above.
  4. You then need to merge together the results of different indices by merging together the (AlgebraicValue, usize) entries for each index.
  5. Proceed with the rest of the implementation as-is.
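
The shape of the suggestion above can be sketched with toy types. These stand-ins (`SameKeyEntry` as a plain `Vec` of row pointers, `MultiMap` over a `BTreeMap`) are hypothetical simplifications; the real SpacetimeDB types differ, but the `count()` / `iter_duplicates()` relationship is the same.

```rust
use std::collections::BTreeMap;

/// Toy stand-in: row pointers sharing one key.
struct SameKeyEntry(Vec<u64>);

impl SameKeyEntry {
    /// Step 3 of the suggestion: how many rows share this key.
    fn count(&self) -> usize {
        self.0.len()
    }
}

/// Toy stand-in for a non-unique index.
struct MultiMap<K: Ord>(BTreeMap<K, SameKeyEntry>);

impl<K: Ord + Clone> MultiMap<K> {
    fn new() -> Self {
        Self(BTreeMap::new())
    }
    fn insert(&mut self, key: K, row: u64) {
        self.0
            .entry(key)
            .or_insert_with(|| SameKeyEntry(Vec::new()))
            .0
            .push(row);
    }
    /// Step 2 of the suggestion: yield (key, count) for every key
    /// stored more than once.
    fn iter_duplicates(&self) -> impl Iterator<Item = (K, usize)> + '_ {
        self.0
            .iter()
            .filter(|(_, e)| e.count() > 1)
            .map(|(k, e)| (k.clone(), e.count()))
    }
}

fn main() {
    let mut idx = MultiMap::new();
    idx.insert("alice", 1);
    idx.insert("bob", 2);
    idx.insert("alice", 3);
    // "alice" appears twice, "bob" is filtered out.
    println!("{:?}", idx.iter_duplicates().collect::<Vec<_>>());
}
```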

@Ludv1gL (Contributor Author):

Done in #4542:

  1. SameKeyEntry::count() added in same_key_entry.rs
  2. iter_duplicates() added to both MultiMap and HashIndex — iterates entries where count() > 1
  3. TypedIndex::iter_duplicates() in mod.rs matches on all non-unique variants, calls the inner iter_duplicates(), and converts typed keys to AlgebraicValue. Unique variants return an empty vec.
  4. TableIndex::iter_duplicates() is a thin wrapper delegating to self.idx
  5. The CheckAddUniqueConstraintValid precheck and its HashMap scanning handler in update.rs are both removed — duplicate detection now happens inside create_constraint() using the index's own iter_duplicates().

let table_id = stdb.table_id_from_name_mut(tx, &table_def.name)?.unwrap();

// Scan all rows, count occurrences of each projected value
let mut value_counts: std::collections::HashMap<AlgebraicValue, usize> =
@Centril (Contributor):

Please use spacetimedb_data_structures::map::HashMap instead, with an unqualified import.

@Ludv1gL (Contributor Author) commented Mar 4, 2026:

All HashMap-based detection is removed, so the point is moot.

@Centril (Contributor):

This needs to be addressed here. In fact, I don't think rollback is being handled correctly for ConstraintAdded in rollback_pending_schema_change.

I'd also like to see transactionality tests (test_create_constraint_is_transactional) for create_constraint and drop_constraint on a single-col and multi-col index.

The NOTE here also doesn't seem to be true, as I don't see the index being recreated unless the algorithm changes, which it doesn't if only the uniqueness changes.

@Ludv1gL (Contributor Author):

Rollback: PendingSchemaChange::ConstraintAdded now carries IndexId and Option<PointerMap> for proper rollback. On rollback, committed_state.rs calls make_non_unique() on the affected index and restores the pointer map if one was taken. Similarly, ConstraintRemoved rollback calls make_unique() to re-enforce uniqueness.

Transactionality tests: new tests in datastore.rs:

  • test_create_constraint_is_transactional (single-col): creates non-unique index -> inserts unique rows -> create_constraint -> verifies index is unique -> rollback -> verifies index is non-unique again
  • test_create_constraint_is_transactional (multi-col): same with multi-column index
  • test_create_constraint_fails_with_duplicates: inserts duplicate rows -> verifies create_constraint returns error -> verifies table unchanged
  • test_drop_constraint_is_transactional (single-col + multi-col): creates unique constraint -> drop_constraint -> rollback -> verifies constraint still enforced

The NOTE comment mentioned above has been removed, as make_unique() just converts the existing non-unique index in place to a unique one.
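
The rollback bookkeeping described above can be illustrated with a minimal sketch. All types here are hypothetical simplifications: a toy index with an `is_unique` flag, and a pending-change stack that records just enough state to undo `make_unique()` on rollback.

```rust
/// Toy index: only the uniqueness flag matters for this sketch.
#[derive(Debug, PartialEq)]
struct Index {
    is_unique: bool,
}

/// Sketch of the enriched pending change: it records which index a
/// constraint made unique, so rollback can undo exactly that step.
enum PendingSchemaChange {
    ConstraintAdded { index_id: usize },
}

struct Tx {
    indexes: Vec<Index>,
    pending: Vec<PendingSchemaChange>,
}

impl Tx {
    fn create_constraint(&mut self, index_id: usize) {
        self.indexes[index_id].is_unique = true; // stands in for make_unique()
        self.pending.push(PendingSchemaChange::ConstraintAdded { index_id });
    }
    /// Undo pending schema changes in reverse order.
    fn rollback(&mut self) {
        while let Some(change) = self.pending.pop() {
            match change {
                PendingSchemaChange::ConstraintAdded { index_id } => {
                    // stands in for make_non_unique() + pointer-map restore
                    self.indexes[index_id].is_unique = false;
                }
            }
        }
    }
}

fn main() {
    let mut tx = Tx { indexes: vec![Index { is_unique: false }], pending: vec![] };
    tx.create_constraint(0);
    assert!(tx.indexes[0].is_unique);
    tx.rollback();
    assert!(!tx.indexes[0].is_unique);
    println!("rollback restored the non-unique index");
}
```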

result,
AutoMigrateError::AddUniqueConstraint { constraint } => &constraint[..] == apples_name_unique_constraint
);
// Note: AddUniqueConstraint is no longer an error - adding unique constraints
@Centril (Contributor):

Suggested change:
- // Note: AddUniqueConstraint is no longer an error - adding unique constraints
+ // Note: `AddUniqueConstraint` is no longer an error - adding unique constraints

@Ludv1gL (Contributor Author):

Fixed formatting

/// - The constraint metadata is inserted into the system tables (and other data structures reflecting them).
/// - The returned ID is unique and is not `ConstraintId::SENTINEL`.
fn create_constraint(&mut self, mut constraint: ConstraintSchema) -> Result<ConstraintId> {
pub fn create_constraint(&mut self, mut constraint: ConstraintSchema) -> Result<ConstraintId> {
@Centril (Contributor):

Note that this doesn't actually change the in-memory table to use a unique index instead of a non-unique index, so you are not actually achieving the feature described.
Instead, you are only inserting a row into the system tables.

I would rename this to create_st_constraint, and the drop one as well. create_table will use the renamed methods. Then you can add a create_constraint which calls create_st_constraint, finds the commit index and tx index, and calls a new method TableIndex::make_unique(&mut self) -> Result<(), …> on both, as well as checking commit_index.can_merge(&tx_index, is_deleted).

@Ludv1gL (Contributor Author):

Oooh I assumed adding it to the system table would be enough and that it would hot reload from there on publish like on a module restart. So I guess my previous code worked on restarting the module. But if I had managed to insert duplicate indices before restarting, horrible_things() might have happened.
Anyhow, in #4542:

Renames:

  • create_constraint() → create_st_constraint() (metadata-only, used by create_table)
  • drop_constraint() → drop_st_constraint() (metadata-only)

New create_constraint():

  1. Calls create_st_constraint() for system table metadata
  2. Finds the index by constraint columns via find_index_for_constraint()
  3. Checks for duplicates via iter_duplicates() (returns error with details if any found)
  4. Calls make_unique() on both commit-table and tx-table indices
  5. Calls can_merge() between commit and tx indices
  6. Takes pointer map from table if this becomes the first unique index (pointer map invariant: present iff no unique index exists)
  7. Pushes enriched PendingSchemaChange::ConstraintAdded { table_id, constraint_id, index_id, pointer_map } for rollback

New drop_constraint(): Reverse — calls make_non_unique(), restores pointer map if no unique indices remain.

Index conversion infrastructure:

  • MultiMap::check_and_into_unique() / HashIndex::check_and_into_unique() — validates no duplicates, converts to unique map
  • UniqueMap::from_non_unique() / UniqueHashIndex::from_non_unique() — constructors from non-unique data
  • TypedIndex::make_unique() — matches all 36 non-unique variants → corresponding unique variant using mem::replace
  • TypedIndex::make_non_unique() — reverse for rollback
  • Macro define_uniqueness_conversions! generates both directions for all variant pairs
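
The in-place conversion pattern described above can be sketched for a single variant pair. This is a hypothetical reduction: the real TypedIndex covers many key-type variants (hence the macro), but the mem::replace shape and the fail-without-changing guarantee are the same idea.

```rust
use std::collections::BTreeMap;
use std::mem;

/// Toy index with one non-unique/unique variant pair.
enum TypedIndex {
    NonUniqueU32(BTreeMap<u32, Vec<u64>>),
    UniqueU32(BTreeMap<u32, u64>),
}

impl TypedIndex {
    /// Swap the index out with mem::replace, verify there are no
    /// duplicate keys, and swap the unique form back in. On failure
    /// the original non-unique index is restored untouched.
    fn make_unique(&mut self) -> Result<(), String> {
        let old = mem::replace(self, TypedIndex::UniqueU32(BTreeMap::new()));
        match old {
            TypedIndex::NonUniqueU32(map) => {
                if map.values().any(|rows| rows.len() > 1) {
                    // Duplicates: put the original index back and report.
                    *self = TypedIndex::NonUniqueU32(map);
                    return Err("duplicate keys present".into());
                }
                let unique = map.into_iter().map(|(k, rows)| (k, rows[0])).collect();
                *self = TypedIndex::UniqueU32(unique);
                Ok(())
            }
            already_unique => {
                // Already unique: restore and succeed.
                *self = already_unique;
                Ok(())
            }
        }
    }
}

fn main() {
    let mut idx = TypedIndex::NonUniqueU32(BTreeMap::from([(1, vec![10]), (2, vec![20])]));
    assert!(idx.make_unique().is_ok());

    let mut dup = TypedIndex::NonUniqueU32(BTreeMap::from([(1, vec![10, 11])]));
    assert!(dup.make_unique().is_err());
    println!("conversion sketch ok");
}
```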

Addresses review feedback on PR clockworklabs#4465. The previous implementation only
inserted metadata into system tables but never converted the in-memory
index from non-unique to unique.

Changes:
- Add SameKeyEntry::count(), iter_duplicates(), and check_and_into_unique()
  on MultiMap/HashIndex for duplicate detection via existing index infra
- Add from_non_unique()/into_non_unique() on UniqueMap/UniqueHashIndex
  for lossless conversion between unique and non-unique index types
- Add TypedIndex::make_unique()/make_non_unique()/iter_duplicates() via
  define_uniqueness_conversions! macro covering all 36 variant pairs
- Add Table::take_pointer_map()/restore_pointer_map()/has_unique_index()
- Rename create_constraint -> create_st_constraint (metadata-only, used
  by create_table), add new create_constraint that calls create_st_constraint
  then makes index unique on both tx and commit tables with can_merge check
- Same pattern for drop_constraint/drop_st_constraint
- Enrich PendingSchemaChange::ConstraintAdded with IndexId and PointerMap
  for correct rollback (make_non_unique + restore pointer map)
- Remove CheckAddUniqueConstraintValid precheck (duplicate detection now
  happens inside create_constraint using the index directly)
- Add MutTxDatastore::create_constraint_mut_tx and RelationalDB wrapper
- Add AddConstraint formatter and remove dead AddUniqueConstraint error
- Add 6 transactionality tests for create/drop constraint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Ludv1gL Ludv1gL force-pushed the feat/allow-add-unique-constraint branch from 9b7d651 to 4c2518c Compare March 4, 2026 17:13