feat: Allow adding unique constraints to existing tables#4465
Ludv1gL wants to merge 1 commit into clockworklabs:master
Conversation
Interesting, thank you for filing this! We'll work on getting it reviewed.
Centril left a comment
This needs some work to actually implement the feature.
crates/core/src/db/update.rs
Outdated
```rust
) => {
    let table_def = plan.new.stored_in_table_def(constraint_name).unwrap();
    let constraint_def = &table_def.constraints[constraint_name];
    let cols = constraint_def.data.unique_columns().unwrap();
```
Replace with an `expect` here, please.
This entire block will be removed along with the `HashMap`-based duplicate detection. #4542 detects duplicates directly in `create_constraint()` via `iter_duplicates()`.
crates/core/src/db/update.rs
Outdated
```rust
// Scan all rows, count occurrences of each projected value
let mut value_counts: std::collections::HashMap<AlgebraicValue, usize> =
    std::collections::HashMap::new();
for row in stdb.iter_mut(tx, table_id)? {
    let val = row.project(&col_list)?;
    *value_counts.entry(val).or_insert(0) += 1;
}
```
This seems rather inefficient. Every `TableIndex` already knows which keys have duplicates through `SameKeyEntry`.
I'd like to see the implementation do this instead:
- Find the `index_id`s affected.
- Ask each affected index for the list of duplicates via a new method `TableIndex::iter_duplicates(&self) -> impl Iterator<Item = (AlgebraicValue, usize)>` where `count > 1`. This should forward to `Index::iter_duplicates` for each index impl. Unique indices can return an empty iterator.
- To `SameKeyEntry` add a method `fn count(&self) -> usize` which is used for `iter_duplicates` above.
- You then need to merge together the results of the different indices by merging the `(AlgebraicValue, usize)` entries for each index.
- Proceed with the rest of the implementation as-is.
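To make the proposed shape concrete, here is a toy sketch of `iter_duplicates` plus the merging step. These are not the real SpacetimeDB types: keys stand in for `AlgebraicValue` as strings, and each non-unique index is modeled as a key-to-count map.

```rust
use std::collections::BTreeMap;

// Illustrative stand-in for an `AlgebraicValue` key.
type Key = String;

/// Yield (key, count) for every key stored more than once in one index.
fn iter_duplicates(index: &BTreeMap<Key, usize>) -> impl Iterator<Item = (Key, usize)> + '_ {
    index
        .iter()
        .filter(|(_, n)| **n > 1)
        .map(|(k, n)| (k.clone(), *n))
}

/// Merge duplicate reports from several indices into one map, summing
/// counts for keys reported by more than one index (one possible merge
/// policy; the review comment leaves the exact semantics open).
fn merge_duplicates<'a>(
    indices: impl IntoIterator<Item = &'a BTreeMap<Key, usize>>,
) -> BTreeMap<Key, usize> {
    let mut merged = BTreeMap::new();
    for idx in indices {
        for (k, n) in iter_duplicates(idx) {
            *merged.entry(k).or_insert(0) += n;
        }
    }
    merged
}

fn main() {
    let mut a = BTreeMap::new();
    a.insert("apple".to_string(), 2); // duplicated in index a
    a.insert("pear".to_string(), 1); // unique, not reported
    let mut b = BTreeMap::new();
    b.insert("apple".to_string(), 3); // duplicated in index b too
    let merged = merge_duplicates([&a, &b]);
    println!("{merged:?}");
}
```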
Done in #4542:

- `SameKeyEntry::count()` added in `same_key_entry.rs`
- `iter_duplicates()` added to both `MultiMap` and `HashIndex`: iterates entries where `count() > 1`
- `TypedIndex::iter_duplicates()` in `mod.rs` matches on all non-unique variants, calls the inner `iter_duplicates()`, and converts typed keys to `AlgebraicValue`. Unique variants return an empty vec.
- `TableIndex::iter_duplicates()` is a thin wrapper delegating to `self.idx`
- The `CheckAddUniqueConstraintValid` precheck and its `HashMap` scanning handler in `update.rs` are both removed; duplicate detection now happens inside `create_constraint()` using the index's own `iter_duplicates()`.
crates/core/src/db/update.rs
Outdated
```rust
let table_id = stdb.table_id_from_name_mut(tx, &table_def.name)?.unwrap();

// Scan all rows, count occurrences of each projected value
let mut value_counts: std::collections::HashMap<AlgebraicValue, usize> =
```
Please use `spacetimedb_data_structures::map::HashMap` instead, with an unqualified import.
All `HashMap`-based detection is removed, so this point is moot.
This needs to be addressed here. In fact, I don't think rollback is being handled correctly for `ConstraintAdded` in `rollback_pending_schema_change`.
I'd also like to see transactionality tests (`test_create_constraint_is_transactional`) for `create_constraint` and `drop_constraint` on a single-col and multi-col index.
The NOTE here also doesn't seem to be true, as I don't see the index being recreated unless the algorithm changes, which it doesn't if only the uniqueness changes.
Rollback: PendingSchemaChange::ConstraintAdded now carries IndexId and Option<PointerMap> for proper rollback. On rollback, committed_state.rs calls make_non_unique() on the affected index and restores the pointer map if one was taken. Similarly, ConstraintRemoved rollback calls make_unique() to re-enforce uniqueness.
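A minimal sketch of this rollback shape, with illustrative stand-ins for the real types (the actual `PendingSchemaChange::ConstraintAdded` also carries table and constraint IDs, and `PointerMap` is a real structure, not the dummy below):

```rust
// Dummy pointer map; the real one maps row hashes to row pointers.
struct PointerMap(Vec<u64>);

struct TableIndex {
    unique: bool,
}

impl TableIndex {
    fn make_unique(&mut self) {
        self.unique = true;
    }
    fn make_non_unique(&mut self) {
        self.unique = false;
    }
}

struct Table {
    indexes: Vec<TableIndex>,
    pointer_map: Option<PointerMap>,
}

enum PendingSchemaChange {
    ConstraintAdded { index_id: usize, pointer_map: Option<PointerMap> },
    ConstraintRemoved { index_id: usize },
}

fn rollback(table: &mut Table, change: PendingSchemaChange) {
    match change {
        PendingSchemaChange::ConstraintAdded { index_id, pointer_map } => {
            // Undo the uniqueness upgrade on the affected index...
            table.indexes[index_id].make_non_unique();
            // ...and put the pointer map back if one was taken when this
            // constraint created the first unique index.
            if let Some(pm) = pointer_map {
                table.pointer_map = Some(pm);
            }
        }
        PendingSchemaChange::ConstraintRemoved { index_id } => {
            // Re-enforce uniqueness on the index the dropped constraint backed.
            table.indexes[index_id].make_unique();
        }
    }
}

fn main() {
    let mut table = Table {
        indexes: vec![TableIndex { unique: true }],
        pointer_map: None, // was taken when the index became unique
    };
    let change = PendingSchemaChange::ConstraintAdded {
        index_id: 0,
        pointer_map: Some(PointerMap(vec![1, 2, 3])),
    };
    rollback(&mut table, change);
    assert!(!table.indexes[0].unique);
    assert!(table.pointer_map.is_some());
}
```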
Transactionality tests: new tests in datastore.rs:

- `test_create_constraint_is_transactional` (single-col): creates non-unique index -> inserts unique rows -> `create_constraint` -> verifies index is unique -> rollback -> verifies index is non-unique again
- `test_create_constraint_is_transactional` (multi-col): same with a multi-column index
- `test_create_constraint_fails_with_duplicates`: inserts duplicate rows -> verifies `create_constraint` returns an error -> verifies table unchanged
- `test_drop_constraint_is_transactional` (single-col + multi-col): creates unique constraint -> `drop_constraint` -> rollback -> verifies constraint still enforced
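The begin/modify/rollback flow those tests exercise can be sketched with toy types (none of this is the real datastore API; the `Tx` snapshot stands in for the pending-schema-change machinery):

```rust
#[derive(Clone, PartialEq, Debug)]
struct Index {
    unique: bool,
}

struct Tx {
    before: Index, // snapshot to restore on rollback
}

fn begin(idx: &Index) -> Tx {
    Tx { before: idx.clone() }
}

fn create_constraint(idx: &mut Index) {
    idx.unique = true; // stands in for make_unique on the backing index
}

fn rollback(idx: &mut Index, tx: Tx) {
    *idx = tx.before;
}

fn main() {
    let mut idx = Index { unique: false };
    let tx = begin(&idx);
    create_constraint(&mut idx);
    assert!(idx.unique); // uniqueness visible inside the tx
    rollback(&mut idx, tx);
    assert!(!idx.unique); // non-unique again after rollback
}
```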
The NOTE mentioned above has been removed, as `make_unique()` just converts the existing non-unique index in place to a unique one.
crates/schema/src/auto_migrate.rs
Outdated
```rust
    result,
    AutoMigrateError::AddUniqueConstraint { constraint } => &constraint[..] == apples_name_unique_constraint
);
// Note: AddUniqueConstraint is no longer an error - adding unique constraints
```
Suggested change:

```diff
-// Note: AddUniqueConstraint is no longer an error - adding unique constraints
+// Note: `AddUniqueConstraint` is no longer an error - adding unique constraints
```
```diff
 /// - The constraint metadata is inserted into the system tables (and other data structures reflecting them).
 /// - The returned ID is unique and is not `constraintId::SENTINEL`.
-fn create_constraint(&mut self, mut constraint: ConstraintSchema) -> Result<ConstraintId> {
+pub fn create_constraint(&mut self, mut constraint: ConstraintSchema) -> Result<ConstraintId> {
```
Note that this doesn't actually change the in-memory table to use a unique index instead of a non-unique index, so you are not achieving the feature described.
Instead, you are only inserting a row into the system tables.
I would rename this to `create_st_constraint`, and the drop one as well. `create_table` will use the renamed method. Then you can add `create_constraint`, which will call `create_st_constraint`, and also find the commit index and tx index and call a new method `TableIndex::make_unique(&mut self) -> Result<(), …>` on both, as well as checking `commit_index.can_merge(&tx_index, is_deleted)`.
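A toy model of the proposed `can_merge` check, under the assumption that it verifies the tx-side index can merge into the commit-side index without a key collision, ignoring keys whose committed row was deleted in this transaction. The names, key type, and exact semantics are assumptions for illustration, not the real API:

```rust
use std::collections::BTreeSet;

/// Can the tx index merge into the commit index without duplicating a key?
/// A collision is tolerated only if the committed row holding that key was
/// deleted within this transaction.
fn can_merge(
    commit_keys: &BTreeSet<&str>,
    tx_keys: &BTreeSet<&str>,
    deleted: &BTreeSet<&str>,
) -> bool {
    tx_keys
        .iter()
        .all(|k| !commit_keys.contains(k) || deleted.contains(k))
}

fn main() {
    let commit = BTreeSet::from(["a", "b"]);
    let tx = BTreeSet::from(["b", "c"]);
    let deleted = BTreeSet::from(["b"]);
    // "b" collides, but its committed row is deleted in this tx: merge is fine.
    assert!(can_merge(&commit, &tx, &deleted));
    // Without the deletion, the merge would violate uniqueness.
    assert!(!can_merge(&commit, &tx, &BTreeSet::new()));
}
```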
Oooh I assumed adding it to the system table would be enough and that it would hot reload from there on publish like on a module restart. So I guess my previous code worked on restarting the module. But if I had managed to insert duplicate indices before restarting, horrible_things() might have happened.
Anyhow, in #4542:
Renames:

- `create_constraint()` → `create_st_constraint()` (metadata-only, used by `create_table`)
- `drop_constraint()` → `drop_st_constraint()` (metadata-only)
New `create_constraint()`:

- Calls `create_st_constraint()` for system table metadata
- Finds the index by constraint columns via `find_index_for_constraint()`
- Checks for duplicates via `iter_duplicates()` (returns an error with details if any are found)
- Calls `make_unique()` on both commit-table and tx-table indices
- Calls `can_merge()` between commit and tx indices
- Takes the pointer map from the table if this becomes the first unique index (pointer map invariant: present iff no unique index exists)
- Pushes an enriched `PendingSchemaChange::ConstraintAdded { table_id, constraint_id, index_id, pointer_map }` for rollback
New `drop_constraint()`: the reverse; calls `make_non_unique()` and restores the pointer map if no unique indices remain.
Index conversion infrastructure:
- `MultiMap::check_and_into_unique()` / `HashIndex::check_and_into_unique()`: validates that there are no duplicates, converts to a unique map
- `UniqueMap::from_non_unique()` / `UniqueHashIndex::from_non_unique()`: constructors from non-unique data
- `TypedIndex::make_unique()`: matches all 36 non-unique variants to the corresponding unique variant using `mem::replace`
- `TypedIndex::make_non_unique()`: the reverse, for rollback
- The macro `define_uniqueness_conversions!` generates both directions for all variant pairs
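A two-variant sketch of the `mem::replace`-based in-place upgrade (the real `TypedIndex` has many key-type variants generated by the macro; the maps here are simplified stand-ins, and duplicates are assumed to have been ruled out beforehand, e.g. via `iter_duplicates`):

```rust
use std::collections::BTreeMap;
use std::mem;

enum Index {
    // key -> all rows with that key
    NonUnique(BTreeMap<u32, Vec<u64>>),
    // key -> the single row with that key
    Unique(BTreeMap<u32, u64>),
}

impl Index {
    /// Upgrade in place. The caller must have verified there are no
    /// duplicate keys, so the conversion cannot fail here.
    fn make_unique(&mut self) {
        if matches!(self, Index::NonUnique(_)) {
            // Move the old map out, leaving a cheap placeholder behind.
            let old = mem::replace(self, Index::Unique(BTreeMap::new()));
            let Index::NonUnique(multi) = old else { unreachable!() };
            let unique = multi
                .into_iter()
                .map(|(k, mut rows)| {
                    assert_eq!(rows.len(), 1, "duplicates must be ruled out first");
                    (k, rows.pop().unwrap())
                })
                .collect();
            *self = Index::Unique(unique);
        }
    }
}

fn main() {
    let mut idx = Index::NonUnique(BTreeMap::from([(1, vec![10]), (2, vec![20])]));
    idx.make_unique();
    assert!(matches!(&idx, Index::Unique(m) if m.len() == 2));
}
```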
Addresses review feedback on PR clockworklabs#4465. The previous implementation only inserted metadata into system tables but never converted the in-memory index from non-unique to unique.

Changes:

- Add `SameKeyEntry::count()`, `iter_duplicates()`, and `check_and_into_unique()` on `MultiMap`/`HashIndex` for duplicate detection via existing index infra
- Add `from_non_unique()`/`into_non_unique()` on `UniqueMap`/`UniqueHashIndex` for lossless conversion between unique and non-unique index types
- Add `TypedIndex::make_unique()`/`make_non_unique()`/`iter_duplicates()` via the `define_uniqueness_conversions!` macro covering all 36 variant pairs
- Add `Table::take_pointer_map()`/`restore_pointer_map()`/`has_unique_index()`
- Rename `create_constraint` -> `create_st_constraint` (metadata-only, used by `create_table`); add a new `create_constraint` that calls `create_st_constraint`, then makes the index unique on both tx and commit tables with a `can_merge` check
- Same pattern for `drop_constraint`/`drop_st_constraint`
- Enrich `PendingSchemaChange::ConstraintAdded` with `IndexId` and `PointerMap` for correct rollback (`make_non_unique` + restore pointer map)
- Remove the `CheckAddUniqueConstraintValid` precheck (duplicate detection now happens inside `create_constraint` using the index directly)
- Add `MutTxDatastore::create_constraint_mut_tx` and a `RelationalDB` wrapper
- Add an `AddConstraint` formatter and remove the dead `AddUniqueConstraint` error
- Add 6 transactionality tests for create/drop constraint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from 9b7d651 to 4c2518c
Summary
Adding `#[unique]` or `#[primary_key]` to an existing column currently triggers `AutoMigrateError::AddUniqueConstraint`, forcing a full database clear to apply the schema change. This PR makes it a non-breaking migration by validating existing data first.

Changes

- `auto_migrate.rs`: Replace the hard `AddUniqueConstraint` error with a `CheckAddUniqueConstraintValid` precheck + `AddConstraint` migration step
- `update.rs`: Implement the precheck (full table scan, project constrained columns, count duplicates) and `AddConstraint` step execution
- `relational_db.rs`: Expose `create_constraint()` (counterpart to the existing `drop_constraint()`)
- `traits.rs`/`datastore.rs`: Add `create_constraint_mut_tx` to the `MutTxDatastore` trait
- `mut_tx.rs`: Make `create_constraint` public
- `formatter.rs`: Format the new `AddConstraint` step

Safety

- Validation and application happen in the same `MutTx`: no window for concurrent duplicate inserts
- `auto_migrate_indexes()` already handles adding the backing btree index (with `is_unique = true` from the new schema). The constraint step only adds metadata.

Example error output
Test plan

- `auto_migrate` tests pass
- `cargo check` passes for `spacetimedb-schema` and `spacetimedb-core`
- `AddUniqueConstraint` error test is updated
- Adding `#[unique]` to an existing column with clean data → succeeds
- Adding `#[unique]` to an existing column with duplicates → fails with a detailed error

🤖 Generated with Claude Code