feat: lexical groups (per-namespace morph + abbrev) #62

Merged: gladius merged 1 commit into main from feat/lexical-groups on May 11, 2026

gladius (Owner) commented May 11, 2026

Per-namespace, per-language token normalization that runs at tokenize time.
Two kinds: morph (inflection variants of one root) and abbrev (short forms of a phrase). Distinct from synonyms — only collapses tokens with the same surface meaning, so it doesn't pollute sibling intents the way the old L1 graph did.
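
To make the two kinds concrete, here is a minimal Python sketch of tokenize-time normalization. It is not the code in src/lexical.rs: `ToyLexicalGroup`, its field names, `build_index`, and `normalize` are hypothetical, and the expansion chosen for `rbi` is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyLexicalGroup:
    """Hypothetical stand-in; the real LexicalGroup fields are not shown in this PR."""
    kind: str        # "morph" or "abbrev"
    canonical: str   # root form (morph) or expanded phrase (abbrev)
    variants: tuple  # surface forms that collapse to `canonical`


def build_index(groups):
    """Map every variant token to the token list of its canonical form."""
    index = {}
    for group in groups:
        for variant in group.variants:
            index[variant.lower()] = group.canonical.lower().split()
    return index


def normalize(text, index):
    """Whitespace-tokenize, then replace each known variant with its canonical tokens."""
    out = []
    for tok in text.lower().split():
        out.extend(index.get(tok, [tok]))
    return out


groups = [
    ToyLexicalGroup("morph", "scrape", ("scraping", "scraped", "scrapes")),
    # Assumed expansion, for illustration only.
    ToyLexicalGroup("abbrev", "remote biometric identification", ("rbi",)),
]
index = build_index(groups)

# The same normalization runs on seeds (index time) and on queries (resolve time),
# so both sides land on identical tokens.
seed_tokens = normalize("scrape faces for remote biometric identification", index)
query_tokens = normalize("Scraping faces for RBI", index)
```

Because only listed variants are rewritten, unrelated tokens pass through untouched, which is the property that separates this from a synonym graph.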

Surface

  • Library: microresolve::{LexicalGroup, LexicalKind} re-exported from crate root. NamespaceHandle::{list, add, remove, update}_lexical_group.
  • Server: GET/POST /api/lexical-groups, DELETE/PATCH /api/lexical-groups/{idx}, POST /api/lexical-groups/suggest (LLM-proposed, operator-approved). Audit-chain integrated.
  • Studio UI: new "Lexicon" page under Build, Inflections / Abbreviations tabs, manual add + LLM Suggest with per-proposal approve/reject.
  • Python bindings: LexicalGroup class + 4 namespace methods.
  • Node bindings: LexicalGroup interface + 4 namespace methods (camelCase), index.d.ts regenerated.

Validation

Head-to-head ablation on the EU AI Act pack (100 prohibited + 80 benign, the benign set split into 50 generic + 30 adjacent-legal cases carved out by the Feb 2025 Commission guidelines):

thr=1.5  baseline:  P=0.779  R=0.727  F1=0.742
thr=1.5  with lex:  P=0.787  R=0.764  F1=0.766   (+2.3pp F1, +3.7pp R)

Improvement holds across full threshold sweep (0.8 → 2.5). Zero regression on CLINC150 + BANKING77. EU AI Act pack ships 10 morph groups + 3 abbreviations (rbi, ncii, csam).
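
For readers who want to reproduce a sweep of this shape, the following is a rough sketch of computing precision, recall, and F1 at several thresholds. It is not benchmarks/eu_ai_act_eval.py: `prf_at_threshold` and the `toy` scores are hypothetical, and it assumes a plain binary "flag when score >= threshold" rule. The PR does not say how its reported metrics are aggregated, so the real harness may average differently.

```python
def prf_at_threshold(scored, thr):
    """scored: list of (score, is_prohibited) pairs. An item is flagged when score >= thr."""
    tp = sum(1 for s, y in scored if s >= thr and y)
    fp = sum(1 for s, y in scored if s >= thr and not y)
    fn = sum(1 for s, y in scored if s < thr and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Synthetic scores, not taken from the EU AI Act pack.
toy = [(2.1, True), (1.7, True), (1.2, True), (1.6, False), (0.4, False)]

for thr in (0.8, 1.5, 2.5):
    p, r, f1 = prf_at_threshold(toy, thr)
    print(f"thr={thr}  P={p:.3f}  R={r:.3f}  F1={f1:.3f}")
```

Running the same function over scores produced with and without lexical groups gives the head-to-head comparison at each threshold.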

Also lands

  • Post-Omnibus EU AI Act pack refresh (6 new + 6 modified intent JSONs)
  • benchmarks/eu_ai_act_eval.py ablation harness
  • benchmarks/test_lexical_groups.py smoke test
  • status: experimental tag on hipaa-triage / mcp-tools-generic / safety-filter

🤖 Generated with Claude Code

… UI, bindings

Per-namespace, per-language token normalization that runs at tokenize
time (both index-time on seeds and query-time on resolves). Two kinds:
morph (inflection variants of one root) and abbrev (short forms of a
phrase). Distinct from synonyms by design — only collapses tokens with
the same surface meaning, so it doesn't pollute sibling intents the way
the old L1 graph did.

Library
- microresolve::{LexicalGroup, LexicalKind} re-exported from crate root
- NamespaceHandle: list_lexical_groups, add_, remove_, update_
- src/lexical.rs (LexicalIndex, normalize_in_place) wired into all
  tokenize call-sites in scoring + resolver_*

Server
- GET/POST /api/lexical-groups
- DELETE/PATCH /api/lexical-groups/{idx}
- POST /api/lexical-groups/suggest (operator-triggered LLM proposals
  grounded in namespace vocab + intent descriptions; nothing applies
  until approved)
- Every mutation lands in the per-key audit chain
  (lexical_group.add / .remove / .update)
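
The "nothing applies until approved" flow can be modeled in memory as below. This is an illustrative Python sketch, not the server implementation: `ToyLexicon` and the proposal dict shape are hypothetical, the audit log is a plain list rather than the per-key audit chain, and whether rejections are audited is not stated in this commit, so the sketch leaves them unlogged.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLexicon:
    """Hypothetical model of the suggest -> approve/reject flow."""
    groups: list = field(default_factory=list)   # groups that are actually applied
    pending: dict = field(default_factory=dict)  # proposal id -> proposed group
    audit: list = field(default_factory=list)    # append-only event log
    _next_id: int = 0

    def suggest(self, proposals):
        """Stage proposals (e.g. from an LLM). Nothing is applied at this point."""
        ids = []
        for proposal in proposals:
            self.pending[self._next_id] = proposal
            ids.append(self._next_id)
            self._next_id += 1
        return ids

    def approve(self, pid):
        """Apply one staged proposal and record the mutation."""
        group = self.pending.pop(pid)
        self.groups.append(group)
        self.audit.append({"event": "lexical_group.add", "group": group})

    def reject(self, pid):
        """Discard one staged proposal without touching the applied groups."""
        self.pending.pop(pid)


lex = ToyLexicon()
first, second = lex.suggest([
    {"kind": "morph", "variants": ["scrape", "scraping"]},
    {"kind": "abbrev", "variants": ["rbi"]},
])
lex.approve(first)
lex.reject(second)
```

Keeping staged proposals separate from applied groups means a bad suggestion can never affect resolution unless an operator explicitly approves it.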

Studio UI
- New "Lexicon" page under Build with Inflections / Abbreviations tabs
- Manual add form + LLM Suggest panel with per-proposal approve/reject
- /lexical route + nav entry; existing /intents page untouched

Python bindings
- LexicalGroup pyclass + 4 namespace methods on Namespace

Node bindings
- LexicalGroup interface + 4 namespace methods (camelCase per napi);
  index.d.ts regenerated

Persistence
- Resolver.lexical_groups is the source of truth, persisted in
  _ns.json. Save now reads existing _ns.json first to preserve unknown
  fields (compliance_frameworks, policy_overrides) so save/load is
  round-trip safe across config the engine doesn't actively model.
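
A read-modify-write save of this kind can be sketched in Python as follows. This is not the engine's save path: `save_lexical_groups` is hypothetical, the `lexical_groups` JSON key and the sample field values are assumptions, and the temp-file-plus-rename step is an addition of this sketch rather than something the commit describes.

```python
import json
import os
import tempfile


def save_lexical_groups(path, groups):
    """Rewrite only the `lexical_groups` key, keeping every other field as found."""
    doc = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            doc = json.load(f)  # includes fields this code does not model
    doc["lexical_groups"] = groups
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(doc, f, indent=2)
    os.replace(tmp, path)  # swap the new file into place


# Start from a config that already carries fields unknown to this code.
workdir = tempfile.mkdtemp()
ns_path = os.path.join(workdir, "_ns.json")
with open(ns_path, "w", encoding="utf-8") as f:
    json.dump({"compliance_frameworks": ["eu-ai-act"], "policy_overrides": {}}, f)

save_lexical_groups(ns_path, [{"kind": "abbrev", "variants": ["rbi"]}])

with open(ns_path, encoding="utf-8") as f:
    saved = json.load(f)
```

Loading the whole document first, instead of serializing only the modeled fields, is what keeps `compliance_frameworks` and `policy_overrides` intact across a save.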

EU AI Act pack
- Post-Omnibus refresh pulled forward from the tracing-explore branch:
  6 new intent JSONs (ai_generated_csam, legitimate_use, ncii_adult,
  realtime_remote_biometric_id, untargeted_facial_scraping) + 6
  refreshed (biometric_categorization, emotion_recognition_workplace,
  exploitation_vulnerability, predictive_policing, social_scoring,
  subliminal_manipulation)
- Pack ships 10 morph groups + 3 abbreviations (rbi, ncii, csam)
- benchmarks/eu_ai_act_eval.py: 100 prohibited + 80 benign harness
  (50 generic + 30 adjacent-legal carved out by Feb 2025 Commission
  guidelines)

Empirical (head-to-head re-run, lexical_groups removed vs present):
  thr=1.5  baseline:  P=0.779  R=0.727  F1=0.742
  thr=1.5  with lex:  P=0.787  R=0.764  F1=0.766   (+2.3pp F1, +3.7pp R)
Improvement holds across the full threshold sweep (0.8 → 2.5).
Zero regression on CLINC150 + BANKING77.

Other
- 'status: experimental' tag on hipaa-triage, mcp-tools-generic,
  safety-filter pack metadata
- .gitignore: ignore _skills/ (Claude Code local skills)
- benchmarks/test_lexical_groups.py: smoke test (4 query pairs)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
gladius force-pushed the feat/lexical-groups branch from f388c8e to 2352f0d on May 11, 2026 02:17
gladius merged commit ca08e11 into main on May 11, 2026
5 checks passed
gladius deleted the feat/lexical-groups branch on May 11, 2026 02:20