Feature: Time-decay and usage-weighted relevance scoring for search #603

@bm-clawd

Motivation

Currently all notes/entities are weighted equally in search results. At equal similarity, a stale observation from three weeks ago ranks the same as a correction that saved 30 minutes yesterday. As knowledge bases grow, retrieval quality degrades because there is no signal for which memories are actually useful.

Inspired by memelord's approach (Turso blog), which combines cosine similarity with time decay and usage weighting. The excerpt below shows the similarity and time-decay terms:

ORDER BY (1.0 - vector_distance_cos(embedding, vector32(?)))
  * POWER(0.995, julianday('now') - julianday(created_at))

Proposed Changes

Phase 1: Time-decay scoring (low effort, high impact)

  • Add a time-decay factor to search result ranking
  • Recent notes score higher than old ones at equal similarity
  • Configurable decay rate (default ~0.995/day, which gives a half-life of ~138 days)
  • This is purely a scoring change — no schema or storage changes needed
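
A minimal sketch of the Phase 1 scoring, assuming similarity is already available as a float where higher means more similar and each note carries a timezone-aware creation timestamp. The function names and parameters here are hypothetical and not part of the existing codebase; the formula mirrors the similarity × decay^age_in_days expression quoted above.

```python
import math
from datetime import datetime, timedelta, timezone
from typing import Optional


def half_life_days(decay_per_day: float) -> float:
    """Number of days until the decay factor drops to 0.5."""
    return math.log(0.5) / math.log(decay_per_day)


def time_decayed_score(
    similarity: float,
    created_at: datetime,
    now: Optional[datetime] = None,
    decay_per_day: float = 0.995,  # assumed default, taken from the proposal
) -> float:
    """Multiply similarity by decay_per_day raised to the note's age in days."""
    now = now or datetime.now(timezone.utc)
    # Clamp at zero so a timestamp slightly in the future never boosts a note.
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return similarity * decay_per_day ** age_days


# Two notes with equal similarity: the newer one should score higher.
reference = datetime(2024, 6, 1, tzinfo=timezone.utc)
fresh = time_decayed_score(0.8, reference - timedelta(days=1), now=reference)
stale = time_decayed_score(0.8, reference - timedelta(days=21), now=reference)
```

With the default rate, `half_life_days(0.995)` comes out a little above 138, which matches the half-life stated in the list. Because the factor is applied at ranking time, this needs only the creation timestamp the notes already have.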

Phase 2: Usage-weighted scoring (medium effort)

  • Track when a note/entity is retrieved via search or build_context
  • Track when retrieved notes are actually used in the conversation (accessed via read_note after appearing in search results)
  • Boost frequently-useful notes, decay ignored ones
  • Could be as simple as adding last_accessed_at and access_count fields to entities
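
One way the Phase 2 signal could be folded into ranking is sketched below. The `UsageStats` class, the `usage_boost` function, the logarithmic damping, and the `weight` parameter are all illustrative assumptions; only the `last_accessed_at` and `access_count` fields come from the proposal.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class UsageStats:
    """Hypothetical per-entity usage record."""
    access_count: int = 0
    last_accessed_at: Optional[datetime] = None

    def record_access(self, when: datetime) -> None:
        """Call when a note is read after appearing in search results."""
        self.access_count += 1
        self.last_accessed_at = when


def usage_boost(
    stats: UsageStats,
    now: datetime,
    weight: float = 0.1,           # assumed strength of the boost
    decay_per_day: float = 0.995,  # assumed: same rate as the time decay
) -> float:
    """Return a multiplier >= 1.0 that grows with use and fades when idle."""
    if stats.access_count == 0 or stats.last_accessed_at is None:
        return 1.0
    idle_days = max((now - stats.last_accessed_at).total_seconds() / 86400.0, 0.0)
    freshness = decay_per_day ** idle_days
    # log1p keeps very frequently read notes from dominating the ranking.
    return 1.0 + weight * math.log1p(stats.access_count) * freshness


# Example: a note read twice today gets a modest boost over an unread one.
today = datetime(2024, 6, 1, tzinfo=timezone.utc)
stats = UsageStats()
stats.record_access(today - timedelta(hours=2))
stats.record_access(today)
boost = usage_boost(stats, today)
```

The multiplier would be applied on top of the similarity and time-decay score. A never-accessed note keeps a multiplier of exactly 1.0, so new notes are not penalized for lacking history.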

Phase 3: Feedback loop (higher effort, ties into SPEC-28)

  • Allow agents to explicitly rate memory usefulness (upvote/downvote)
  • Automatic contradiction detection — when an agent stores a correction, the corrected memory gets downweighted
  • Self-cleaning: memories below a weight threshold get flagged for review/archival
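
The bookkeeping side of Phase 3 could look roughly like the sketch below. Detecting contradictions is out of scope here; the code only assumes that something upstream has already identified the corrected memory. `MemoryWeight`, the step size, the penalty, and the threshold are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class MemoryWeight:
    """Hypothetical feedback weight attached to a note."""
    note_id: str
    weight: float = 1.0


def apply_vote(mem: MemoryWeight, upvote: bool, step: float = 0.1) -> None:
    """Nudge the weight up or down, clamped to the range [0.0, 2.0]."""
    delta = step if upvote else -step
    mem.weight = min(2.0, max(0.0, mem.weight + delta))


def apply_correction(mem: MemoryWeight, penalty: float = 0.5) -> None:
    """Downweight a memory that a later note has corrected."""
    mem.weight *= penalty


def flag_for_review(memories: Iterable[MemoryWeight], threshold: float = 0.3) -> List[str]:
    """Return ids of memories whose weight fell below the threshold."""
    return [m.note_id for m in memories if m.weight < threshold]


# Example: one memory is upvoted, another is corrected twice.
useful = MemoryWeight("note-a")
outdated = MemoryWeight("note-b")
apply_vote(useful, upvote=True)
apply_correction(outdated)
apply_correction(outdated)
to_review = flag_for_review([useful, outdated])
```

Flagging returns ids rather than deleting anything, which keeps archival a reviewable step in line with the "flagged for review/archival" wording above.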

Why This Matters

BM's differentiator is human-readable plain text. But search ranking doesn't have to be naive just because storage is simple. We can have sophisticated retrieval on top of transparent storage.

The memelord approach validates this — they use SQLite + vector search + weighted scoring. We already have the first two. Adding weighted scoring closes the gap while keeping our plain-text advantage.

Labels: enhancement