feat: add Tentative Knowledge Layer for emerging patterns#23

Open
peiyuan-ran-huang wants to merge 4 commits into blader:main from peiyuan-ran-huang:feat/tentative-knowledge-layer

Conversation

@peiyuan-ran-huang

Summary

Introduces a Tentative Knowledge Layer — a lightweight mechanism to capture emerging patterns that don't yet meet all 4 Quality Criteria (Reusable, Non-trivial, Specific, Verified) for full skill extraction. Instead of discarding potentially valuable knowledge or creating premature skills, tentative notes accumulate evidence through repeated observations until they are ready for promotion to full skills.

Problem

Currently, Claudeception has a binary decision: either a discovery meets all quality criteria and becomes a full skill, or it gets discarded. This creates a gap — patterns observed once but not yet verified across contexts are lost, even if they could become valuable skills with more evidence.

Solution

  • Step 2.5 Triage: After identifying knowledge (Step 2), a new decision point routes it to full skill extraction, tentative note creation, or discard based on how many quality criteria are met
  • Confidence scoring: Notes start at 0.4 and accumulate confidence through repeated observations (+0.15 same context, +0.20 different context, +0.30 user confirmation, −0.20 counter-examples), clamped to [0.1, 0.95]
  • Automatic promotion: When confidence reaches 0.7 with 2+ observations from distinct sessions, the note is suggested for promotion to a full skill during Retrospective Mode
  • Expiry: Stale notes are auto-cleaned (90-day warning, 180-day delete) to prevent clutter
  • Anti-gaming: User confirmations adjust confidence but do NOT count as observations, preventing single-observation notes from being fast-tracked to promotion
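
The confidence arithmetic in the bullets above can be sketched as follows. This is a minimal illustration, not the implementation in this PR: the `TentativeNote` class, the event names, and the rounding step are hypothetical. It also assumes that the observation that creates a note counts as the first observation, and that counter-examples do not count as observations.

```python
from dataclasses import dataclass

# Deltas, start value, and bounds come from the PR description.
# The event names themselves are hypothetical labels for this sketch.
DELTAS = {
    "same_context": 0.15,
    "different_context": 0.20,
    "user_confirmation": 0.30,
    "counter_example": -0.20,
}
MIN_CONFIDENCE, MAX_CONFIDENCE = 0.1, 0.95

# Anti-gaming rule: only repeated observations increase the observation
# count; user confirmations adjust confidence only.
OBSERVATION_EVENTS = {"same_context", "different_context"}


@dataclass
class TentativeNote:
    confidence: float = 0.4  # starting confidence per the PR description
    observations: int = 1    # assumption: the creating observation counts

    def apply(self, event: str) -> None:
        """Apply one event, clamping confidence to [0.1, 0.95]."""
        updated = self.confidence + DELTAS[event]
        clamped = min(MAX_CONFIDENCE, max(MIN_CONFIDENCE, updated))
        # Rounding is an assumption of this sketch; it keeps float drift
        # from affecting comparisons against the 0.7 threshold.
        self.confidence = round(clamped, 2)
        if event in OBSERVATION_EVENTS:
            self.observations += 1
```

Under these assumptions, a new note that receives a single user confirmation reaches the 0.7 threshold but still has only one observation, so it would not yet qualify for promotion.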

Files changed

| File | Change |
| --- | --- |
| SKILL.md | Added Step 2.5, Tentative Knowledge Management section, updated Retrospective Mode and Skill Lifecycle |
| resources/instinct-template.yaml | New — YAML template for tentative notes |
| resources/tentative-knowledge.md | New — Detailed rules for confidence arithmetic, promotion, expiry, deduplication |
| README.md | Added Tentative Knowledge section |
| WARP.md | Updated project overview and key files list |

Research basis

The tentative knowledge concept draws from experience pool patterns in EvoFSM (Zhang et al., 2026) and the skill library research already referenced by Claudeception (Voyager, CASCADE, Reflexion). The key insight is that knowledge acquisition is a gradual process — requiring multiple observations before codification reduces noise while capturing patterns that would otherwise be lost.

Test plan

  • Verify /claudeception retrospective mode correctly scans memory/tentative/ for eligible notes
  • Confirm Step 2.5 triage correctly routes knowledge based on quality criteria
  • Test confidence arithmetic with edge cases (simultaneous events, counter-examples)
  • Verify promotion protocol requires genuinely distinct sessions, not just different dates
  • Confirm expiry rules correctly identify stale notes
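
As a starting point for the expiry test, the 90-day and 180-day rules can be sketched as below. This is an illustration under stated assumptions, not the PR's logic: the function name `expiry_status` and its return labels are hypothetical, age is assumed to be measured from the last observation, and both boundaries are assumed to be inclusive. The early-delete rule for low-confidence notes is left out because its threshold is not stated here.

```python
from datetime import date

# Thresholds from the PR description (90-day warning, 180-day delete).
STALE_WARNING_DAYS = 90
AUTO_DELETE_DAYS = 180


def expiry_status(last_observed: date, today: date) -> str:
    """Classify a tentative note by the age of its last observation.

    Assumptions of this sketch: age is counted from the most recent
    observation, and a note exactly 90 or 180 days old already crosses
    the corresponding threshold.
    """
    age_days = (today - last_observed).days
    if age_days >= AUTO_DELETE_DAYS:
        return "delete"
    if age_days >= STALE_WARNING_DAYS:
        return "warn"
    return "active"
```

A test for the expiry rules would likely want to cover the days immediately before and after each boundary, since that is where an off-by-one error would show up.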

🤖 Generated with Claude Code

Introduce a lightweight mechanism to capture knowledge that doesn't yet meet
all 4 Quality Criteria for full skill extraction. Tentative notes live as YAML
files in memory/tentative/ with confidence scoring, automatic promotion to full
skills after repeated cross-context observations, and expiry for stale patterns.

Key additions:
- Step 2.5 triage: route knowledge to full skill vs tentative note vs discard
- Confidence arithmetic: +0.15/+0.20/+0.30/-0.20 with [0.1, 0.95] clamping
- Promotion protocol: confidence >= 0.7 + 2 observations from distinct sessions
- Expiry rules: 90-day stale warning, 180-day auto-delete, early delete for low confidence
- YAML template (resources/instinct-template.yaml) and detailed rules document
- Updated Retrospective Mode to scan tentative notes for promotion candidates
- Updated Skill Lifecycle to include tentative stage (step 0)

Inspired by experience pool patterns in EvoFSM (Zhang et al., 2026) and
skill library research from Voyager, CASCADE, and Reflexion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 5, 2026 19:30

Copilot AI left a comment


Pull request overview

Adds a “Tentative Knowledge Layer” to Claudeception’s skill extraction workflow so emerging patterns can be captured as lightweight YAML notes with confidence scoring and later promoted (or expired), instead of being discarded when they don’t yet satisfy all Quality Criteria.

Changes:

  • Introduces Step 2.5 triage to route discoveries to full skills, tentative notes, or discard.
  • Documents tentative-note confidence scoring, promotion rules, and expiry/cleanup behavior.
  • Adds new resources: a tentative note YAML template and a detailed rules reference doc.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| SKILL.md | Adds Step 2.5 triage + tentative knowledge management rules; updates retrospective flow and lifecycle. |
| resources/instinct-template.yaml | Introduces YAML template for tentative notes stored under memory/tentative/. |
| resources/tentative-knowledge.md | Provides detailed confidence arithmetic, promotion protocol, expiry rules, and dedup strategy. |
| README.md | Summarizes tentative knowledge concept for users and links to the new resources. |
| WARP.md | Updates project overview and key-file index to include tentative knowledge resources. |


Comment thread SKILL.md
Comment on lines +122 to +128
| Specific is met, plus at least 1 other criterion has partial evidence | **Tentative note** → see below |
| Cannot describe a clear trigger + action (Specific not met) | **Discard** — not worth capturing |

"Partial evidence" means at least Non-trivial or Reusable shows initial signs (e.g., "this pattern
likely applies elsewhere but hasn't been verified across contexts"). A 3-of-4 case (e.g., Reusable +
Non-trivial + Specific but not Verified) takes the tentative path — missing any criterion disqualifies
from full skill extraction.

Copilot AI Apr 5, 2026


In Step 2.5, the “Tentative note” row says “Specific is met, plus at least 1 other criterion has partial evidence”, but the paragraph below defines “partial evidence” specifically as early signs of Non-trivial or Reusable. To avoid mis-triage (e.g., treating “Specific + Verified” as tentative even if it’s trivial/non-reusable), consider tightening the table row wording to explicitly reference partial evidence for Non-trivial/Reusable (or clarify whether Verified can be the partial criterion).

Suggested change

Original:

| Specific is met, plus at least 1 other criterion has partial evidence | **Tentative note** → see below |
| Cannot describe a clear trigger + action (Specific not met) | **Discard** — not worth capturing |

"Partial evidence" means at least Non-trivial or Reusable shows initial signs (e.g., "this pattern
likely applies elsewhere but hasn't been verified across contexts"). A 3-of-4 case (e.g., Reusable +
Non-trivial + Specific but not Verified) takes the tentative path — missing any criterion disqualifies
from full skill extraction.

Suggested:

| Specific is met, plus Non-trivial or Reusable has partial evidence | **Tentative note** → see below |
| Cannot describe a clear trigger + action (Specific not met) | **Discard** — not worth capturing |

"Partial evidence" means early signs of Non-trivial or Reusable (e.g., "this pattern
likely applies elsewhere but hasn't been verified across contexts"). Verified alone does not qualify
a note for the tentative path. A 3-of-4 case (e.g., Reusable + Non-trivial + Specific but not Verified)
takes the tentative path — missing any criterion disqualifies from full skill extraction.
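
One way to read the suggested wording as a decision procedure is sketched below. It is illustrative only: the `Evidence` enum, the `triage` function, and the returned labels are hypothetical names, and the final branch (Specific met but no partial evidence for Non-trivial or Reusable leads to discard) is an assumption that follows from "Verified alone does not qualify".

```python
from enum import Enum


class Evidence(Enum):
    """Hypothetical three-level rating for a single quality criterion."""
    NONE = 0
    PARTIAL = 1
    MET = 2


def triage(specific: Evidence, non_trivial: Evidence,
           reusable: Evidence, verified: Evidence) -> str:
    """Route a discovery to 'full_skill', 'tentative', or 'discard'."""
    criteria = (specific, non_trivial, reusable, verified)
    # All four criteria fully met: extract a full skill.
    if all(c is Evidence.MET for c in criteria):
        return "full_skill"
    # No clear trigger + action: not worth capturing.
    if specific is not Evidence.MET:
        return "discard"
    # Specific is met; at least partial evidence of Non-trivial or
    # Reusable is needed for the tentative path.
    if non_trivial is not Evidence.NONE or reusable is not Evidence.NONE:
        return "tentative"
    # Assumption: Specific (with or without Verified) alone is discarded.
    return "discard"
```

With this reading, the 3-of-4 case (everything except Verified) returns `"tentative"`, while Specific + Verified with no sign of Non-trivial or Reusable returns `"discard"`.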

Copilot uses AI. Check for mistakes.
Comment thread SKILL.md
Comment on lines +342 to +345
A tentative note is eligible for promotion when **both** conditions are met:
- `confidence >= 0.7`
- `observations >= 2` from **>= 2 distinct sessions or dates**


Copilot AI Apr 5, 2026


Promotion eligibility currently allows observations from “>= 2 distinct sessions or dates”. The PR description/test plan emphasizes distinct sessions (not just different dates), and using dates can be gamed (or accidentally satisfied by crossing midnight) without truly distinct sessions. Consider requiring distinct session identifiers/contexts, with “date” only as a fallback when session data is unavailable (and document that fallback explicitly).
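
A sketch of what such a check could look like, assuming each observation carries an optional session identifier. The `Observation` class and its `session_id` field are hypothetical and not part of the template in this PR. The date fallback applies only when at least one observation lacks a session identifier.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

PROMOTION_THRESHOLD = 0.7   # from the promotion rule under review
MIN_DISTINCT_SESSIONS = 2   # from the promotion rule under review


@dataclass(frozen=True)
class Observation:
    date: str                          # ISO date, e.g. "2026-04-05"
    session_id: Optional[str] = None   # hypothetical field


def distinct_session_count(observations: Sequence[Observation]) -> int:
    """Count distinct sessions, falling back to dates if ids are missing."""
    if all(o.session_id for o in observations):
        return len({o.session_id for o in observations})
    # Fallback (weaker proxy): distinct calendar dates.
    return len({o.date for o in observations})


def eligible_for_promotion(confidence: float,
                           observations: Sequence[Observation]) -> bool:
    """Both conditions must hold: confidence and distinct-session count."""
    return (
        confidence >= PROMOTION_THRESHOLD
        and len(observations) >= 2
        and distinct_session_count(observations) >= MIN_DISTINCT_SESSIONS
    )
```

With session identifiers present, a single session that crosses midnight yields two dates but only one session, so it does not satisfy the rule.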

Comment thread README.md
Not all patterns are ready for full skill extraction. When a discovery meets some but not all quality criteria (e.g., observed once but not yet verified across contexts), Claudeception saves it as a **tentative note** — a lightweight YAML file in `memory/tentative/` with a confidence score.

- **Confidence scoring**: starts at 0.4, increases with repeated observations (+0.15 same context, +0.20 different context) and user confirmation (+0.30), decreases with counter-examples (−0.20)
- **Automatic promotion**: when confidence reaches 0.7 with 2+ observations from distinct sessions, the note is suggested for promotion to a full skill

Copilot AI Apr 5, 2026


This section summarizes confidence/promotion, but it doesn’t mention the anti-gaming rule that user confirmations adjust confidence but do not count as observations. Since confirmation affects promotion eligibility in the rest of the docs, it’d help to add a short note here so readers don’t assume “1 observation + confirmation” satisfies the “2+ observations” requirement.

Suggested change

Original:

- **Automatic promotion**: when confidence reaches 0.7 with 2+ observations from distinct sessions, the note is suggested for promotion to a full skill

Suggested:

- **Automatic promotion**: when confidence reaches 0.7 with 2+ observations from distinct sessions, the note is suggested for promotion to a full skill; user confirmations can raise confidence but do **not** count as observations toward this requirement

Comment on lines +44 to +47
Both conditions must be met:
1. `confidence >= 0.7`
2. `observations.length >= 2`, from **>= 2 distinct sessions or dates**


Copilot AI Apr 5, 2026


Eligibility says observations must come from “>= 2 distinct sessions or dates”. This conflicts with the PR’s stated intent to require genuinely distinct sessions (not merely different dates), and “date” is a weak proxy (e.g., a single long session spanning midnight). Consider making “distinct sessions” the requirement and specifying how to identify sessions (e.g., session id/context field), with an explicit fallback if session metadata is missing.

# TEMPLATE — do not place in memory/tentative/
# Tentative Knowledge Note Template for Claudeception
# Storage: ~/.claude/projects/<project>/memory/tentative/<name>.yaml
# Created by claudeception when knowledge doesn't yet meet all 4 Quality Criteria

Copilot AI Apr 5, 2026


The header comment uses lowercase “claudeception” (“Created by claudeception…”), while the project/skill name is consistently “Claudeception” elsewhere in the docs. Consider capitalizing it here for consistency and to avoid confusion when users grep for the name.

Suggested change

Original:

# Created by claudeception when knowledge doesn't yet meet all 4 Quality Criteria

Suggested:

# Created by Claudeception when knowledge doesn't yet meet all 4 Quality Criteria

peiyuan-ran-huang and others added 3 commits April 5, 2026 21:39
Complete the EvoFSM entry with actual authors (Zhang et al.), publication
date (January 2026), and arXiv URL. This entry was previously a placeholder
([Research Team], 2024) and is now referenced by the Tentative Knowledge
Layer feature which draws on EvoFSM's experience pool concepts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Replace hardcoded 2026-04-05 dates in instinct-template.yaml with
  YYYY-MM-DD placeholders matching skill-template.md convention
- Standardize on "tentative notes" terminology in SKILL.md description
  (drop "instinct" to match all body text consistently)
- Complete CASCADE citation with actual authors (Huang et al.) and
  correct publication date (December 2025, not 2024)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Fix CASCADE publication year in README.md: 2024 → 2025 (matches
  research-references.md and arXiv ID 2512.23880)
- Add missing triage row for edge case: Specific met but no other
  criterion shows even partial evidence → Discard
- Align SKILL.md Retrospective Mode expiry wording with
  tentative-knowledge.md: "flag for cleanup or auto-delete per expiry
  rules" (was: "flag for cleanup")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>