Skip to content

Render ScheduleWakeup and Cron* tools (#148)#152

Merged
cboos merged 8 commits into
mainfrom
dev/cron-tool-rendering
May 10, 2026
Merged

Render ScheduleWakeup and Cron* tools (#148)#152
cboos merged 8 commits into
mainfrom
dev/cron-tool-rendering

Conversation

@cboos
Copy link
Copy Markdown
Collaborator

@cboos cboos commented May 10, 2026

Summary

Closes #148. Adds specialised rendering for the built-in scheduling tool family — ScheduleWakeup, CronCreate, CronList, CronDelete — which previously fell through to the generic params-table render. Same per-tool-family pattern as the Monitor tool work in #147.

Two commits, kept separate for the post-mortem story:

  • <sha1> — core implementation (4-tool family rendering + 24 regression tests). Models, factory parsers (with graceful-degradation fallbacks), HTML/Markdown formatters, title methods, and a JSONL fixture covering one call per tool. Shared _format_collapsible_prompt helper between ScheduleWakeup and CronCreate (both carry a long prompt field that benefits from the line-count-badge collapsibility used by Monitor's command).
  • <sha2> — fix-up (CronList HTML title duplication). title_CronListInput originally passed "CronList" as the summary to _tool_title, which already emits the tool name from content.tool_name — duplicating it in the HTML output. Fix is one line; tests tightened with a cardinality check (html.count("CronList") == 1) and a shape-specific check ("<span class='tool-summary'>CronList" not in html) to catch the same regression class via different mechanisms.

Defensive scaffolding heads-up

ScheduleWakeup is the only tool with real fixtures derived from actual transcripts. CronCreate / CronList / CronDelete fixtures are constructed against the documented schemas — no real invocations exist in the project's transcripts yet (the tools are listed as deferred but haven't been used in the wild). The output parsers degrade gracefully: every output dataclass carries a text field even when structured parsing succeeds, and renderers fall back to raw text whenever parsing yields no structured fields. So when real invocations show up and the format turns out to differ from the fixture, the rendering will visibly degrade to text rather than crash. Worth a follow-up to tighten the parsers when real Cron* output lands.

Test plan

  • just test: 1661 pass, 7 skipped (was 1635 on main; +26 new)
  • just test-tui: 67 pass
  • just test-browser: 39 pass, 1 skipped
  • just ci: ruff + pyright + ty all green
  • New regression suite test/test_cron_rendering.py (24 cases):
    • 4 input model tests
    • 8 output-parser tests (happy-path + graceful-degradation for each tool)
    • 6 HTML formatter unit tests (grids, collapsibility, optional-flag omission, structured-vs-fallback CronList)
    • 6 fixture-driven end-to-end tests (real renderers against test/test_data/cron_tools.jsonl, asserting titles + grids + structured cron list + Markdown inline-code escaping)
  • No snapshot drift — no existing fixtures touch these tools

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added ScheduleWakeup and typed cron tools (create, list, delete) with structured, user-friendly status outputs.
  • UI / Rendering

    • Improved HTML/Markdown presentation: icon titles, collapsible prompt bodies, compact job tables, and backlinking from job IDs to their creation cards.
  • Tests

    • Added unit and end-to-end tests with fixture coverage for scheduling and cron flows.
  • Documentation

    • Clarified transcript icon/emoji handling and widened emoji recognition ranges.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds ScheduleWakeup, CronCreate, CronList, and CronDelete tool support across models, regex-based output parsers and registrations, HTML/Markdown formatters and renderer dispatch, emoji-range adjustment, CSS/snapshots, renderer cross-linking for cron backlinks, and tests with a JSONL fixture.

Changes

Scheduling & Cron Tool Support Pipeline

Layer / File(s) Summary
Tool Data Models
claude_code_log/models.py
Adds Pydantic input models (ScheduleWakeupInput, CronCreateInput, CronListInput, CronDeleteInput) and dataclass output models (ScheduleWakeupOutput, CronCreateOutput, CronListOutput, CronDeleteOutput, CronListItem). Updates ToolInput and ToolOutput unions.
Output Parsing & Registration
claude_code_log/factories/tool_factory.py
Registers new tools in TOOL_INPUT_MODELS and TOOL_OUTPUT_PARSERS. Implements parse_schedulewakeup_output (extracts next_at/in_seconds), parse_croncreate_output (optional job_id), parse_cronlist_output (parses job lines into CronListItem entries while preserving raw text or falling back to jobs=[]), and parse_crondelete_output (parses job_id and preserves stripped text, returns None for empty).
HTML Formatters
claude_code_log/html/tool_formatters.py
Adds formatter functions for schedule/cron tools, including collapsible prompt rendering and status output helpers; CronListOutput renders as a compact table or <pre> fallback; updates __all__.
HTML Renderer Dispatch
claude_code_log/html/renderer.py
Extends HtmlRenderer with format_* methods delegating to HTML formatter helpers and title_* methods producing icon-prefixed titles with tool-specific summaries (delay/reason, cron expression, id).
Markdown Renderer Dispatch
claude_code_log/markdown/renderer.py
Extends MarkdownRenderer with format_* methods rendering raw input.prompt bodies and verbatim output.text, CronListOutput as bullet summaries or fenced fallback, and title_* methods wrapping parameters in inline code.
Emoji Utils & Skill Doc
claude_code_log/html/utils.py, .claude/skills/tool-renderer/SKILL.md
starts_with_emoji now recognizes U+2300–U+23FF; SKILL.md documents emoji-range gating for the template wrench prefix.
CSS & Snapshots
claude_code_log/html/templates/components/message_styles.css, test/__snapshots__/test_snapshot_html.ambr
Adds .cronlist-output-table and .cron-id-backlink styles and updates snapshots to include the new CSS rules.
Renderer Cross-linking
claude_code_log/renderer.py
Adds _link_cron_jobs_by_id and invokes it in generate_template_messages() to annotate CronList job entries and CronDelete outputs with creating_call_message_index when a corresponding CronCreate job_id is found.
Tests & Fixture
test/test_cron_rendering.py, test/test_data/cron_tools.jsonl
Adds unit tests for input validation, parser extraction and fallback, HTML/Markdown formatter behavior (collapsible prompts, table vs <pre>), end-to-end rendering tests using the JSONL transcript, and a fixture that skips end-to-end tests when the JSONL file is missing.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • daaain/claude-code-log#37: Related changes to renderer pairing/tool-use metadata; this PR adds _link_cron_jobs_by_id and calls it from generate_template_messages.
  • daaain/claude-code-log#87: Overlapping work adding tool models, parsers, and formatters across the same files.
  • daaain/claude-code-log#71: Prior renderer/formatter changes affecting HtmlRenderer and formatter structure.

Suggested reviewers

  • daaain

Poem

🐰 I hopped through code to add four tools,
Schedules set and crons in queues,
Parsers, HTML, Markdown too,
Links and tests to see them through —
Hooray, the job list now has views!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 69.88% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title 'Render ScheduleWakeup and Cron* tools (#148)' clearly summarizes the main change: adding rendering support for four scheduling/cron tools across the codebase.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev/cron-tool-rendering

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@claude_code_log/factories/tool_factory.py`:
- Around line 829-838: The code treats any trailing bracketed text captured in
m.group("flags") as flags and strips it from the prompt, which drops legitimate
prompt suffixes; to fix, validate and parse m.group("flags") before removing it
from the prompt: inspect the tokens derived in flag_tokens and only
consume/strip bracketed text when at least one token matches known flags (e.g.,
"recurring", "durable"); otherwise leave the original prompt intact when
constructing the CronListItem (update the prompt assignment and the
recurring/durable logic that reads flag_tokens so prompt suffixes that are not
real flags are preserved).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 215a9a84-6704-4a67-88c2-58f66308ae57

📥 Commits

Reviewing files that changed from the base of the PR and between eac597b and 20451bc.

📒 Files selected for processing (7)
  • claude_code_log/factories/tool_factory.py
  • claude_code_log/html/renderer.py
  • claude_code_log/html/tool_formatters.py
  • claude_code_log/markdown/renderer.py
  • claude_code_log/models.py
  • test/test_cron_rendering.py
  • test/test_data/cron_tools.jsonl

Comment thread claude_code_log/factories/tool_factory.py Outdated
@cboos
Copy link
Copy Markdown
Collaborator Author

cboos commented May 10, 2026

(Claude) Status update on the latest CodeRabbit review:

One actionable finding appliedc4166ff.

Real bug in parse_cronlist_output: the optional [flags] group at the end of _CRON_LIST_LINE_RE greedily captured any trailing bracketed text, including legitimate prompt suffixes. A row like - cj_xyz: 0 9 * * * => /loop tick [important context] parsed with prompt='/loop tick' and silently dropped the [important context] suffix.

Fix follows CR's suggestion: validate captured flags against a known-flags whitelist (_KNOWN_CRON_FLAGS = {'recurring', 'durable'}). When none of the captured tokens match a known flag, the bracketed content is restored to the prompt and the row renders as flag-less.

Two new regression tests in TestSchedulingOutputParsers:

  • test_cronlist_preserves_bracketed_prompt_suffix — pins the exact CR scenario.
  • test_cronlist_keeps_real_flags_alongside_known_tokens — confirms recognised flags still parse normally.

While I was in the parser, also folded in a stylistic micro-nit from the earlier monk review on this branch (msg #2779): True or NoneTrue if X else None for self-documentation. Same behaviour, clearer intent.

CI on alice (pre-push): test_cron_rendering.py 26/26 pass; ruff + pyright clean on touched files.

Thanks — the bracketed-suffix catch is a nice one. The original parser shape was overly trusting of the [...] syntax matching only flag-like content.

cboos added a commit that referenced this pull request May 10, 2026
CR caught a real bug in ``parse_cronlist_output``: the optional
``[flags]`` group at the end of ``_CRON_LIST_LINE_RE`` greedily
captured *any* trailing bracketed text, including legitimate prompt
suffixes. A row like::

    - cj_xyz: 0 9 * * * => /loop tick [important context]

parsed with ``prompt='/loop tick'`` and ``flags='important context'``
— silently dropping the suffix and producing nonsense recurring /
durable values.

## Fix

Validate captured flags against a known-flags whitelist
(``_KNOWN_CRON_FLAGS = {'recurring', 'durable'}``). When none of the
captured tokens match a known flag, restore the bracketed content to
the prompt and treat the row as flag-less.

When at least one token matches a known flag, accept the bracket as
flags (CR explicitly endorsed this trade-off — unknown stragglers
inside an otherwise-known flag list are dropped). The whitelist
keeps the contract honest: only documented flags are real flags.

## Folded in: monk's #148 micro-nit

While in the parser, replaced ``True or None`` / ``False or None``
with the more self-documenting ``True if X else None`` per monk's
review nit on #148 (msg #2779). Same behaviour, clearer intent.

## Tests

Two new cases in ``TestSchedulingOutputParsers``:

- ``test_cronlist_preserves_bracketed_prompt_suffix`` — pins the
  exact CR scenario; pre-fix would have failed by stripping
  ``[important context]`` from the prompt.
- ``test_cronlist_keeps_real_flags_alongside_known_tokens`` —
  confirms recognised flags still parse correctly when the prompt
  body itself contains other punctuation.

CI: 26/26 ``test_cron_rendering.py`` pass; ruff + pyright clean.
cboos added a commit that referenced this pull request May 10, 2026
Three concrete improvements after rendering against a real-data
sample (live CronCreate / CronList / CronDelete experiment in a
session, captured before this commit):

## 1. Fix the wrench prefix on ⏰-titled tools (``starts_with_emoji``)

The transcript template prepends ``🛠️`` to a tool-use title unless
the title already starts with an emoji recognised by
``html/utils.py::starts_with_emoji``. That function whitelists
specific Unicode ranges, but the ranges miss the **Misc Technical**
block (U+2300–U+23FF) where ``⏰`` (U+23F0) lives. So titles like
``⏰ ScheduleWakeup …`` rendered as ``🛠️ ⏰ ScheduleWakeup …``.

Added the Misc Technical range to the whitelist (along with a
docstring callout listing all glyphs that benefit:
``⏰ ⏳ ⏱️ ⏲️ ⏸ ⏹ ⏺ ⏏``). Other tools using emoji from this block
(currently none in the codebase, but several candidates in the
range) get the suppression for free.

## 2. ScheduleWakeup / CronCreate body = prompt only

Both tools' titles already carry the most informative scalars
(``+<delay>s — <reason>`` for ScheduleWakeup, ``<cron>`` for
CronCreate). The body's previous params-table repeated the same
values plus a ``recurring`` / ``durable`` row that the harness's
own confirmation echoes back in human-readable form
(``Scheduled recurring job <id> ... Session-only ...``). Drop the
table entirely; render just the prompt as collapsible code via the
existing ``_format_collapsible_prompt`` helper.

The ``_row_html`` helper that backed the table is now dead — removed.

## 3. Parser regexes match the harness's actual output format

A live experiment (CronCreate every 2 minutes, CronList, CronDelete
within minutes) showed my synthetic fixtures and parser regexes
were misaligned with what the harness really emits:

- **CronCreate**: ``Scheduled cron job <id>`` → real:
  ``Scheduled recurring job <id> (Every 2 minutes). Session-only…``
  (or ``one-shot`` for non-recurring). Updated regex to accept
  both ``recurring`` and ``one-shot`` kinds.
- **CronList row**: ``- <id>: <cron> => <prompt> [flags]`` → real:
  ``<id> — <description> (recurring) [session-only]: <prompt>…``.
  No leading hyphen, em-dash separator, parens for the kind,
  brackets for the scope, no cron expression (only a human-readable
  ``description``), prompt often truncated by the harness with
  ``…``. Regex completely rewritten; ``CronListItem.cron`` field
  renamed to ``description`` to reflect the actual content.
- **CronDelete**: ``Deleted cron job <id>`` → real:
  ``Cancelled job <id>.``. Parser unchanged (still text-verbatim);
  fixture and tests updated.

The previous fix for the bracketed-prompt-suffix bug (CR finding on
PR #152) and the ``_KNOWN_CRON_FLAGS`` whitelist are dropped because
the new format separates flags (parens) from prompts (after a
colon) — there's no greedy-bracket vulnerability anymore.

Fixture ``test/test_data/cron_tools.jsonl`` updated with real
harness text. Tests rewritten against the actual format with two
extra edge cases:

- ``test_croncreate_extracts_job_id_oneshot`` — confirms the
  ``one-shot`` kind variant.
- ``test_cronlist_durable_scope_sets_durable_flag`` — confirms the
  durable scope (separate from the recurring/one-shot kind).

## 4. tool-renderer skill updated

Added a "Watch out: the template's wrench-suppression is
emoji-range-gated" section under "Add Title Method" with the full
list of recognised Unicode ranges and the recipe for verifying
your icon (render fixture + grep for ``🛠️`` co-occurring) /
extending the range when your icon falls outside.

## CI

- ``just test``: 1664 pass, 7 skipped (was 1659 with the prior
  bracketed-suffix tests; the net +5 reflects the test rewrite —
  removed 2 obsolete bracket-handling tests, added 5 real-format
  tests including the wrench-suppression regression guard).
- ruff + pyright clean.
- Fixture rendered locally to verify: ``🛠️ ⏰`` co-occurrence is
  zero (the suppression works on the real fixture).
@cboos
Copy link
Copy Markdown
Collaborator Author

cboos commented May 10, 2026

(Claude) Polish round on user feedback — 6f34c54.

Three concrete changes after rendering against real-data fixtures (live CronCreate / CronList / CronDelete experiment in a session, captured before this commit):

1. Wrench prefix gone from ⏰-titled tools. The transcript template prepends 🛠️ unless the title already starts with an emoji recognised by html/utils.py::starts_with_emoji. That function whitelisted several Unicode ranges but missed the Misc Technical block (U+2300–U+23FF) where (U+23F0) lives — so all four scheduling tools rendered as 🛠️ ⏰ <name>. Added the missing range to the whitelist.

2. ScheduleWakeup / CronCreate body = prompt only. Both tools' titles already carry the most informative scalars (+<delay>s — <reason> for ScheduleWakeup, <cron> for CronCreate). The previous params-table body repeated them and added a recurring/durable row that the harness's own confirmation echoes back in human form anyway. Body is now just the collapsible prompt via the existing _format_collapsible_prompt helper. The unused _row_html helper that backed the table is removed.

3. Parser regexes updated to match real harness output. The live experiment showed my synthetic fixtures and parser regexes were misaligned with what the harness actually emits:

  • CronCreate: Scheduled cron job <id> → real: Scheduled recurring job <id> (Every 2 minutes). Session-only… (or one-shot for non-recurring).
  • CronList row: - <id>: <cron> => <prompt> [flags] → real: <id> — <description> (recurring) [session-only]: <prompt>…. No leading hyphen, em-dash separator, parens for kind, brackets for scope, no cron expression (only a human-readable description), prompt often truncated by the harness with . Regex rewritten; CronListItem.cron field renamed to description to reflect actual content.
  • CronDelete: Deleted cron job <id> → real: Cancelled job <id>.. Parser unchanged (still text-verbatim); fixture and tests updated.

The previous fix for the bracketed-prompt-suffix bug (CR finding earlier on this PR) and the _KNOWN_CRON_FLAGS whitelist are dropped — the new format separates flags (parens) from prompts (after a colon), so there's no greedy-bracket vulnerability anymore.

Bonus: tool-renderer skill updated. Added a "Watch out: the template's wrench-suppression is emoji-range-gated" section listing all recognised Unicode ranges and the verification recipe. Future tools using emoji from outside the whitelist will hit the same trap I did, so the skill now warns about it explicitly.

CI: 1664 unit pass, ruff + pyright clean. Test count net change reflects the rewrite (removed 2 obsolete bracket-handling tests, added 5 real-format tests including a wrench-suppression regression guard).

Fixture rendered locally to verify: 🛠️ ⏰ co-occurrence is zero across the rendered HTML.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/test_cron_rendering.py (1)

143-180: ⚡ Quick win

Add a mixed-format CronList regression case.

These tests cover all-valid and all-invalid CronList text, but not a mixed payload (one valid row + one drifted row). That case is important to prevent partial-table rendering from hiding unparsed content.

Suggested test addition
 class TestSchedulingOutputParsers:
@@
     def test_cronlist_falls_back_on_unrecognised_format(self) -> None:
         out = parse_cronlist_output(
             _result("Just a free-form summary, no rows here."), None
         )
         assert out is not None
         assert out.jobs == []
         assert "free-form summary" in out.text
+
+    def test_cronlist_mixed_lines_falls_back_to_raw_text(self) -> None:
+        text = "\n".join(
+            [
+                "abc — Daily at 9am (recurring) [durable]: /morning-checkin",
+                "unexpected trailer line that does not match cron row format",
+            ]
+        )
+        out = parse_cronlist_output(_result(text), None)
+        assert out is not None
+        assert out.jobs == []
+        assert "unexpected trailer line" in out.text
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/test_cron_rendering.py` around lines 143 - 180, Add a regression test to
test_cron_rendering.py that feeds parse_cronlist_output a mixed payload
containing one valid cron row and one unrecognised/free-form line to ensure
partial parsing doesn't drop unparsed content; use the existing helper
_result(...) to wrap the string, call parse_cronlist_output, assert the returned
value is a CronListOutput with one parsed job (check job.id/job.description or
other fields) and that out.text still contains the unparsed free-form line, and
mirror the style of existing tests like test_cronlist_parses_real_format and
test_cronlist_falls_back_on_unrecognised_format so it fits with
parse_cronlist_output and CronListOutput expectations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@claude_code_log/factories/tool_factory.py`:
- Around line 838-853: The parser currently skips non-empty lines that don't
match _CRON_LIST_LINE_RE, which causes partial parses to hide rows; change the
loop that builds jobs so unmatched non-empty lines are preserved instead of
ignored—when m is None and line.strip() is non-empty, append a CronListItem (or
similar sentinel) that preserves the original line (e.g., id=None,
description=line.strip(), and a parsing_error or raw=True flag) so the UI can
show those rows or fall back to text mode; ensure CronListOutput(text=text,
jobs=jobs) still returns all lines and that any new fields on CronListItem are
optional to avoid breaking downstream consumers.

---

Nitpick comments:
In `@test/test_cron_rendering.py`:
- Around line 143-180: Add a regression test to test_cron_rendering.py that
feeds parse_cronlist_output a mixed payload containing one valid cron row and
one unrecognised/free-form line to ensure partial parsing doesn't drop unparsed
content; use the existing helper _result(...) to wrap the string, call
parse_cronlist_output, assert the returned value is a CronListOutput with one
parsed job (check job.id/job.description or other fields) and that out.text
still contains the unparsed free-form line, and mirror the style of existing
tests like test_cronlist_parses_real_format and
test_cronlist_falls_back_on_unrecognised_format so it fits with
parse_cronlist_output and CronListOutput expectations.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6023af08-af5b-4436-9937-1b5771245fc6

📥 Commits

Reviewing files that changed from the base of the PR and between c4166ff and 6f34c54.

📒 Files selected for processing (8)
  • .claude/skills/tool-renderer/SKILL.md
  • claude_code_log/factories/tool_factory.py
  • claude_code_log/html/tool_formatters.py
  • claude_code_log/html/utils.py
  • claude_code_log/markdown/renderer.py
  • claude_code_log/models.py
  • test/test_cron_rendering.py
  • test/test_data/cron_tools.jsonl
✅ Files skipped from review due to trivial changes (2)
  • .claude/skills/tool-renderer/SKILL.md
  • test/test_data/cron_tools.jsonl
🚧 Files skipped from review as they are similar to previous changes (2)
  • claude_code_log/models.py
  • claude_code_log/markdown/renderer.py

Comment thread claude_code_log/factories/tool_factory.py
@cboos
Copy link
Copy Markdown
Collaborator Author

cboos commented May 10, 2026

(Claude) Round-4 — 678b9ea. CR's nitpick + actionable both addressed.

Applied the bail-out approach: parse_cronlist_output now returns jobs=[] if any non-empty line fails the row regex, triggering the renderer's existing "no jobs → raw text" fallback for the whole list. This avoids the partial-table failure mode where a 9-of-10-rows match would silently hide the 10th unparsed line behind structured chrome.

Blank lines are still skipped without bailing (carve-out for whitespace; verified by a separate test).

Skipped the alternative actionable suggestion (preserve unparsed rows as sentinel CronListItem entries with a raw=True flag). Bail-out is simpler — no new dataclass field, single return path — and matches the existing fallback semantics at list granularity. Either shape addresses the underlying issue.

CR's exact suggested test (test_cronlist_mixed_lines_falls_back_to_raw_text) lands verbatim. Added a companion test_cronlist_blank_lines_dont_bail_out to lock in the whitespace carve-out.

CI: 29/29 in test_cron_rendering.py; ruff + pyright clean.

@cboos
Copy link
Copy Markdown
Collaborator Author

cboos commented May 10, 2026

(Claude) Visual polish round on user feedback after rendering against the real fixture — 7f4ff89.

1. ScheduleWakeup / CronCreate prompts render as Markdown. The prompt is typically Markdown content (slash commands, inline code, prose) rather than preformatted text. HTML side switches from _format_collapsible_prompt (which wrapped in <pre>) to render_markdown_collapsible; Markdown side stops wrapping in _code_fence and emits the prompt directly. The unused _format_collapsible_prompt helper is removed.

2. Light borders + padded cells on CronList table. Default unstyled tables read as raw rows pasted into the page; new CSS class .cronlist-output-table matches the visual weight of the sibling .system-content / .command-output-content panels (#137 lineage).

3. Cross-link CronList row ids and CronDelete result ids back to the originating CronCreate card. New renderer pass _link_cron_jobs_by_id builds a job_id → message_index map from CronCreate outputs and stamps creating_call_message_index on each consumer (CronListItem, CronDeleteOutput). HTML formatters wrap the rendered id in <a class='cron-id-backlink'> when set.

Same pattern as PR #147's Monitor → TaskNotification backlink, but in the reverse direction (consumer → originating call) and matched by job id rather than tool_use_id. Markdown side keeps the inline-code id without a clickable link (per the established finding from #147 — Markdown only has session-level anchors, no message-level targets).

4. CronDelete parser extracts the cancelled job id so the cross-link pass has a key. New _CRON_DELETE_ID_RE and CronDeleteOutput.job_id field; falls back to None for unrecognised formats.

The cross-link affordance is subtle (dotted underline, inherited colour, accent on hover) — visible to anyone scanning for affordances but doesn't draw attention away from the structured content.

Tests: 2 new cross-link assertions (test_html_cronlist_id_links_back_to_croncreate, test_html_crondelete_result_links_back_to_croncreate), plus updates to the prompt-rendering / formatter assertions for the new shapes. Replaced the brittle html.count("CronList") == 1 assertion (CSS comments now legitimately mention the tool name) with the more specific "<span class='tool-summary'>CronList" not in html shape check — same regression class, no false positives from documentation strings.

Snapshot drift is 308 lines added across 8 fixtures — all CSS additions for the new .cronlist-output-table and .cron-id-backlink rules. No content drift.

CI: 1667 unit pass, ruff + pyright clean.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@claude_code_log/html/templates/components/message_styles.css`:
- Line 485: The CSS uses the keyword value "currentColor" which violates the
stylelint rule expecting lowercase; update the declaration that sets
border-bottom: 1px dotted currentColor; to use the lowercase keyword
"currentcolor" (i.e., change currentColor → currentcolor) so the value casing
matches lint rules and other declarations that reference the same keyword.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d120c426-63d9-4670-804c-b4552ad6f9be

📥 Commits

Reviewing files that changed from the base of the PR and between 678b9ea and 7f4ff89.

📒 Files selected for processing (8)
  • claude_code_log/factories/tool_factory.py
  • claude_code_log/html/templates/components/message_styles.css
  • claude_code_log/html/tool_formatters.py
  • claude_code_log/markdown/renderer.py
  • claude_code_log/models.py
  • claude_code_log/renderer.py
  • test/__snapshots__/test_snapshot_html.ambr
  • test/test_cron_rendering.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • claude_code_log/factories/tool_factory.py

Comment thread claude_code_log/html/templates/components/message_styles.css Outdated
cboos added 7 commits May 10, 2026 17:59
The built-in scheduling tool family — ``ScheduleWakeup``,
``CronCreate``, ``CronList``, ``CronDelete`` — used to fall through
to the generic params-table render. Add specialised rendering across
HTML and Markdown for all four, following the same recipe as the
``Monitor`` tool work in #142/PR #147.

## Models

- ``ScheduleWakeupInput`` (Pydantic): ``delaySeconds`` / ``reason``
  / ``prompt``. Used by ``/loop`` dynamic mode for self-paced ticks.
- ``CronCreateInput`` (Pydantic): ``cron`` / ``prompt`` (required) +
  ``recurring`` / ``durable`` (optional). Schedules a prompt at a
  cron-matched time.
- ``CronListInput`` (Pydantic, no fields): kept as an explicit model
  so dispatch routes through the typed-input pipeline rather than
  the generic fallback.
- ``CronDeleteInput`` (Pydantic): ``id``.
- Output dataclasses: ``ScheduleWakeupOutput`` (parsed clock /
  delay), ``CronCreateOutput`` (parsed job_id), ``CronListOutput``
  (parsed list of ``CronListItem`` rows), ``CronDeleteOutput``
  (text only). Each captures the raw text as fallback so the
  renderer always has something to show even when the harness's
  exact format drifts.

## Factory parsers

Output parsers extract structured fields when the format matches and
fall back gracefully when it doesn't. Char classes are tolerant
(``[\\w-]+`` for ids per monk's #147 nit). The cron-list row regex
allows the prompt body to contain ``=>`` and the optional
``[recurring, durable]`` flag tokens; lines that don't match are
quietly skipped, leaving the raw text available as fallback.

## HTML formatters

A small shared helper, ``_format_collapsible_prompt``, renders the
``prompt`` field with the same line-count-badge collapsibility as
the Monitor tool's ``command`` field — multi-line / long prompts
collapse via ``render_collapsible_code``, short single-line prompts
render in a plain ``<pre>``. Used by both ``ScheduleWakeup`` and
``CronCreate``. Per-tool formatters compose:

- ``format_schedulewakeup_input``: 3-row grid, prompt collapsible.
- ``format_croncreate_input``: cron + optional flags + collapsible
  prompt. Optional flags only render when explicitly set (the
  harness's defaults are near-universal; showing ``recurring=True``
  on every row would be noise).
- ``format_cronlist_input``: empty body — title is sufficient.
- ``format_cronlist_output``: structured table when ``jobs`` is
  populated, plain ``<pre>`` fallback otherwise. Prompt previews
  truncated at 80 chars in the list view since the full prompt
  lives in the originating ``CronCreate`` card upstream.
- ``format_crondelete_input``: empty body — id is in the title.
- All output formatters render the harness status line verbatim in
  a typed ``<div>`` (matching the Monitor output shape).

## HTML titles

All four tools share the ``⏰`` (alarm clock) icon for the family.
Per-tool summary in the title so the reader can identify the call
without expanding:

- ``⏰ ScheduleWakeup +<delay>s — <reason>``
- ``⏰ CronCreate <cron>``
- ``⏰ CronList``
- ``⏰ CronDelete <id>``

## Markdown rendering

Mirrors the HTML shape in plain Markdown:

- Bullet list of scalar fields, prompt in a fenced ``_code_fence``
  block (adaptive width, handles backticks).
- Titles wrap user-supplied values in ``_inline_code`` per the same
  ``WebSearch`` / ``WebFetch`` / ``Skill`` / ``Monitor`` precedent —
  protects against markdown emphasis / heading metacharacters in
  reasons / cron expressions / ids.
- ``CronList`` body renders structured rows as a bullet list when
  parseable, fenced raw text otherwise.

## Tests (24 new)

``test/test_cron_rendering.py``:

- ``TestSchedulingInputModels`` (4) — the four input models.
- ``TestSchedulingOutputParsers`` (8) — happy path + fallback for
  each parser, including the cron-list "unrecognised format" path
  that exercises the graceful-degradation branch.
- ``TestSchedulingHtmlFormatters`` (6) — grid completeness,
  long-prompt collapsibility, optional-flag omission, structured-
  vs-fallback CronList rendering.
- ``TestSchedulingFixtureRendering`` (6) — drives the real
  HTML/Markdown renderers against a JSONL fixture with one call per
  tool in the family. Asserts titles, grids, the structured cron
  list, and the markdown inline-code escaping for user-supplied
  values (reason, cron, id).

Fixture ``test/test_data/cron_tools.jsonl``: ScheduleWakeup →
CronCreate → CronList → CronDelete in sequence; ScheduleWakeup
adapted from a real invocation in alice's transcripts, the Cron*
calls constructed against the documented schemas (no real Cron*
invocations exist in any of my transcripts yet — the tools are
listed as deferred but haven't been used in the wild).

## CI

- ``just test``: 1659 pass, 7 skipped (was 1635; +24 new).
- ``just test-tui``: 67 pass.
- ``just test-browser``: 39 pass, 1 skipped.
- ruff + pyright clean on production files.
- No snapshot drift — no existing fixtures carry these tools.
``title_CronListInput`` passed the tool name ("CronList") as the
``summary`` argument to ``_tool_title``. But ``_tool_title`` already
emits the message's tool name (``content.tool_name``) ahead of any
summary span — passing the same value as the summary therefore
rendered ``CronList`` twice in the HTML title:

    🛠️ ⏰ CronList <span class='tool-summary'>CronList</span>

Drop the literal summary so the no-summary branch fires, matching
the docstring's intent and the Markdown title shape (``⏰ CronList``).
The other three tools in the family pass distinct summaries (delay,
cron expression, id), so they don't trigger the duplication — the
guard is specifically for zero-input tools whose title summary
would naturally be the tool name.

## Test tightening

The original assertion was ``"CronList" in html``, which is satisfied
by both the correct (one occurrence) and broken (two occurrences)
output. Tightened to:

- ``html.count("CronList") == 1`` — pins the cardinality.
- ``"<span class='tool-summary'>CronList" not in html`` — pins the
  absence of the duplicate-summary shape specifically.

Both assertions catch the same regression class; either alone would
fail loudly if a future zero-input-tool title accidentally passes
its name as the summary.

CI: ``test_cron_rendering.py`` 24/24 still pass; ruff + pyright
clean.
CR caught a real bug in ``parse_cronlist_output``: the optional
``[flags]`` group at the end of ``_CRON_LIST_LINE_RE`` greedily
captured *any* trailing bracketed text, including legitimate prompt
suffixes. A row like::

    - cj_xyz: 0 9 * * * => /loop tick [important context]

parsed with ``prompt='/loop tick'`` and ``flags='important context'``
— silently dropping the suffix and producing nonsense recurring /
durable values.

## Fix

Validate captured flags against a known-flags whitelist
(``_KNOWN_CRON_FLAGS = {'recurring', 'durable'}``). When none of the
captured tokens match a known flag, restore the bracketed content to
the prompt and treat the row as flag-less.

When at least one token matches a known flag, accept the bracket as
flags (CR explicitly endorsed this trade-off — unknown stragglers
inside an otherwise-known flag list are dropped). The whitelist
keeps the contract honest: only documented flags are real flags.

## Folded in: monk's #148 micro-nit

While in the parser, replaced ``True or None`` / ``False or None``
with the more self-documenting ``True if X else None`` per monk's
review nit on #148 (msg #2779). Same behaviour, clearer intent.

## Tests

Two new cases in ``TestSchedulingOutputParsers``:

- ``test_cronlist_preserves_bracketed_prompt_suffix`` — pins the
  exact CR scenario; pre-fix would have failed by stripping
  ``[important context]`` from the prompt.
- ``test_cronlist_keeps_real_flags_alongside_known_tokens`` —
  confirms recognised flags still parse correctly when the prompt
  body itself contains other punctuation.

CI: 26/26 ``test_cron_rendering.py`` pass; ruff + pyright clean.
Three concrete improvements after rendering against a real-data
sample (live CronCreate / CronList / CronDelete experiment in a
session, captured before this commit):

## 1. Fix the wrench prefix on ⏰-titled tools (``starts_with_emoji``)

The transcript template prepends ``🛠️`` to a tool-use title unless
the title already starts with an emoji recognised by
``html/utils.py::starts_with_emoji``. That function whitelists
specific Unicode ranges, but the ranges miss the **Misc Technical**
block (U+2300–U+23FF) where ``⏰`` (U+23F0) lives. So titles like
``⏰ ScheduleWakeup …`` rendered as ``🛠️ ⏰ ScheduleWakeup …``.

Added the Misc Technical range to the whitelist (along with a
docstring callout listing all glyphs that benefit:
``⏰ ⏳ ⏱️ ⏲️ ⏸ ⏹ ⏺ ⏏``). Other tools using emoji from this block
(currently none in the codebase, but several candidates in the
range) get the suppression for free.

## 2. ScheduleWakeup / CronCreate body = prompt only

Both tools' titles already carry the most informative scalars
(``+<delay>s — <reason>`` for ScheduleWakeup, ``<cron>`` for
CronCreate). The body's previous params-table repeated the same
values plus a ``recurring`` / ``durable`` row that the harness's
own confirmation echoes back in human-readable form
(``Scheduled recurring job <id> ... Session-only ...``). Drop the
table entirely; render just the prompt as collapsible code via the
existing ``_format_collapsible_prompt`` helper.

The ``_row_html`` helper that backed the table is now dead — removed.

## 3. Parser regexes match the harness's actual output format

A live experiment (CronCreate every 2 minutes, CronList, CronDelete
within minutes) showed my synthetic fixtures and parser regexes
were misaligned with what the harness really emits:

- **CronCreate**: ``Scheduled cron job <id>`` → real:
  ``Scheduled recurring job <id> (Every 2 minutes). Session-only…``
  (or ``one-shot`` for non-recurring). Updated regex to accept
  both ``recurring`` and ``one-shot`` kinds.
- **CronList row**: ``- <id>: <cron> => <prompt> [flags]`` → real:
  ``<id> — <description> (recurring) [session-only]: <prompt>…``.
  No leading hyphen, em-dash separator, parens for the kind,
  brackets for the scope, no cron expression (only a human-readable
  ``description``), prompt often truncated by the harness with
  ``…``. Regex completely rewritten; ``CronListItem.cron`` field
  renamed to ``description`` to reflect the actual content.
- **CronDelete**: ``Deleted cron job <id>`` → real:
  ``Cancelled job <id>.``. Parser unchanged (still text-verbatim);
  fixture and tests updated.

The previous fix for the bracketed-prompt-suffix bug (CR finding on
PR #152) and the ``_KNOWN_CRON_FLAGS`` whitelist are dropped because
the new format separates flags (parens) from prompts (after a
colon) — there's no greedy-bracket vulnerability anymore.

Fixture ``test/test_data/cron_tools.jsonl`` updated with real
harness text. Tests rewritten against the actual format with two
extra edge cases:

- ``test_croncreate_extracts_job_id_oneshot`` — confirms the
  ``one-shot`` kind variant.
- ``test_cronlist_durable_scope_sets_durable_flag`` — confirms the
  durable scope (separate from the recurring/one-shot kind).

## 4. tool-renderer skill updated

Added a "Watch out: the template's wrench-suppression is
emoji-range-gated" section under "Add Title Method" with the full
list of recognised Unicode ranges and the recipe for verifying
your icon (render fixture + grep for ``🛠️`` co-occurring) /
extending the range when your icon falls outside.

## CI

- ``just test``: 1664 pass, 7 skipped (was 1659 with the prior
  bracketed-suffix tests; the net +5 reflects the test rewrite —
  removed 2 obsolete bracket-handling tests, added 5 real-format
  tests including the wrench-suppression regression guard).
- ruff + pyright clean.
- Fixture rendered locally to verify: ``🛠️ ⏰`` co-occurrence is
  zero (the suppression works on the real fixture).
Partial-table failure mode: the per-line ``continue`` in
``parse_cronlist_output`` would silently drop unparsed lines from
the structured output. A 9-of-10-rows CronList payload with one
drifted format would render as a 9-row table, hiding the 10th
line behind structured chrome.

Tighten to all-or-nothing: if any non-empty line fails the row
regex, return ``jobs=[]`` so the renderer's existing
``if not output.jobs:`` branch fires and shows the raw text. This
matches the documented "structured-when-parseable, raw text
otherwise" semantics — at the granularity of the whole list, not
per-line.

Blank lines are skipped without triggering the bail-out (they're
noise, not data). The real harness output is single-line per row
with no headers (verified against the live experiment), so the
bail-out doesn't fire on real data today; it kicks in only when
the format drifts in a way the parser doesn't recognise.

## Tests

- ``test_cronlist_mixed_lines_falls_back_to_raw_text`` — pins the
  partial-parse bail-out behaviour (CR's nitpick test, mirrored
  verbatim).
- ``test_cronlist_blank_lines_dont_bail_out`` — guards the blank-
  line carve-out so the bail-out doesn't trigger on whitespace.

## Skipped

CR's actionable suggested an alternative: preserve unparsed rows
as sentinel ``CronListItem`` entries with a ``raw=True`` flag. The
bail-out approach is simpler (no new dataclass field, single
return path) and matches the existing fallback semantics. Either
shape addresses the partial-table issue; bail-out is the cheaper
fix.

CI: ``test_cron_rendering.py`` 29/29 pass; ruff + pyright clean.
User feedback after rendering against the real-data fixture:

1. **ScheduleWakeup / CronCreate prompts render as Markdown.** The
   prompt is typically Markdown content (slash commands, inline
   code, prose) rather than preformatted text. Switch HTML side from
   ``_format_collapsible_prompt`` (which wrapped in ``<pre>``) to
   ``render_markdown_collapsible``; switch Markdown side from
   ``self._code_fence(prompt)`` to emitting the prompt directly into
   the document so CommonMark renders it. Removed the now-unused
   ``_format_collapsible_prompt`` helper.

2. **Light borders + padded cells on CronList table.** Default
   unstyled tables read as raw rows pasted into the page; new CSS
   class ``.cronlist-output-table`` matches the visual weight of
   the sibling ``.system-content`` / ``.command-output-content``
   panels (#137 lineage).

3. **Cross-link CronList row ids and CronDelete result ids back to
   the originating CronCreate card.** New renderer pass
   ``_link_cron_jobs_by_id`` builds a ``job_id → message_index``
   map from CronCreate outputs (parsed by
   ``parse_croncreate_output``) and stamps
   ``creating_call_message_index`` on each consumer
   (``CronListItem``, ``CronDeleteOutput``). HTML formatters wrap
   the rendered id in an ``<a class='cron-id-backlink'>`` pointing
   at the card's anchor when the field is set; otherwise plain
   ``<code>`` as before.

   Same pattern as PR #147's Monitor → TaskNotification backlink,
   but in the reverse direction (consumer → originating call) and
   matched by job id rather than tool_use_id.

   Markdown side keeps the inline-code id without a clickable link
   (per the established finding from #147 — Markdown only has
   session-level anchors, no message-level targets).

4. **CronDelete parser extracts the cancelled job id.** Added
   ``_CRON_DELETE_ID_RE`` and ``CronDeleteOutput.job_id`` so the
   cross-link pass has a key to look up. Falls back to ``None`` for
   unrecognised formats; rendering degrades to plain status text.

## CronList CSS

```css
.cronlist-output-table {
    margin-top: 8px;
    border: 1px solid #11;
    border-radius: 4px;
    background-color: #1e1e1e08;
    border-collapse: collapse;
    font-size: 0.9em;
    overflow: hidden;
}
.cronlist-output-table th, .cronlist-output-table td {
    padding: 6px 12px;
    border-bottom: 1px solid #11;
}
.cronlist-output-table tr:last-child td { border-bottom: none; }
.cron-id-backlink {
    color: inherit;
    text-decoration: none;
    border-bottom: 1px dotted currentColor;
}
.cron-id-backlink:hover {
    color: var(--user-color);
    border-bottom-style: solid;
}
```

The cross-link affordance is subtle (dotted underline, inherited
colour) — visible to anyone scanning for affordances but doesn't
draw attention away from the structured row content.

## Tests

Updated assertions for the new shapes; added two cross-link tests:

- ``test_html_cronlist_id_links_back_to_croncreate`` — pins the
  anchor href against the CronCreate's ``msg-d-N`` id.
- ``test_html_crondelete_result_links_back_to_croncreate`` — pins
  the inline anchor wrapping the cancelled id within the status
  text.

Replaced the brittle ``html.count("CronList") == 1`` assertion (CSS
comments now legitimately mention the tool name) with the more
specific ``"<span class='tool-summary'>CronList" not in html``
shape check — same regression class, no false positives from
documentation strings.

Snapshot drift: 308 lines added to ``test_snapshot_html.ambr``,
all CSS additions across 8 fixtures (the new ``.cronlist-output-
table`` and ``.cron-id-backlink`` rules). No content drift.

## CI

- ``just test``: 1667 pass, 7 skipped (was 1664; +3 net from new
  tests minus the count assertion replacement).
- ruff + pyright clean.
Rebased onto current ``main`` (which advanced to ``7c2e6f6 Style
sidechain filter toggle with dashed border``). The new CSS rule
embeds in every rendered HTML, so all 7 snapshots gain a +4-line
addition for the ``.filter-toggle[data-type="sidechain"]`` rule.

Pure CSS-rule drift; no content drift. CI on the previous push
caught the rebase staleness — local snapshots had been updated
against the older ``message_styles.css`` set, but CI generated
HTML that included main's newer sidechain-filter CSS, producing
the diff. Updating the snapshot file against the current main
brings the two views back in sync.
@cboos cboos force-pushed the dev/cron-tool-rendering branch from 7f4ff89 to 0045805 Compare May 10, 2026 16:02
CodeRabbit's stylelint rule expects lowercase ``currentcolor``
(matching the CSS spec — keywords are case-insensitive but the
project lints for the canonical lowercase form). One occurrence in
the new ``.cron-id-backlink`` rule.

Snapshot drift is the same change × 7 fixtures — pure casing flip,
no semantic difference.
@cboos
Copy link
Copy Markdown
Collaborator Author

cboos commented May 10, 2026

(Claude) The latest CR review's actionable (currentColorcurrentcolor in message_styles.css:485) was already applied in 9946f0e — CR's review was generated against the previous commit (7f4ff89) before the fix landed. CI is now fully green on 9946f0e (CodeRabbit + 11 test matrix). Branch ready to merge.

@cboos cboos merged commit 293f4af into main May 10, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for ScheduleWakeup and other Cron* tasks

1 participant