
feat(library): add statistics dashboard panel #158

Merged
nikazzio merged 4 commits into main from feat/library-stats-dashboard-124
Apr 18, 2026
Conversation

@nikazzio
Owner

Summary

  • New lazy-loaded statistics panel in the Library view, below the existing KPI strip.
  • Loaded via hx-get="/api/library/stats" hx-trigger="load" so the main Library page is not blocked.
  • Stats always reflect the full collection, independent of active filters.
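The lazy-loading pattern in the summary can be sketched with a plain HTML placeholder. This is only an illustration built with the standard library: the helper name `lazy_panel_placeholder`, the element id, the `hx-swap` choice, and the loading text are assumptions, not taken from the PR; only the `hx-get` URL and `hx-trigger="load"` come from the description.

```python
from html import escape


def lazy_panel_placeholder(url: str, element_id: str = "library-stats") -> str:
    """Return an HTML placeholder that HTMX replaces after the page has loaded.

    Hypothetical helper: the real component is built with the project's own
    UI layer, which is not shown in this conversation.
    """
    return (
        f'<div id="{escape(element_id, quote=True)}" '
        f'hx-get="{escape(url, quote=True)}" '
        # hx-trigger="load" fires the request as soon as the element is in the DOM,
        # so the surrounding page renders without waiting for the stats.
        'hx-trigger="load" hx-swap="outerHTML">'
        "Loading statistics...</div>"
    )


placeholder = lazy_panel_placeholder("/api/library/stats")
print(placeholder)
```

Because the placeholder carries no data of its own, the main page cost stays constant no matter how slow the stats endpoint is.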

Metrics shown

| Metric | Source |
| --- | --- |
| Downloaded pages ("Pagine scaricate"), count + % | downloaded_canvases / total_canvases from vault |
| Transcribed pages ("Pagine trascritte"), count + % | Per-manuscript transcription.json, full_text field |
| OCR pages ("Pagine OCR"), count + % | Same, filtered by is_manual: false |
| Disk space ("Spazio disco") | Sum of file sizes under each local_path |
| Distribution by library ("Distribuzione per biblioteca") | Provider breakdown, top 8, pure-CSS percentage bars |
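The "count + %" and disk-space metrics need a zero-safe percentage and a human-readable byte formatter. A minimal sketch follows; the function names `percent` and `human_bytes`, the binary units, and the rounding are assumptions for illustration and may differ from what library_stats.py does.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 for an empty collection."""
    return 0.0 if whole <= 0 else 100.0 * part / whole


def human_bytes(num_bytes: int) -> str:
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    size = float(num_bytes)
    units = ("B", "KiB", "MiB", "GiB", "TiB")
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if size < 1024 or unit == units[-1]:
            return f"{size:.0f} {unit}" if unit == "B" else f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} {units[-1]}"  # not reached; keeps type checkers satisfied


print(percent(3, 4), human_bytes(1536))
```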

Implementation

  • src/studio_ui/components/library_stats.py — new component (175 LOC)
  • GET /api/library/stats — new route, registered in library.py
  • library_handlers.library_stats_panel() — new handler
  • library.py component — HTMX placeholder div added between KPI strip and filters

Test plan

  • ruff check — clean
  • ruff check --select C901 — all functions under complexity 10
  • 415 tests passed, 0 failed

Closes #124

Add a lazy-loaded statistics panel to the Library view that shows
aggregate metrics across the entire manuscript collection.

- new component library_stats.py with render_library_stats()
- metrics: downloaded pages (count + %), transcribed pages, OCR pages,
  disk usage (human-readable bytes)
- provider distribution via pure-CSS percentage bars, top 8 libraries
- transcription and OCR coverage computed by scanning per-manuscript
  transcription.json files (engine + is_manual fields)
- disk usage computed by summing file sizes under each local_path
- loaded lazily via hx-get="/api/library/stats" hx-trigger="load"
  so the main Library page is not blocked
- new GET /api/library/stats route; stats always reflect the full
  collection regardless of active filters

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nikazzio added labels on Apr 17, 2026:
  • area:library (Library local assets and catalog views)
  • minor (Increments the minor version when adding new functionality in a backward-compatible manner.)
  • priority:P2 (Medium priority)
  • semver:none (No release impact by itself)
  • type:feature (New user-facing feature)
@codecov-commenter

codecov-commenter commented Apr 17, 2026

Codecov Report

❌ Patch coverage is 0% with 158 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.90%. Comparing base (00bf98a) to head (5cde216).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/studio_ui/components/library_stats.py | 0.00% | 133 Missing ⚠️ |
| src/studio_ui/routes/stats_handlers.py | 0.00% | 17 Missing ⚠️ |
| src/studio_ui/routes/stats.py | 0.00% | 5 Missing ⚠️ |
| src/studio_app.py | 0.00% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #158      +/-   ##
==========================================
- Coverage   70.95%   69.90%   -1.06%     
==========================================
  Files         153      156       +3     
  Lines       13229    13437     +208     
==========================================
+ Hits         9387     9393       +6     
- Misses       3842     4044     +202     
| Flag | Coverage Δ |
| --- | --- |
| fast | 69.88% <0.00%> (-1.06%) ⬇️ |
| slow | 69.90% <0.00%> (-1.06%) ⬇️ |


Removes inline stats from the Library page and introduces:
- Compact DB-only widget in sidebar footer (mss count, pages, % local)
  loaded lazily via /api/stats/sidebar; hidden when sidebar is collapsed
- Dedicated /stats route with fast DB metrics (manuscript count, pages,
  provider distribution, recent activity) plus lazy /api/stats/detail
  panel for slow disk + transcription scans
- 📊 Statistiche nav item added to sidebar

Closes #124

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Adds a new Statistics area to the Studio UI, including a dedicated /stats page and a lazy-loaded sidebar “nerd stats” widget, to surface collection-wide library metrics without blocking initial page render.

Changes:

  • Introduce /stats page with fast (DB-only) metrics and a lazy-loaded “detail” panel.
  • Add /api/stats/sidebar (sidebar widget) and /api/stats/detail (disk/transcription scan) endpoints.
  • Add a “Statistiche” entry to the main sidebar nav and register stats routes in the app.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/studio_ui/routes/stats_handlers.py | Implements handlers for stats page, sidebar widget fragment, and detail fragment. |
| src/studio_ui/routes/stats.py | Registers /stats and /api/stats/* endpoints. |
| src/studio_ui/components/library_stats.py | Implements UI components + filesystem/JSON scanning helpers for stats metrics. |
| src/studio_ui/components/layout.py | Adds “Statistiche” nav item and an HTMX placeholder to load the sidebar stats widget. |
| src/studio_app.py | Registers stats routes during app startup. |


Comment on lines +56 to +72
        dt = datetime.fromisoformat(str(ts_str).replace("Z", "+00:00"))
        delta = datetime.now(timezone.utc) - dt.astimezone(timezone.utc)
        days = delta.days
        if days == 0:
            hours = delta.seconds // 3600
            return "poco fa" if hours == 0 else f"{hours}h fa"
        if days == 1:
            return "ieri"
        if days < 7:
            return f"{days}g fa"
        if days < 30:
            return f"{days // 7}sett fa"
        if days < 365:
            return f"{days // 30}m fa"
        return f"{days // 365}a fa"
    except Exception:
        return "—"

Copilot AI Apr 17, 2026

_time_ago() treats SQLite updated_at values like YYYY-MM-DD HH:MM:SS as naive datetimes, then converts them with astimezone(timezone.utc), which interprets the naive value as local time and can shift the relative time display. Consider detecting naive timestamps and explicitly treating them as UTC (e.g., attach timezone.utc before computing the delta).
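The suggestion above can be sketched as a small parsing helper. The name `parse_as_utc` is hypothetical; the sketch assumes timestamps are either ISO 8601 strings with an offset (or a trailing `Z`) or naive SQLite-style `YYYY-MM-DD HH:MM:SS` values that should be read as UTC.

```python
from datetime import datetime, timezone


def parse_as_utc(ts_str: str) -> datetime:
    """Parse a timestamp string and return an aware datetime in UTC.

    Naive values are assumed to already be in UTC, so the zone is attached
    rather than converted; aware values are converted normally.
    """
    dt = datetime.fromisoformat(ts_str.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        # replace() only labels the value as UTC; astimezone() on a naive
        # datetime would first interpret it as local time.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


print(parse_as_utc("2026-04-17 10:00:00"))
print(parse_as_utc("2026-04-17T12:00:00+02:00"))
```

Both calls describe the same instant, so a relative-time label computed from either one should be identical regardless of the server's local zone.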

Copilot uses AI. Check for mistakes.
Comment on lines +109 to +113
def _dir_size(path: Path) -> int:
    total = 0
    with suppress(OSError):
        total = sum(f.stat().st_size for f in path.rglob("*") if f.is_file())
    return total

Copilot AI Apr 17, 2026

_dir_size() suppresses a single OSError for the whole directory scan; if any file stat fails (transient delete, permission, etc.), the function returns 0 and disk usage becomes wildly inaccurate. It would be more robust to handle OSError per-file (skip unreadable files) so partial failures don't zero-out the entire directory size.
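One way to handle errors per file is shown below. This is a sketch, not the fix that was merged: the name `dir_size` and the switch from `Path.rglob` to `os.walk` are choices made here for illustration. `os.walk` ignores unreadable directories by default, and each `os.stat` call is guarded on its own.

```python
import os


def dir_size(path: str) -> int:
    """Sum file sizes under path, skipping entries that cannot be read."""
    total = 0
    # With the default onerror=None, os.walk silently skips directories
    # it cannot list instead of raising.
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.stat(os.path.join(root, name)).st_size
            except OSError:
                # File vanished, permission denied, broken symlink, etc.
                # Skip it and keep the partial total.
                continue
    return total


print(dir_size("."))
```

A missing or unreadable root simply yields a total of 0, while a single bad file only removes its own size from the sum.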

Comment on lines +116 to +128
def _scan_disk_usage(manuscripts: list[dict]) -> int:
    """Return total bytes used across all local manuscript directories."""
    total = 0
    seen: set[str] = set()
    for m in manuscripts:
        lp = m.get("local_path")
        if not lp or lp in seen:
            continue
        seen.add(lp)
        p = Path(lp)
        if p.exists():
            total += _dir_size(p)
    return total

Copilot AI Apr 17, 2026

_scan_disk_usage() trusts local_path from the DB and will rglob() any existing path. If the DB becomes corrupted (or a future migration writes unexpected values), this endpoint could end up scanning outside the downloads directory (very slow and potentially leaking information via aggregated sizes). VaultManager’s delete_manuscript() explicitly guards local_path to be under the configured downloads dir; it would be good to apply the same safety check here (resolve path and skip anything outside downloads base).
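A containment check of the kind described can be sketched with `pathlib`. The helper name `is_inside` and the choice to reject the base directory itself are assumptions made here; how VaultManager implements its own guard is not shown in this conversation.

```python
from pathlib import Path


def is_inside(base: Path, candidate: Path) -> bool:
    """Return True only if candidate resolves to a strict descendant of base.

    Both paths are resolved first so that ".." segments and symlinks
    cannot be used to escape the base directory.
    """
    base_resolved = base.resolve()
    candidate_resolved = candidate.resolve()
    return base_resolved in candidate_resolved.parents


downloads = Path("downloads")  # hypothetical downloads base directory
print(is_inside(downloads, downloads / "ms-001"))
print(is_inside(downloads, downloads / ".." / "elsewhere"))
```

A scan loop would call this before walking each `local_path` and skip any entry for which it returns False.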

Comment on lines +6 to +10
def setup_stats_routes(app):
    """Register statistics page and API routes."""
    app.get("/stats")(stats_handlers.stats_page)
    app.get("/api/stats/sidebar")(stats_handlers.stats_sidebar_widget)
    app.get("/api/stats/detail")(stats_handlers.stats_detail_content)

Copilot AI Apr 17, 2026

The PR description/Issue #124 describe adding a Library view stats panel lazy-loaded from /api/library/stats (or /library/stats), but this change introduces a separate /stats page and /api/stats/* endpoints instead. Please reconcile the implementation with the stated route(s)/placement (either update the PR description/issue references, or add the Library panel + routes as described).

Comment thread src/studio_ui/routes/stats_handlers.py Outdated
Comment on lines +31 to +34
def stats_detail_content():
    """Return the lazy-loaded detail metrics panel (disk + transcription scan)."""
    manuscripts = VaultManager().get_all_manuscripts()
    return render_library_stats(manuscripts)

Copilot AI Apr 17, 2026

/api/stats/detail performs filesystem scans (disk usage via rglob, plus per-manuscript JSON reads) on every load. Even though it's lazy-loaded, this can still be expensive for large libraries and can regress the <500ms target mentioned in Issue #124. Consider caching the computed stats with a short TTL (in-memory or persisted) and/or incrementally updating from DB events to avoid repeated full scans.
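The short-TTL caching idea could look roughly like the sketch below. The class name `TTLCache`, the injectable clock, and the use of `None` as the "never stored" marker are assumptions for illustration; the sketch is single-threaded and does not reflect how the handler was eventually changed.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Cache a single computed value for ttl_seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._value: Optional[T] = None
        self._stored_at: Optional[float] = None  # None means nothing cached yet

    def get_or_compute(self, compute: Callable[[], T]) -> T:
        now = self._clock()
        expired = self._stored_at is None or now - self._stored_at >= self._ttl
        if expired:
            # Run the expensive scan only when there is no fresh value.
            self._value = compute()
            self._stored_at = now
        return self._value  # type: ignore[return-value]


cache: TTLCache[int] = TTLCache(ttl_seconds=300)
print(cache.get_or_compute(lambda: 42))
```

Using a monotonic clock avoids surprises from wall-clock adjustments, and passing the clock in keeps expiry behaviour testable without sleeping.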

Comment on lines +6 to +10
def setup_stats_routes(app):
    """Register statistics page and API routes."""
    app.get("/stats")(stats_handlers.stats_page)
    app.get("/api/stats/sidebar")(stats_handlers.stats_sidebar_widget)
    app.get("/api/stats/detail")(stats_handlers.stats_detail_content)

Copilot AI Apr 17, 2026

There’s good test coverage for other route modules (Discovery/Export/Library/etc.), but the new /stats page and /api/stats/sidebar + /api/stats/detail endpoints don’t appear to have tests. Adding route/handler tests would help prevent regressions (e.g., HX vs full-page behavior, and that sidebar/detail endpoints return expected fragments).

Comment on lines +221 to +225
    top = sorted(provider_counts.items(), key=lambda x: -x[1])[:10]
    provider_panel = Div(
        P("Distribuzione per biblioteca", cls=_SECTION_LABEL_CLS),
        Div(*[_provider_bar_row(n, c, total) for n, c in top], cls="flex flex-col gap-2"),
        cls=_CARD_CLS + " mb-6",

Copilot AI Apr 17, 2026

The PR description says the provider breakdown shows the “top 8”, but the implementation slices [:10]. Please align the code with the intended number (or update the description) so the UI/expectations match.
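One way to keep the code and the description in step is to name the limit once. The constant `TOP_PROVIDERS`, the helper `top_providers`, and the sample counts below are hypothetical; the value 8 is the one stated in the PR description, not necessarily the one that shipped.

```python
from collections import Counter
from typing import Dict, List, Tuple

TOP_PROVIDERS = 8  # single source of truth for the "top N" shown in the panel


def top_providers(provider_counts: Dict[str, int], limit: int = TOP_PROVIDERS) -> List[Tuple[str, int]]:
    """Return the `limit` providers with the most manuscripts, largest first."""
    return Counter(provider_counts).most_common(limit)


# Made-up sample data: ten providers with counts 1..10.
sample = {f"provider-{i}": i for i in range(1, 11)}
print(top_providers(sample))
```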

nikazzio and others added 2 commits April 18, 2026 19:47
- _time_ago: treat naive SQLite timestamps as UTC (not local time)
- _dir_size: handle OSError per-file so partial failures don't zero total
- _scan_disk_usage: guard local_path against paths outside downloads dir
- stats_detail_content: add 5-min in-memory TTL cache to avoid repeated scans
- tests: add 33 unit tests covering helpers, components, and route handlers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On fresh CI containers the monotonic clock can be < 300 s, making
timestamp 0.0 appear within the TTL window and the cache valid.
Seed with (now - TTL - 1) instead to guarantee expiry regardless
of uptime.
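The failure mode in this commit message can be reproduced with plain arithmetic. The sketch below uses a simulated clock reading rather than `time.monotonic()`, and the names `is_fresh`, `naive_seed`, and `safe_seed` are illustrative; only the 300 s TTL and the `now - TTL - 1` seed come from the commit messages.

```python
TTL_SECONDS = 300.0  # the 5-minute TTL mentioned in the earlier commit


def is_fresh(stored_at: float, now: float, ttl: float = TTL_SECONDS) -> bool:
    """Return True while a value stored at `stored_at` is still within the TTL."""
    return now - stored_at < ttl


# Simulated monotonic clock reading on a freshly booted CI container.
uptime = 120.0

naive_seed = 0.0                       # "empty cache" marker used before the fix
safe_seed = uptime - TTL_SECONDS - 1   # seed guaranteed to be older than the TTL

naive_looks_fresh = is_fresh(naive_seed, uptime)  # 120 - 0 < 300: empty cache looks valid
safe_looks_fresh = is_fresh(safe_seed, uptime)    # 120 - (-181) = 301: correctly expired

print(naive_looks_fresh, safe_looks_fresh)
```

The safe seed stays expired for any clock reading, because the difference between `now` and the seed is always TTL + 1 at the moment of seeding.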

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nikazzio nikazzio merged commit 0671163 into main Apr 18, 2026
6 checks passed
@nikazzio nikazzio deleted the feat/library-stats-dashboard-124 branch April 18, 2026 19:03

Development

Successfully merging this pull request may close these issues.

feat: library statistics dashboard

3 participants