Use local Signals and optional manual advice to spot model, cache, anomaly, and quota patterns.
Insights
The Insights page surfaces deterministic Signals computed locally over your ingested usage. Signals do not make network calls and do not use an LLM — every rule reads the same ParsedCall records that drive the rest of the dashboard.
Open it in the TUI by pressing i from any page or by cycling tabs with Tab. In the desktop app, click the Insights tab or press i.
In the TUI, Insights has two internal views: Advice and Signals. Use Left / Right (or [ / ]) to switch views, Up / Down or the mouse wheel to scroll the active view, and PageUp / PageDown for larger jumps.
Each Signal card carries:
- a severity stripe (Risk / Warn / Info),
- a category label,
- the scope the rule fired against (project, tool, model, or session),
- an estimated savings in your currency where applicable, with the underlying assumption stated on the card.
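The card fields above can be modeled roughly as follows. This is a hedged sketch: the type and field names (`SignalCard`, `Severity`, `assumption`) are illustrative, not the crate's actual definitions.

```rust
/// Illustrative shape of one Signal card; not the crate's real types.
#[derive(Debug, Clone, PartialEq)]
enum Severity {
    Risk,
    Warn,
    Info,
}

#[derive(Debug, Clone)]
struct SignalCard {
    severity: Severity,
    category: String,               // e.g. "Model right-sizing"
    scope: String,                  // project, tool, model, or session
    estimated_savings: Option<f64>, // None when no savings basis applies
    assumption: Option<String>,     // stated basis for the estimate
}

fn main() {
    let card = SignalCard {
        severity: Severity::Warn,
        category: "Model right-sizing".into(),
        scope: "project".into(),
        estimated_savings: Some(4.20),
        assumption: Some("re-priced through claude-haiku-4-5".into()),
    };
    println!("{:?} {} ({})", card.severity, card.category, card.scope);
}
```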
Manual LLM advice
The Insights page can also generate LLM Advice on demand. Advice is never triggered automatically by the application. In the desktop app, click Generate Advice, then choose either a redacted summary or limited prompt snippets for that run. In the TUI, press a for redacted-summary advice or A for prompt-snippet advice. Advice generation runs in the background; the status line shows when the job is running and updates when the saved run is available.
V1 supports Codex, Claude Code, and Gemini. Pick the advice tool from Config. Token Use runs the selected CLI from an app-owned working directory named Token Use App, then refreshes the archive. When the tool exposes local usage logs, the advice call itself appears in normal usage data under that project label.
Advice runs and items are stored locally in archive.db (advice_runs and advice_items). Failed or invalid JSON responses are stored with raw output and error details, so previous advice can be reviewed instead of disappearing.
Prompt text lives in files, not Rust or Svelte code. Shipped templates are copied on first use into:
$CONFIG/tokenuse/advice-prompts/system.md
$CONFIG/tokenuse/advice-prompts/user-redacted.md
$CONFIG/tokenuse/advice-prompts/user-snippets.md
Those files are user-editable. Templates use variables such as {signals_json}, {pricing_context}, {data_scope}, {prompt_snippets_json}, and {output_schema}.
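A minimal sketch of how this kind of placeholder substitution can work, using plain string replacement over `{name}` markers. The function name and approach are illustrative; the app's actual template rendering may differ.

```rust
use std::collections::HashMap;

/// Replace `{name}` placeholders in a template with provided values.
/// Placeholders with no matching value are left untouched.
fn render_template(template: &str, vars: &HashMap<&str, String>) -> String {
    let mut out = template.to_string();
    for (key, value) in vars {
        out = out.replace(&format!("{{{}}}", key), value);
    }
    out
}

fn main() {
    let mut vars = HashMap::new();
    vars.insert("signals_json", r#"[{"id":"cache_hit_trend_drop"}]"#.to_string());
    vars.insert("data_scope", "last 30 days".to_string());
    let prompt = render_template("Signals: {signals_json}\nScope: {data_scope}", &vars);
    println!("{prompt}");
}
```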
Accuracy guardrails:
- The LLM receives deterministic Signal evidence and pricing context; it explains, prioritizes, and proposes next steps.
- Advice items must cite local signal ids, sample counts, baseline windows, and confidence.
- Raw prompt snippets are included only when the per-run prompt-snippets option is selected.
Categories
Model right-sizing
| Rule | Trigger | Savings basis |
|---|---|---|
| short_output_sonnet | ≥ 50 Sonnet calls in a project with output_tokens < 200, input_tokens < 4000, no reasoning, and ≥ 30% of the project’s Sonnet calls | Re-priced through claude-haiku-4-5, scaled to weekly |
| fast_mode_opus_excess | Project’s fast-mode Opus spend > 2× standard-mode Opus and ≥ $5 | Each call’s fast-mode multiplier overhead, scaled to weekly |
| reasoning_heavy_o_series | ≥ 20 Codex o-series calls with reasoning_tokens / output_tokens > 3 and reasoning > 40% of cost | Conservative half of the reasoning bill |
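As a concrete reading of the first row, the short_output_sonnet trigger could be checked like this. The `Call` record is a simplified stand-in for ParsedCall, and the helper name is illustrative; only the thresholds come from the table above.

```rust
/// Simplified per-call record; field names are illustrative.
struct Call {
    model: String,
    input_tokens: u64,
    output_tokens: u64,
    reasoning_tokens: u64,
}

/// Fire when >= 50 Sonnet calls are "short" (output < 200, input < 4000,
/// no reasoning) AND those calls are >= 30% of the project's Sonnet calls.
fn short_output_sonnet_fires(project_calls: &[Call]) -> bool {
    let sonnet: Vec<&Call> = project_calls
        .iter()
        .filter(|c| c.model.contains("sonnet"))
        .collect();
    let short = sonnet
        .iter()
        .filter(|c| c.output_tokens < 200 && c.input_tokens < 4000 && c.reasoning_tokens == 0)
        .count();
    // Integer comparison avoids float rounding: short / sonnet >= 3 / 10.
    short >= 50 && short * 10 >= sonnet.len() * 3
}

fn main() {
    let mk = |out: u64| Call {
        model: "claude-sonnet-4-5".into(),
        input_tokens: 1000,
        output_tokens: out,
        reasoning_tokens: 0,
    };
    let mut calls: Vec<Call> = (0..60).map(|_| mk(100)).collect(); // short calls
    calls.extend((0..40).map(|_| mk(5000))); // long-output calls
    println!("fires: {}", short_output_sonnet_fires(&calls)); // 60 of 100 short
}
```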
Cache efficiency
Cache rules run only against tools that report cache metrics: Claude Code, Codex, and Gemini. For Cursor and Copilot, an Info card explains that their local logs don’t expose cache buckets.
| Rule | Trigger | Savings basis |
|---|---|---|
| cache_hit_trend_drop | 7-day hit rate vs prior 30-day baseline drops > 15 pp from a baseline ≥ 50% | Missing cache reads × (input − cache_read) per token |
| cache_write_overhead | Cache write/read ratio > 0.5 across ≥ 100 events | Excess writes × (cache_write − cache_read) |
| low_hit_project_outlier | Project hit rate < 0.5× the tool-wide median across ≥ 5 sessions | Gap to median × project input tokens × delta |
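The cache_hit_trend_drop row can be sketched as two small functions: one for the trigger, one for the savings basis. This assumes hit rates and prices are already computed; function names and price figures are illustrative.

```rust
/// Fire when the recent 7-day hit rate falls more than 15 percentage
/// points below a prior 30-day baseline that was itself at least 50%.
fn cache_hit_trend_drop(baseline_rate: f64, recent_rate: f64) -> bool {
    baseline_rate >= 0.50 && (baseline_rate - recent_rate) > 0.15
}

/// Savings basis: tokens that missed the cache, re-priced at the gap
/// between the input rate and the cheaper cache-read rate (per token).
fn missed_read_savings(missed_tokens: u64, input_price: f64, cache_read_price: f64) -> f64 {
    missed_tokens as f64 * (input_price - cache_read_price)
}

fn main() {
    let fires = cache_hit_trend_drop(0.70, 0.50); // 20 pp drop from a 70% baseline
    let savings = missed_read_savings(1_000_000, 3.0e-6, 0.3e-6); // made-up prices
    println!("fires: {fires}, est. savings: ${savings:.2}");
}
```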
Anomalies
| Rule | Trigger | Savings basis |
|---|---|---|
| outlier_session_cost | Session cost > P95 or Q3 + 1.5·IQR over the last 30 days, baseline ≥ 20 sessions | None (Info; click through to the Session page) |
| day_over_day_spend_zscore | Today’s z-score > 2.5 vs the trailing-30-day mean (≥ 14 non-zero days) | None |
| project_mom_growth | Project up > 50% MoM, ≥ $10 in the current month, both months ≥ 10 calls | None |
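The outlier_session_cost statistics can be sketched like this, using nearest-rank percentiles; the real implementation in src/insights/baselines.rs may use a different percentile method.

```rust
/// Nearest-rank percentile over an already-sorted slice.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.saturating_sub(1).min(sorted.len() - 1)]
}

/// Flag a session whose cost exceeds P95 or Q3 + 1.5 * IQR of the
/// trailing window, requiring a baseline of at least 20 sessions.
fn is_outlier_session(cost: f64, mut window: Vec<f64>) -> bool {
    if window.len() < 20 {
        return false; // not enough baseline sessions
    }
    window.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let p95 = percentile(&window, 95.0);
    let q1 = percentile(&window, 25.0);
    let q3 = percentile(&window, 75.0);
    let fence = q3 + 1.5 * (q3 - q1);
    cost > p95 || cost > fence
}

fn main() {
    let window: Vec<f64> = (1..=30).map(|i| i as f64 * 0.10).collect(); // $0.10..$3.00
    println!("{}", is_outlier_session(9.0, window.clone()));
    println!("{}", is_outlier_session(1.0, window));
}
```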
Quota / pacing
Reads LimitSnapshot::primary from the existing Usage data flow.
| Rule | Trigger | Savings basis |
|---|---|---|
| claude_weekly_forecast | Projected weekly use ≥ 90% (Risk at ≥ 100%) | None |
| copilot_premium_pacing | Projected cycle use ≥ 80% | None |
| limit_recently_hit | rate_limit_reached_type set within the last 24h | None (Risk) |
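The pacing forecasts above can be read as simple linear extrapolation: project end-of-window usage from the fraction of the window already elapsed, then map the projection onto the table's thresholds. The linear model and all names here are illustrative, not the crate's actual code.

```rust
#[derive(Debug, PartialEq)]
enum Pacing {
    Ok,
    Warn,
    Risk,
}

/// Linear projection: if 40% is used 40% of the way through the window,
/// the projected end-of-window usage is 100%.
fn projected_pct(used_pct: f64, elapsed_fraction: f64) -> f64 {
    if elapsed_fraction <= 0.0 {
        return used_pct;
    }
    used_pct / elapsed_fraction
}

/// Map the projection onto the claude_weekly_forecast thresholds:
/// Warn at >= 90% projected, Risk at >= 100%.
fn claude_weekly_forecast(used_pct: f64, elapsed_fraction: f64) -> Pacing {
    let projected = projected_pct(used_pct, elapsed_fraction);
    if projected >= 100.0 {
        Pacing::Risk
    } else if projected >= 90.0 {
        Pacing::Warn
    } else {
        Pacing::Ok
    }
}

fn main() {
    // 40% of the weekly quota used 40% of the way through the week.
    println!("{:?}", claude_weekly_forecast(40.0, 0.40)); // prints "Risk"
}
```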
Limitations
- No latency data. No “use a faster model for the same cost” recommendations. Don’t add latency rules without first ingesting latency.
- Cursor and Copilot cache. Their local logs don’t expose cache_read/cache_creation tokens, so cache rules can’t compute savings for them. The Info card makes that gap visible.
- Signal cards are deterministic. They regenerate from current data on every render. Advice item workflow state is stored separately in archive.db.
- Estimated savings are heuristic. The assumption line on each card states the basis (target model, multiplier, conservative haircut) so you can sanity-check before acting.
Architecture pointers
- Engine entry:
crate::insights::compute_insights - Manual advice:
src/advice.rs - Per-rule modules:
src/insights/rules/{model_rightsizing,cache,anomalies,quota}.rs - Statistical baselines:
src/insights/baselines.rs(in-memory only) - Tauri snapshot fields:
DesktopSnapshot.insightsandDesktopSnapshot.adviceindesktop/src-tauri/src/snapshot.rs - All copy lives under the top-level
insightsblock insrc/copy/copy.json