Tool Ingestion
How tokenuse discovers, validates, parses, deduplicates, and prices local AI tool records.
tokenuse reads usage data directly from local files written by AI coding tools. There is no proxy, no platform API key, no telemetry endpoint, and no live watcher. Optional quota sync actions can write local limit sidecars under the tokenuse config directory; Copilot quota sync uses the existing local Copilot OAuth token, while Claude.ai and ChatGPT (Codex) quota sync use a session cookie you store in the OS keychain. All three are opt-in and triggered explicitly from the Config page.
The UI calls these sources tools. Internally each one is implemented as a ToolAdapter under src/tools/<name>/.
Supported Tools
| Tool | Status | Source format | Token quality | Doc |
|---|---|---|---|---|
| Claude Code | implemented | JSONL session files under ~/.claude/projects/, Claude Desktop agent sessions, optional status-line limits sidecar | exact usage, cache reads/writes, tool calls, file-backed 5h/weekly limit snapshots | claude-code.md |
| Cursor | implemented | SQLite state.vscdb and ~/.cursor/projects/**/agent-transcripts | exact when tokenCount exists; estimated fallback otherwise | cursor.md |
| Codex | implemented | JSONL rollouts under ~/.codex/sessions/ | exact per-turn token-count deltas | codex.md |
| GitHub Copilot | implemented | JSONL events from legacy CLI, VS Code Copilot Chat transcripts, optional quota sidecar | legacy output exact when present; transcripts estimated; quota snapshots from confirmed local sync | copilot.md |
| Gemini | implemented | JSON/JSONL chat sessions under ~/.gemini/tmp/<project_hash>/chats/ | exact usage, cache reads, thoughts, tool calls | gemini.md |
| Claude.ai subscription | implemented (limits-only) | sidecar written by opt-in Config-page sync of claude.ai/api/organizations/{uuid}/usage and /overage_spend_limit | exact 5h / 7d / Opus / Sonnet / Extra Usage gauges; rendered inside the Claude Code section | claude-subscription.md |
| ChatGPT (Codex) subscription | implemented (limits-only) | sidecar written by opt-in Config-page sync of chatgpt.com/backend-api/wham/usage | exact 5h / 7d / credits gauges; rendered inside the Codex section | codex-subscription.md |
Data Path
The same seen: &mut HashSet<String> is shared across every tool adapter during one sync, so re-reading the same local record only contributes once. The archive also enforces uniqueness on (tool, dedup_key), which lets changed sources be reparsed without duplicating historical calls.
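The shared-set behavior can be sketched as follows. This is a minimal illustration, not tokenuse's actual code: `Record` and `keep_unseen` are hypothetical names, and it assumes the set stores each record's dedup key as a plain string.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a parsed record.
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub tool: &'static str,
    pub dedup_key: String,
    pub input_tokens: u64,
}

/// Keep only records whose key has not been seen yet in this sync.
/// `HashSet::insert` returns `false` when the value was already present,
/// so a record that is read a second time is filtered out.
pub fn keep_unseen(records: Vec<Record>, seen: &mut HashSet<String>) -> Vec<Record> {
    records
        .into_iter()
        .filter(|r| seen.insert(r.dedup_key.clone()))
        .collect()
}
```

Passing the same `seen` set to every call is what makes the second read of a record a no-op, regardless of which adapter produced it.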
Internal Adapter Contract
All tool adapters implement the same trait in src/tools/mod.rs:
pub trait ToolAdapter: Send + Sync {
    fn id(&self) -> &'static str;
    fn display_name(&self) -> &'static str;
    fn discover(&self) -> Result<Vec<SessionSource>>;
    fn parse(
        &self,
        source: &SessionSource,
        seen: &mut HashSet<String>,
    ) -> Result<Vec<ParsedCall>>;
    fn parse_limits(&self, source: &SessionSource) -> Result<Vec<LimitSnapshot>> { /* default */ }
    fn source_fingerprint(&self, source: &SessionSource) -> Result<String> { /* default */ }
    fn model_display(&self, model: &str) -> String { /* default */ }
    fn tool_display(&self, tool: &str) -> String { /* default */ }
}
ParsedCall from src/tools/types.rs is the normalized record every adapter emits and every dashboard aggregator consumes. See architecture.md for field meanings and aggregation behavior.
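To make the contract concrete, here is a hedged sketch of a toy adapter. It is self-contained, so it redefines a reduced version of the trait and uses simplified stand-ins for `SessionSource`, `ParsedCall`, and `Result`; the real types in tokenuse carry more fields, and `ExampleAdapter` with its comma-separated line format is purely hypothetical.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

// Simplified stand-ins (assumptions): the real types live in tokenuse.
pub type Result<T> = std::result::Result<T, String>;

#[derive(Debug, Clone)]
pub struct SessionSource {
    pub path: PathBuf,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ParsedCall {
    pub dedup_key: String,
    pub model: String,
    pub input_tokens: u64,
    pub output_tokens: u64,
}

// Reduced version of the trait: the required methods plus one default.
pub trait ToolAdapter: Send + Sync {
    fn id(&self) -> &'static str;
    fn display_name(&self) -> &'static str;
    fn discover(&self) -> Result<Vec<SessionSource>>;
    fn parse(&self, source: &SessionSource, seen: &mut HashSet<String>)
        -> Result<Vec<ParsedCall>>;
    fn model_display(&self, model: &str) -> String {
        model.to_string()
    }
}

/// Hypothetical adapter that reads canned in-memory lines instead of files.
/// Each line is "key,model,input_tokens,output_tokens".
pub struct ExampleAdapter {
    pub lines: Vec<String>,
}

impl ToolAdapter for ExampleAdapter {
    fn id(&self) -> &'static str {
        "example"
    }

    fn display_name(&self) -> &'static str {
        "Example Tool"
    }

    fn discover(&self) -> Result<Vec<SessionSource>> {
        Ok(vec![SessionSource { path: PathBuf::from("example.log") }])
    }

    fn parse(&self, _source: &SessionSource, seen: &mut HashSet<String>)
        -> Result<Vec<ParsedCall>> {
        let mut calls = Vec::new();
        for line in &self.lines {
            let parts: Vec<&str> = line.split(',').collect();
            if parts.len() != 4 {
                return Err(format!("malformed line: {line}"));
            }
            let input_tokens = parts[2]
                .parse::<u64>()
                .map_err(|e| format!("bad input_tokens: {e}"))?;
            let output_tokens = parts[3]
                .parse::<u64>()
                .map_err(|e| format!("bad output_tokens: {e}"))?;
            let key = format!("{}:{}", self.id(), parts[0]);
            if !seen.insert(key.clone()) {
                continue; // already emitted during this sync
            }
            calls.push(ParsedCall {
                dedup_key: key,
                model: parts[1].to_string(),
                input_tokens,
                output_tokens,
            });
        }
        Ok(calls)
    }
}
```

Validation happens before the key is recorded in `seen`, so a malformed line fails the parse without marking its key as already emitted.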
Pricing
Pricing is embedded as two books: costs/pricing-upstream.json for broad upstream coverage and costs/pricing-overrides.json for official rows, aliases, tool-scoped rows, provenance, and effective dates. Usage ingestion never fetches pricing; the Config page can download local pricing-upstream.json and pricing-overrides.json books only after confirmation. See Pricing and cache rates for source evidence and tool-specific caveats.
cost = multiplier * (
    input_tokens * input_rate
    + output_tokens * output_rate
    + cache_creation_input_tokens * cache_write_rate
    + cache_read_input_tokens * cache_read_rate
    + web_search_requests * web_search_rate
)
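The formula translates directly into a small function. The sketch below is illustrative only: the `Rates` and `Usage` structs are hypothetical, and it assumes rates are expressed per token (and per request for web search), which this page does not specify.

```rust
/// Hypothetical rates for one pricing row.
/// Assumption: currency units per single token, and per request for web search.
#[derive(Debug, Clone, Copy)]
pub struct Rates {
    pub input: f64,
    pub output: f64,
    pub cache_write: f64,
    pub cache_read: f64,
    pub web_search: f64,
}

/// Token and request counts for one call, named after the formula's terms.
#[derive(Debug, Clone, Copy, Default)]
pub struct Usage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_creation_input_tokens: u64,
    pub cache_read_input_tokens: u64,
    pub web_search_requests: u64,
}

/// Sum each count times its rate, then scale the total by `multiplier`.
pub fn cost(usage: &Usage, rates: &Rates, multiplier: f64) -> f64 {
    multiplier
        * (usage.input_tokens as f64 * rates.input
            + usage.output_tokens as f64 * rates.output
            + usage.cache_creation_input_tokens as f64 * rates.cache_write
            + usage.cache_read_input_tokens as f64 * rates.cache_read
            + usage.web_search_requests as f64 * rates.web_search)
}
```

With a multiplier of 1.0 the function returns the plain weighted sum; a fast-mode row would pass its own multiplier instead.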
Model lookup canonicalizes model names, resolves tool-scoped aliases first, applies effective dates, then falls back through global aliases and prefix matches to a default Sonnet row. cursor-auto is a direct Cursor Auto pricing row. Claude Opus fast mode applies the row’s fast_multiplier.
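The fallback order can be sketched roughly as below. This is a simplified illustration under several assumptions, not the actual lookup: `PriceBook`, `canonicalize`, and `resolve` are hypothetical names, canonicalization is assumed to be trim plus lowercase, prefix matching is assumed to prefer the longest matching row name, and effective dates are left out for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory price book. A single f64 stands in for a full row.
pub struct PriceBook {
    /// (tool, alias) -> canonical row name
    pub tool_aliases: HashMap<(String, String), String>,
    /// alias -> canonical row name
    pub global_aliases: HashMap<String, String>,
    /// canonical row name -> rate
    pub rows: HashMap<String, f64>,
    /// Row used when nothing else matches.
    pub default_model: String,
}

/// Assumed canonicalization: trim whitespace and lowercase.
pub fn canonicalize(model: &str) -> String {
    model.trim().to_lowercase()
}

impl PriceBook {
    /// Return the name of the row chosen for `model` as reported by `tool`.
    pub fn resolve(&self, tool: &str, model: &str) -> String {
        let name = canonicalize(model);
        // 1. Tool-scoped aliases win over everything else.
        if let Some(target) = self.tool_aliases.get(&(tool.to_string(), name.clone())) {
            return target.clone();
        }
        // 2. A direct row, then a global alias.
        if self.rows.contains_key(&name) {
            return name;
        }
        if let Some(target) = self.global_aliases.get(&name) {
            return target.clone();
        }
        // 3. Longest row name that is a prefix of the model name.
        let best = self
            .rows
            .keys()
            .filter(|k| name.starts_with(k.as_str()))
            .max_by_key(|k| k.len());
        if let Some(k) = best {
            return k.clone();
        }
        // 4. Fall back to the default row.
        self.default_model.clone()
    }
}
```

Returning the row name rather than the rate keeps the sketch easy to test; a fuller version would return the whole row, including any multiplier and effective date.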
Refresh the embedded maintainer books with:
cargo run -- --refresh-prices
Adding a New Tool
- Create src/tools/<name>/{mod.rs, config.rs, discovery.rs, parser.rs}.
- Put every path, env var, glob, SQL query, and source constant in config.rs.
- Implement ToolAdapter in mod.rs and register it in tools::registry().
- Add a variant to app::Tool, update its label and cycle order, and update ingest::matches_tool.
- Add display names in aggregation helpers such as tool_short_label when needed.
- Override source_fingerprint only when the default file/directory metadata fingerprint is too broad or too narrow for the source.
- Write docs/development/tools/<name>.md and add it to the supported tools table above.
- Add parser tests for source validation, token mapping, deduplication, project detection, and tool/bash extraction.
Verification
- cargo test runs parser unit tests, pricing lookup tests, aggregation tests, and render smoke tests.
- cargo run launches the TUI and falls back to sample data when the archive has no local calls.
- cargo run -- --list-projects syncs the archive and prints normalized project/tool inventory rows for debugging source attribution.