# Benchmarks
llm-usage-metrics is engineered for high throughput. In the production benchmark below it runs 3.63–4.60x faster than `ccusage-codex` on cold runs and 17.75–22.78x faster with warm caches, keeping cost reporting under one second on real local data.
## Performance Summary

- Codex-only parity run: 4.60x faster cold and 22.78x faster cached.
- Multi-source OpenAI run: 3.63x faster cold and 17.75x faster cached.
- Sub-second cached reporting in both scenarios (0.746s codex-only, 0.941s multi-source).
## Production benchmark (February 27, 2026)

This benchmark compares:

- `ccusage-codex monthly` (Codex-only baseline)
- `llm-usage monthly --provider openai --source codex` (direct source-to-source parity)
- `llm-usage monthly --provider openai --source pi,codex,gemini,opencode` (multi-source OpenAI scope)
All three commands were executed on the same machine, against real local production data, with repeated timed runs.
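The repeated-timing methodology can be sketched as a simple wall-clock loop. This is only an illustration of what the benchmark script automates internally; substitute the placeholder command with the command under test:

```shell
# Sketch of one repeated-timing loop: 5 wall-clock measurements.
# Replace "sleep 0.2" with the benchmarked command, e.g.
#   llm-usage monthly --provider openai --source codex --json
runs=5
for i in $(seq 1 "$runs"); do
  start=$(date +%s%N)            # nanoseconds since epoch
  sleep 0.2 > /dev/null          # placeholder for the command under test
  end=$(date +%s%N)
  awk -v s="$start" -v e="$end" -v i="$i" \
    'BEGIN { printf "run %d: %.3fs\n", i, (e - s) / 1e9 }'
done
```

Median, mean, min, and max in the tables below are computed over runs collected this way.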
You can reproduce the direct source-to-source benchmark with:

```sh
pnpm run perf:production-benchmark -- \
  --runs 5 \
  --llm-source codex
```

You can reproduce the multi-source OpenAI benchmark with:

```sh
pnpm run perf:production-benchmark -- \
  --runs 5 \
  --llm-source pi,codex,gemini,opencode
```

To export reusable artifacts:

```sh
pnpm run perf:production-benchmark -- \
  --runs 5 \
  --llm-source codex \
  --json-output ./tmp/production-benchmark-openai-codex.json \
  --markdown-output ./tmp/production-benchmark-openai-codex.md

pnpm run perf:production-benchmark -- \
  --runs 5 \
  --llm-source pi,codex,gemini,opencode \
  --json-output ./tmp/production-benchmark-openai-multi-source.json \
  --markdown-output ./tmp/production-benchmark-openai-multi-source.md
```

## Baseline machine
| Spec | Value |
|---|---|
| OS | CachyOS (Linux 6.19.2-2-cachyos) |
| CPU | Intel Core Ultra 9 185H (22 logical CPUs, up to 5.10 GHz) |
| Memory | 62 GiB RAM + 62 GiB swap |
| Storage | NVMe SSD (KXG8AZNV1T02 LA KIOXIA, 953.9 GiB) |
| Node.js | v24.12.0 |
| pnpm | 10.17.1 |
| ccusage-codex | 18.0.8 |
| llm-usage (llm-usage-metrics) | 0.3.4 |
## Cache modes used

- no cache:
  - fresh `XDG_CACHE_HOME` for each run
  - ccusage-codex: `--no-offline`
  - llm-usage: `LLM_USAGE_PARSE_CACHE_ENABLED=0` and no `--pricing-offline`
- with cache:
  - dedicated warm cache directory
  - ccusage-codex: `--offline`
  - llm-usage: `--pricing-offline` with warmed parse cache

For repeatability, `LLM_USAGE_SKIP_UPDATE_CHECK=1` was set for llm-usage benchmark runs.
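The two cache modes can be reproduced with environment variables alone. A sketch, where the warm cache directory path is illustrative and the guard lets it no-op on machines without the CLI installed:

```shell
# Guard: skip when llm-usage is not on PATH.
if command -v llm-usage >/dev/null 2>&1; then
  # "no cache": fresh XDG_CACHE_HOME each run, parse cache disabled.
  fresh_cache=$(mktemp -d)
  XDG_CACHE_HOME="$fresh_cache" \
  LLM_USAGE_PARSE_CACHE_ENABLED=0 \
  LLM_USAGE_SKIP_UPDATE_CHECK=1 \
    llm-usage monthly --provider openai --source codex --json >/dev/null

  # "with cache": one dedicated warm directory, pricing kept offline.
  warm_cache="$HOME/.cache/llm-usage-bench"   # illustrative path
  mkdir -p "$warm_cache"
  XDG_CACHE_HOME="$warm_cache" \
  LLM_USAGE_SKIP_UPDATE_CHECK=1 \
    llm-usage monthly --provider openai --source codex --pricing-offline --json >/dev/null
fi
```

The first warm-cache invocation pays the parse cost; subsequent runs against the same directory hit the cache, which is what the "with cache" rows below measure.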
## Commands benchmarked

```sh
# ccusage-codex
ccusage-codex monthly --json
ccusage-codex monthly --offline --json

# llm-usage-metrics direct parity (codex only)
llm-usage monthly --provider openai --source codex --json
llm-usage monthly --provider openai --source codex --pricing-offline --json

# llm-usage-metrics multi-source (openai provider)
llm-usage monthly --provider openai --source pi,codex,gemini,opencode --json
llm-usage monthly --provider openai --source pi,codex,gemini,opencode --pricing-offline --json
```

## Runtime results (5 runs each): direct source-to-source (`--source codex`)
| Tool | Cache mode | Median (s) | Mean (s) | Min (s) | Max (s) |
|---|---|---|---|---|---|
| `ccusage-codex monthly` | no cache | 16.785 | 17.288 | 16.350 | 19.363 |
| `ccusage-codex monthly --offline` | with cache | 16.995 | 17.594 | 16.462 | 19.909 |
| `llm-usage monthly --provider openai --source codex` | no cache | 3.651 | 3.760 | 3.526 | 4.411 |
| `llm-usage monthly --provider openai --source codex --pricing-offline` | with cache | 0.746 | 0.724 | 0.618 | 0.810 |
Derived from median runtime:
- `llm-usage` vs `ccusage-codex` (no cache): 4.60x faster
- `llm-usage` vs `ccusage-codex` (with cache): 22.78x faster
- `llm-usage` cache effect: 4.89x faster with cache
- `ccusage-codex` cache effect: 0.99x faster with cache
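Each multiplier is simply the ratio of the two median runtimes. For the codex-only table this can be checked directly:

```shell
# Derive the speedup multipliers from the median runtimes above.
speedup() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2fx\n", a / b }'; }

speedup 16.785 3.651    # llm-usage vs ccusage-codex, no cache   -> 4.60x
speedup 16.995 0.746    # llm-usage vs ccusage-codex, with cache -> 22.78x
speedup 3.651  0.746    # llm-usage cache effect                 -> 4.89x
speedup 16.785 16.995   # ccusage-codex cache effect             -> 0.99x
```

The same ratio applies to the multi-source table in the next section.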
## Runtime results (5 runs each): multi-source OpenAI (`--source pi,codex,gemini,opencode`)
| Tool | Cache mode | Median (s) | Mean (s) | Min (s) | Max (s) |
|---|---|---|---|---|---|
| `ccusage-codex monthly` | no cache | 17.297 | 17.463 | 16.760 | 18.634 |
| `ccusage-codex monthly --offline` | with cache | 16.698 | 16.745 | 16.204 | 17.170 |
| `llm-usage monthly --provider openai --source pi,codex,gemini,opencode` | no cache | 4.767 | 4.864 | 4.544 | 5.229 |
| `llm-usage monthly --provider openai --source pi,codex,gemini,opencode --pricing-offline` | with cache | 0.941 | 0.951 | 0.912 | 1.004 |
Derived from median runtime:
- `llm-usage` vs `ccusage-codex` (no cache): 3.63x faster
- `llm-usage` vs `ccusage-codex` (with cache): 17.75x faster
- `llm-usage` cache effect: 5.07x faster with cache
- `ccusage-codex` cache effect: 1.04x faster with cache
## Dataset scope observed during benchmark
These commands do not cover identical scope, so compare runtime with that context.
| Tool | Scope snapshot from benchmark run |
|---|---|
| `llm-usage monthly --provider openai --source codex` | Direct codex-only scope; parity with `ccusage-codex monthly` |
| `llm-usage monthly --provider openai --source pi,codex,gemini,opencode` | Multi-source OpenAI scope across four adapters |
| `ccusage-codex monthly` | Codex-only report (monthly array plus totals) |
## Interpretation

- `llm-usage` remains substantially faster in both the parity and multi-source comparisons.
- `llm-usage` benefits strongly from the parse + pricing cache in repeated runs.
- `ccusage-codex` runtime remains similar between `--no-offline` and `--offline` on this dataset.
- Results are production-real for this machine and data, not universal constants. Re-run on your own workload before drawing broad conclusions.