
Benchmarks

llm-usage-metrics is engineered for high throughput. In the production benchmark below, it outperforms ccusage-codex by 3.6x to 4.6x on cold runs and by more than an order of magnitude with a warm cache, enabling real-time cost analysis on large local datasets.

  • Codex-only parity run: 4.60x faster cold and 22.78x faster cached.
  • Multi-source OpenAI run: 3.63x faster cold and 17.75x faster cached.
  • Sub-second cached reporting in both scenarios (0.746s codex-only, 0.941s multi-source).

This benchmark compares:

  • ccusage-codex monthly (Codex-only baseline)
  • llm-usage monthly --provider openai --source codex (direct source-to-source parity)
  • llm-usage monthly --provider openai --source pi,codex,gemini,opencode (multi-source OpenAI scope)

All commands were executed on the same machine, against real local production data, with repeated timed runs.
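The repeated timed runs can be sketched as a simple loop; the run count, timing approach, and guard below are illustrative, not the repository's actual perf script:

```shell
# Illustrative repeated-run timing loop (not the repo's perf:production-benchmark
# script). Each iteration times one invocation; median/mean/min/max are then
# taken over the printed samples.
runs=5
for i in $(seq 1 "$runs"); do
  start=$(date +%s.%N)
  # Command under test; guarded so the sketch is safe where llm-usage is absent.
  if command -v llm-usage >/dev/null 2>&1; then
    llm-usage monthly --provider openai --source codex --json >/dev/null
  fi
  end=$(date +%s.%N)
  awk -v s="$start" -v e="$end" -v i="$i" 'BEGIN { printf "run %d: %.3fs\n", i, e - s }'
done
```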

You can reproduce the direct source-to-source benchmark with:

```sh
pnpm run perf:production-benchmark -- \
  --runs 5 \
  --llm-source codex
```

You can reproduce the multi-source OpenAI benchmark with:

```sh
pnpm run perf:production-benchmark -- \
  --runs 5 \
  --llm-source pi,codex,gemini,opencode
```

To export reusable artifacts:

```sh
pnpm run perf:production-benchmark -- \
  --runs 5 \
  --llm-source codex \
  --json-output ./tmp/production-benchmark-openai-codex.json \
  --markdown-output ./tmp/production-benchmark-openai-codex.md

pnpm run perf:production-benchmark -- \
  --runs 5 \
  --llm-source pi,codex,gemini,opencode \
  --json-output ./tmp/production-benchmark-openai-multi-source.json \
  --markdown-output ./tmp/production-benchmark-openai-multi-source.md
```
| Spec | Value |
| --- | --- |
| OS | CachyOS (Linux 6.19.2-2-cachyos) |
| CPU | Intel Core Ultra 9 185H (22 logical CPUs, up to 5.10 GHz) |
| Memory | 62 GiB RAM + 62 GiB swap |
| Storage | NVMe SSD (KXG8AZNV1T02 LA KIOXIA, 953.9 GiB) |
| Node.js | v24.12.0 |
| pnpm | 10.17.1 |
| ccusage-codex | 18.0.8 |
| llm-usage (llm-usage-metrics) | 0.3.4 |
Two cache modes were benchmarked:

  • no cache
    • fresh XDG_CACHE_HOME for each run
    • ccusage-codex: --no-offline
    • llm-usage: LLM_USAGE_PARSE_CACHE_ENABLED=0 and no --pricing-offline
  • with cache
    • dedicated warm cache directory
    • ccusage-codex: --offline
    • llm-usage: --pricing-offline with warmed parse cache

For repeatability, LLM_USAGE_SKIP_UPDATE_CHECK=1 was set for llm-usage benchmark runs.
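Concretely, a "no cache" llm-usage run can be set up like the following sketch (the temporary cache directory and the install guard are illustrative):

```shell
# Cold-run environment for llm-usage: fresh cache dir, parse cache disabled,
# update check skipped for repeatability.
export LLM_USAGE_SKIP_UPDATE_CHECK=1
export XDG_CACHE_HOME="$(mktemp -d)"
export LLM_USAGE_PARSE_CACHE_ENABLED=0
# Guarded so the sketch is harmless where llm-usage is not installed.
if command -v llm-usage >/dev/null 2>&1; then
  llm-usage monthly --provider openai --source codex --json >/dev/null
fi
```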

```sh
# ccusage-codex
ccusage-codex monthly --json
ccusage-codex monthly --offline --json

# llm-usage-metrics direct parity (codex only)
llm-usage monthly --provider openai --source codex --json
llm-usage monthly --provider openai --source codex --pricing-offline --json

# llm-usage-metrics multi-source (openai provider)
llm-usage monthly --provider openai --source pi,codex,gemini,opencode --json
llm-usage monthly --provider openai --source pi,codex,gemini,opencode --pricing-offline --json
```

Runtime results (5 runs each): direct source-to-source (--source codex)

| Tool | Cache mode | Median (s) | Mean (s) | Min (s) | Max (s) |
| --- | --- | --- | --- | --- | --- |
| ccusage-codex monthly | no cache | 16.785 | 17.288 | 16.35 | 19.363 |
| ccusage-codex monthly --offline | with cache | 16.995 | 17.594 | 16.462 | 19.909 |
| llm-usage monthly --provider openai --source codex | no cache | 3.651 | 3.76 | 3.526 | 4.411 |
| llm-usage monthly --provider openai --source codex --pricing-offline | with cache | 0.746 | 0.724 | 0.618 | 0.81 |

Derived from median runtime:

  • llm-usage vs ccusage-codex (no cache): 4.60x faster
  • llm-usage vs ccusage-codex (with cache): 22.78x faster
  • llm-usage cache effect: 4.89x faster with cache
  • ccusage-codex cache effect: 0.99x faster with cache
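These ratios follow directly from the median column of the parity table above; recomputing them:

```shell
# Derived speedups from the parity-table medians (seconds).
awk 'BEGIN {
  printf "no cache:                   %.2fx\n", 16.785 / 3.651   # 4.60x
  printf "with cache:                 %.2fx\n", 16.995 / 0.746   # 22.78x
  printf "llm-usage cache effect:     %.2fx\n", 3.651 / 0.746    # 4.89x
  printf "ccusage-codex cache effect: %.2fx\n", 16.785 / 16.995  # 0.99x
}'
```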

Runtime results (5 runs each): multi-source OpenAI (--source pi,codex,gemini,opencode)

| Tool | Cache mode | Median (s) | Mean (s) | Min (s) | Max (s) |
| --- | --- | --- | --- | --- | --- |
| ccusage-codex monthly | no cache | 17.297 | 17.463 | 16.76 | 18.634 |
| ccusage-codex monthly --offline | with cache | 16.698 | 16.745 | 16.204 | 17.17 |
| llm-usage monthly --provider openai --source pi,codex,gemini,opencode | no cache | 4.767 | 4.864 | 4.544 | 5.229 |
| llm-usage monthly --provider openai --source pi,codex,gemini,opencode --pricing-offline | with cache | 0.941 | 0.951 | 0.912 | 1.004 |

Derived from median runtime:

  • llm-usage vs ccusage-codex (no cache): 3.63x faster
  • llm-usage vs ccusage-codex (with cache): 17.75x faster
  • llm-usage cache effect: 5.07x faster with cache
  • ccusage-codex cache effect: 1.04x faster with cache

These commands do not cover identical scope, so interpret the runtime comparison with that context in mind.

| Tool | Scope snapshot from benchmark run |
| --- | --- |
| llm-usage monthly --provider openai --source codex | Direct codex-only scope parity with ccusage-codex monthly |
| llm-usage monthly --provider openai --source pi,codex,gemini,opencode | Multi-source OpenAI scope across four adapters |
| ccusage-codex monthly | Codex-only report (monthly array plus totals) |
Key takeaways:

  • llm-usage remains substantially faster in both the parity and multi-source comparisons.
  • llm-usage benefits strongly from its parse and pricing caches on repeated runs.
  • ccusage-codex runtime remains similar between --no-offline and --offline on this dataset.
  • Results are production-real for this machine and data, not universal constants; re-run the benchmark on your own workload before drawing broad conclusions.