Open source · MIT · Zero hard dependencies

The runtime layer
your AI tools
don't have.

skim sits between Claude Code (or any LLM tool) and the API. It strips token waste in real-time, injects prompt caching automatically, and tells you live when your context window is filling up — without changing a single line of code.

★ Star on GitHub
96%
context reduction
50–90%
cost savings via caching
1 var
to activate everything
$0
on cache hit calls
skim proxy

See your context window fill up — in real-time

Claude Code Pro users run into a hidden problem: context fills up, the model quietly starts forgetting earlier work, and response quality drops — with no signal. skim shows you live context fill % after every call and automatically strips waste before it counts. It also caches your system prompt so it costs nothing on calls 2+.

1
Install
Zero hard dependencies. Python 3.10+.
pip install skim-llm
2
Start the proxy
Runs locally on your machine.
skim proxy --port 7474 --path .
3
Activate
Every Claude Code call now goes through skim.
export ANTHROPIC_BASE_URL=http://localhost:7474
Read the docs →

Everything included

One install. The full stack.

Runtime proxy
Intercepts every API call. No code changes. Works with Claude Code, Cursor, and any OpenAI-compatible tool — one env var.
🧠
Prompt caching
Auto-injects cache_control into your system prompt. Anthropic caches it; subsequent calls read it for free. 50–90% savings on repeated context.
📊
Team dashboard
Per-user, per-team cost attribution, budget alerts, trend charts. Self-hosted. JWT auth + SSO / LDAP / Azure AD.
🔑
Secrets detection
Scans for AWS keys, OpenAI tokens, GitHub PATs, private keys. CI-ready — pass --fail to block builds with exposed credentials.
📐
Baseline regression
Save a token snapshot before a refactor. Compare after. Catch context bloat in PR review before it ships.
🔒
CI budget gate
skim check exits 1 if context exceeds your limit. Three lines in GitHub Actions. Hooks into pre-commit as well.

Start in 30 seconds

Open source. Self-hostable. No account required.

$ pip install skim-llm
$ export ANTHROPIC_BASE_URL=http://localhost:7474
View full docs on GitHub →