Open source · MIT · Zero hard dependencies

The runtime layer
your AI tools
don't have.

skim sits between Claude Code (or any LLM tool) and the API. It strips token waste in real-time, injects prompt caching automatically, and tells you live when your context window is filling up — without changing a single line of code.

★ Star on GitHub

96%

context reduction

50–90%

cost savings via caching

1 var

to activate everything

on cache hit calls

skim proxy

See your context window fill up — in real-time

Claude Code Pro users run into a hidden problem: context fills up, the model quietly starts forgetting earlier work, and response quality drops — with no signal. skim shows you live context fill % after every call and automatically strips waste before it counts. It also caches your system prompt so it costs nothing on calls 2+.

Install

Zero hard dependencies. Python 3.10+.

pip install skim-llm

Start the proxy

Runs locally on your machine.

skim proxy --port 7474 --path .

Activate

Every Claude Code call now goes through skim.

export ANTHROPIC_BASE_URL=http://localhost:7474

Read the docs →

Everything included

One install. The full stack.

⚡

Runtime proxy

Intercepts every API call. No code changes. Works with Claude Code, Cursor, and any OpenAI-compatible tool — one env var.

🧠

Prompt caching

Auto-injects cache_control into your system prompt. Anthropic caches it; subsequent calls read it for free. 50–90% savings on repeated context.

📊

Team dashboard

Per-user, per-team cost attribution, budget alerts, trend charts. Self-hosted. JWT auth + SSO / LDAP / Azure AD.

🔑

Secrets detection

Scans for AWS keys, OpenAI tokens, GitHub PATs, private keys. CI-ready — pass --fail to block builds with exposed credentials.

📐

Baseline regression

Save a token snapshot before a refactor. Compare after. Catch context bloat in PR review before it ships.

🔒

CI budget gate

skim check exits 1 if context exceeds your limit. Three lines in GitHub Actions. Hooks into pre-commit as well.

Start in 30 seconds

Open source. Self-hostable. No account required.

$ pip install skim-llm

⎘

$ export ANTHROPIC_BASE_URL=http://localhost:7474

⎘

View full docs on GitHub →

The runtime layeryour AI toolsdon't have.

See your context window fill up — in real-time

One install. The full stack.

Start in 30 seconds

The runtime layer
your AI tools
don't have.