Token Efficiency

Type: concept Tags: llm-wiki, hot-cache, claude-code

Summary

The ability to answer questions using fewer tokens by organizing knowledge into a navigable structure, rather than feeding raw, unstructured data into context.

LLM Wiki Impact

The llm-wiki approach delivers significant token savings compared with dumping raw files into context:

One X user reported converting 383 scattered files and 100+ meeting transcripts into a compact wiki, cutting token usage by 95% when querying with Claude.

Why It’s More Efficient

  • Claude reads index.md → follows 2–3 relevant links → answers, instead of loading all source material
  • [[hot-cache]] further reduces tokens on repeat queries by caching recent context
  • Structured summaries in wiki pages are much shorter than raw source documents
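The index-then-follow-links flow can be sketched in a few lines of Python. This is a minimal illustration, not an actual Claude Code mechanism: the wiki is modeled as a dict of page name → text, links are the `[[...]]` wikilinks found in the index, and all page names and contents below are made up for the example.

```python
import re

# Matches the page name inside [[name]] or [[name|alias]] wikilinks.
WIKILINK = re.compile(r"\[\[([^\]|]+)")

def answer_context(wiki: dict[str, str], max_links: int = 3) -> str:
    """Build a compact query context: the index plus the first few
    linked pages, instead of concatenating every source document."""
    index = wiki["index"]
    links = WIKILINK.findall(index)[:max_links]
    parts = [index] + [wiki[name] for name in links if name in wiki]
    return "\n\n".join(parts)

# Hypothetical miniature wiki (names and contents are illustrative).
wiki = {
    "index": "Topics: [[hot-cache]] [[token-efficiency]]",
    "hot-cache": "Hot cache: reuse recent context on repeat queries.",
    "token-efficiency": "Answer from summaries, not raw transcripts.",
    "unrelated": "x" * 10_000,  # never followed, so it costs no tokens
}

context = answer_context(wiki)
```

Only the index and the pages it links to end up in `context`; the large `unrelated` page is never loaded, which is where the savings come from.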

Cost Model

  • LLM Wiki cost: tokens only (reads index + N pages per query)
  • RAG cost: ongoing embedding compute + vector DB storage + query latency
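A back-of-the-envelope comparison makes the wiki side of the cost model concrete. All token counts below are invented for illustration; only the 383-file figure comes from the example above.

```python
# Illustrative numbers, not measured values.
raw_tokens_per_doc = 3_000   # assumed average size of a raw source file
num_docs = 383               # file count from the example above

index_tokens = 800           # assumed size of index.md
tokens_per_wiki_page = 400   # assumed size of a structured summary page
pages_per_query = 3          # index + 2-3 followed links

# Dumping everything vs. reading the index plus a few pages.
raw_dump_cost = raw_tokens_per_doc * num_docs
wiki_cost = index_tokens + pages_per_query * tokens_per_wiki_page

savings = 1 - wiki_cost / raw_dump_cost
```

With these assumed sizes the raw dump costs over a million tokens per query while the wiki path costs about two thousand, comfortably clearing the 95% savings figure; the real ratio depends entirely on how large the source files are relative to their summaries.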