Token Efficiency
Type: concept Tags: llm-wiki, hot-cache, claude-code
Summary
The ability to answer questions using fewer tokens by organizing knowledge into navigable structure rather than feeding raw or unstructured data into context.
LLM Wiki Impact
The llm-wiki approach delivers significant token savings vs. dumping raw files into context:
One X user turned 383 scattered files and 100+ meeting transcripts into a compact wiki and dropped token usage by 95% when querying with Claude.
Why It’s More Efficient
- Claude reads index.md → follows 2–3 relevant links → answers, instead of loading all source material
- [[hot-cache]] further reduces tokens on repeat queries by caching recent context
- Structured summaries in wiki pages are much shorter than raw source documents
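The navigation pattern above can be sketched as a minimal retrieval loop. This is an illustrative sketch, not Claude Code's actual mechanism; the file layout, the `answer_context` helper, and the 4-characters-per-token heuristic are all assumptions:

```python
from pathlib import Path

def estimate_tokens(text: str) -> int:
    # Rough heuristic: about 4 characters per token for English prose.
    return len(text) // 4

def answer_context(wiki_dir: Path, relevant_links: list[str]) -> tuple[str, int]:
    # Build the context to read: the index plus only the 2-3 pages linked
    # from it that are relevant to the query, instead of every source file.
    index = (wiki_dir / "index.md").read_text()
    pages = [(wiki_dir / f"{name}.md").read_text() for name in relevant_links]
    context = "\n\n".join([index, *pages])
    return context, estimate_tokens(context)
```

The token cost scales with the handful of pages actually followed, not with the total size of the wiki.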
Cost Model
- LLM Wiki cost: tokens only (reads index + N pages per query)
- RAG cost: ongoing embedding compute + vector DB storage + query latency
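The per-query arithmetic behind the savings claim can be made concrete. All numbers here are hypothetical, chosen only to illustrate the shape of the comparison (the 383-file count comes from the example above; the per-file and per-page token sizes are assumptions):

```python
# Hypothetical sizes: 383 files averaging 2,000 tokens each, vs. a wiki
# query that reads one index (~500 tokens) plus 3 pages (~400 tokens each).
raw_dump_tokens = 383 * 2_000       # load every source file into context
wiki_query_tokens = 500 + 3 * 400   # index + N followed pages

savings = 1 - wiki_query_tokens / raw_dump_tokens
print(f"wiki query: {wiki_query_tokens} tokens vs raw dump: {raw_dump_tokens}")
print(f"savings: {savings:.1%}")
```

Even generous per-page estimates leave the wiki query orders of magnitude cheaper, which is consistent with the ~95% reduction reported above.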