Hot Cache

Type: concept
Tags: llm-wiki, token-efficiency, claude-code

Summary

An optional hot.md file (~500 words) in the wiki root that stores the most recently relevant context. It reduces token usage on repeat queries in high-frequency workflows such as executive assistants.

How It Works

  1. After answering a query, Claude saves key context to wiki/hot.md
  2. On the next query, Claude reads hot.md before crawling the full index
  3. If the answer is in the hot cache, no further wiki pages need to be read
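
The three steps above can be sketched in Python. This is a minimal illustration, not how Claude implements it: the function names, the blank-line entry format, and the is_sufficient callback (a stand-in for the model's own judgment about whether the cache answers the query) are all assumptions made for the example. The 500-word cap comes from the summary.

```python
from pathlib import Path

MAX_WORDS = 500  # approximate size of hot.md, per the summary


def read_hot_cache(wiki_root: Path) -> str:
    """Return the contents of hot.md, or an empty string if it is missing."""
    hot = wiki_root / "hot.md"
    return hot.read_text(encoding="utf-8") if hot.exists() else ""


def update_hot_cache(wiki_root: Path, new_entry: str, max_words: int = MAX_WORDS) -> list[str]:
    """Step 1: store the newest context first and drop the oldest entries over the cap.

    Entries are assumed to be separated by blank lines (hypothetical format).
    The newest entry is always kept, even if it alone exceeds the cap.
    """
    entries = [e.strip() for e in read_hot_cache(wiki_root).split("\n\n") if e.strip()]
    entries.insert(0, new_entry.strip())
    kept, total = [], 0
    for entry in entries:
        count = len(entry.split())
        if kept and total + count > max_words:
            break
        kept.append(entry)
        total += count
    (wiki_root / "hot.md").write_text("\n\n".join(kept) + "\n", encoding="utf-8")
    return kept


def load_context(wiki_root: Path, query: str, is_sufficient) -> list[str]:
    """Steps 2 and 3: try hot.md first, fall back to the other pages only if needed."""
    hot = read_hot_cache(wiki_root)
    if hot and is_sufficient(query, hot):
        return [hot]  # cache hit: no further wiki pages are read
    # Cache miss: reading every page is a simplification of "crawling the full index".
    pages = [
        p.read_text(encoding="utf-8")
        for p in sorted(wiki_root.glob("*.md"))
        if p.name != "hot.md"
    ]
    return [hot, *pages] if hot else pages
```

On a cache hit the caller receives only the contents of hot.md, which is where the token saving comes from. On a miss the cost is the normal crawl plus the small overhead of the cache itself.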

When to Use It

Good fit:

  • Executive assistant workflows with daily repeat context (e.g., current projects, recent decisions)
  • Any vault queried many times per session on overlapping topics

Not needed:

  • Reference vaults queried occasionally (e.g., YouTube transcript archives)
  • One-off research projects

Token Impact

The presenter reported a measurable reduction in token usage after switching from scattered context files to a single hot cache in their executive assistant project (andrej-karpathy pattern).