# Prompt caching 101

Reduce latency and cost with cache-safe prompt blocks.

- Date: Oct 10, 2024
- Reading time: 10 min
- Level: Intermediate
- Tags: Latency, Caching, Optimization

## Takeaways

- Cache stable prompt prefixes and templates.
- Include tool schemas in cache keys.
- Invalidate caches on prompt or policy changes.

## Cacheable blocks

Split prompts into stable and variable blocks. Cache the stable portions.

## Cache key design

Include model, prompt version, and tool schema hashes in cache keys.

## Invalidate safely

Invalidate caches when prompts, tools, or policy rules change.
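
The key design and invalidation advice can be sketched together in a few lines. This is a minimal illustration and not a reference implementation: `schema_hash`, `cache_key`, and `PrefixCache` are hypothetical names, the cache is a plain in-memory dictionary, and folding a policy version into the key is an assumption made here so that a policy change invalidates entries the same way a prompt or tool change does.

```python
import hashlib
import json


def schema_hash(tool_schemas):
    """Hash tool schemas in canonical JSON so dict key order does not matter.

    The order of tools in the list is preserved, so reordering tools
    produces a different hash.
    """
    canonical = json.dumps(tool_schemas, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def cache_key(model, prompt_version, tool_schemas, policy_version):
    """Combine every input that should invalidate the cache into one key."""
    parts = [model, prompt_version, schema_hash(tool_schemas), policy_version]
    # Join with a separator that is unlikely to appear in the parts themselves.
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class PrefixCache:
    """In-memory cache for the stable block of a prompt (hypothetical helper)."""

    def __init__(self):
        self._store = {}

    def get_or_build(self, key, build):
        """Return the cached value for key, calling build() only on a miss."""
        if key not in self._store:
            self._store[key] = build()
        return self._store[key]


# Example usage with made-up model, version, and tool names.
tools = [{"name": "search", "parameters": {"query": "string"}}]
cache = PrefixCache()

key = cache_key("example-model", "prompt-v1", tools, "policy-1")
stable_block = cache.get_or_build(key, lambda: "You are a helpful assistant.")

# Bumping the prompt version yields a new key, so the old entry is never reused.
new_key = cache_key("example-model", "prompt-v2", tools, "policy-1")
```

Because every input that affects the stable block is part of the key, invalidation needs no separate purge step in this sketch: a changed prompt, tool schema, or policy version simply misses the cache. Stale entries would still need eviction in a long-running process.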