subagentcontext

.com context management

← all concepts

What prompt caching is

prompt_caching ctx_prompt_caching_what
description
Prompt caching lets a client mark a prefix of a prompt (e.g. a long system prompt, a set of tool definitions, or a large shared document) as cacheable, so that repeated requests reusing that same prefix are billed and processed more cheaply and with lower latency than reprocessing it from scratch each time.
how it works

On a cache hit, the model does not need to reprocess the cached prefix's tokens the same way it would a fully fresh prompt — the provider serves the cached computation, and only the new (non-cached) portion of the prompt is processed at full cost/latency.

source note

Grounded in docs/docs/platform.claude.com/docs/en/build-with-claude/prompt-caching.md, mirrored in this repo per CLAUDE.md.

provenance

created 2026-07-02 08:27:03 · JSON