
Prompt caching is functionally identical to snapshotting the model's state after it has processed the prompt. And you need the KV cache for inference anyway, so keeping it around doesn't even cost extra memory if every single inference task shares the same prompt prefix.
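A minimal sketch of the idea, using a toy single-head attention in pure Python (all names, dimensions, and the `embed`/`process` helpers are illustrative, not any real model's API): processing the shared prompt once yields a KV snapshot, and extending that snapshot for each continuation produces the same output as re-reading the whole prompt from scratch.

```python
import math

D = 4  # embedding dimension (toy value)

def attend(q, ks, vs):
    """Causal attention for one query over the cached keys/values."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(D) for k in ks]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return [sum(w * v[j] for w, v in zip(weights, vs)) for j in range(D)]

def embed(tok):
    # Stand-in for real token embeddings: a deterministic pseudo-vector.
    return [math.sin(tok * (j + 1)) for j in range(D)]

def process(tokens, kv=None):
    """Run tokens through the 'model', extending a prior KV cache if given."""
    ks, vs = ([], []) if kv is None else (list(kv[0]), list(kv[1]))
    out = None
    for t in tokens:
        x = embed(t)
        ks.append(x)  # in a real model these would be K = W_k x, V = W_v x
        vs.append(x)
        out = attend(x, ks, vs)
    return out, (ks, vs)

# Process the shared prompt once and snapshot its KV cache.
_, prompt_kv = process([1, 2, 3])

# Two different continuations reuse the snapshot instead of re-reading the prompt.
out_a, _ = process([7], kv=prompt_kv)
out_b, _ = process([9], kv=prompt_kv)

# Recomputing from scratch gives the same result as reusing the cache.
full_a, _ = process([1, 2, 3, 7])
assert all(abs(a - b) < 1e-12 for a, b in zip(out_a, full_a))
```

Because attention is causal, the keys and values for the prompt tokens don't depend on anything that comes after them, which is why the snapshot is valid for any continuation — and why only a shared prefix (not an arbitrary shared substring) is cacheable this way.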

