Context Caching
Storing the computed representation of a prompt prefix (typically its attention key/value states) for reuse across multiple requests that share the same prefix. Context caching reduces processing time and cost for repeated system prompts or shared document contexts. Available in several commercial LLM APIs.
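The reuse pattern can be sketched as a cache keyed by a hash of the prefix. This is an illustrative sketch, not any provider's API: `compute_prefix_state` is a hypothetical placeholder for the expensive model-side work (real systems cache attention key/value states), and the cache here is a plain in-process dict.

```python
import hashlib

# Cache mapping hash-of-prefix -> precomputed prefix state (illustrative).
_prefix_cache: dict[str, str] = {}


def compute_prefix_state(prefix: str) -> str:
    # Placeholder for the expensive computation that real systems cache
    # (e.g., the prefix's attention key/value states).
    return f"state({len(prefix)} chars)"


def run_request(prefix: str, query: str) -> tuple[str, bool]:
    """Process a request, reusing the cached prefix state when available.

    Returns the combined result and whether the cache was hit.
    """
    key = hashlib.sha256(prefix.encode()).hexdigest()
    hit = key in _prefix_cache
    if not hit:
        _prefix_cache[key] = compute_prefix_state(prefix)
    state = _prefix_cache[key]
    return f"{state} + {query}", hit
```

With a shared system prompt, the first request pays the full cost and later requests hit the cache; in hosted APIs the same idea is exposed as explicit cache handles or automatic prefix matching.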