Andrea Barghigiani

Prompt Caching

This lesson is a simple one, and generally there's nothing you have to do because most major providers implement it by default!

Prompt caching is, as the name suggests, the ability of a model to cache tokens it has already seen. This is perfect in a chat context, where the user adds new tokens to the conversation every time they press Send.

As you should know by now, with each new message we also send all the previous messages, and this is where the prompt caching magic happens!
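To make that concrete, here's a minimal sketch of that kind of chat loop. I'm assuming the Vercel AI SDK with an OpenAI model here (the setup is illustrative, not the lesson's exact code): every turn pushes the new user message onto the same array and re-sends the whole thing.

```ts
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

// The whole conversation lives in one array; every turn re-sends all of it.
const messages: { role: "user" | "assistant"; content: string }[] = [];

async function send(userInput: string) {
  messages.push({ role: "user", content: userInput });

  const { text } = await generateText({
    model: openai("gpt-4o-mini"), // illustrative model choice
    messages, // the already-seen prefix is what the provider can serve from cache
  });

  messages.push({ role: "assistant", content: text });
  return text;
}
```

Only the last user message is genuinely new to the provider; everything before it is an unchanged prefix, and that prefix is exactly what prompt caches match on.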

The previous messages still count toward the context window, but they're billed differently (cached tokens are generally cheaper).
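You can actually see this in the API response. OpenAI, for example, reports cached tokens in the usage object. A rough sketch with the official openai Node package (the field names are OpenAI-specific; other providers report this differently):

```ts
import OpenAI from "openai";

const client = new OpenAI();

const completion = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});

// How many prompt tokens were sent, and how many of those hit the cache.
console.log("prompt tokens:", completion.usage?.prompt_tokens);
console.log("cached tokens:", completion.usage?.prompt_tokens_details?.cached_tokens);
```

Don't be surprised if `cached_tokens` is 0 at first: providers only start caching once the prompt prefix passes a minimum length, so short conversations won't show a cache hit right away.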

Matt built a nice example that leverages the Tiktoken package to show in the terminal what is being cached and what isn't. Just have a look at the lesson or run the example to see it live!
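If you want to poke at the idea yourself before opening the lesson, here's a rough approximation, assuming the js-tiktoken package (not necessarily the package or encoding Matt uses): count the tokens of the already-sent prefix versus the brand-new message.

```ts
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("cl100k_base"); // encoding choice is illustrative

// Messages we already sent on previous turns (the cacheable prefix)...
const previousMessages = [
  { role: "user", content: "Explain prompt caching in one sentence." },
  { role: "assistant", content: "Providers reuse work for prompt prefixes they've already processed." },
];
// ...and the new message for this turn.
const newMessage = { role: "user", content: "And why does that make it cheaper?" };

const countTokens = (text: string) => enc.encode(text).length;

const cacheCandidates = previousMessages.reduce((sum, m) => sum + countTokens(m.content), 0);
const freshTokens = countTokens(newMessage.content);

console.log(`previously seen (cache candidates): ~${cacheCandidates} tokens`);
console.log(`new this turn: ~${freshTokens} tokens`);
```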


Andrea Barghigiani
Frontend and Product Engineer in Palermo, He/Him
cupofcraft