Cache OpenAI

A simple caching layer for the OpenAI API, designed to reduce redundant API calls and save time and money. It works by intercepting API requests and storing their responses in a cache. When the same request is made again, the cached response is returned instead of making a new API call.
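
The intercept-and-store idea can be sketched as a wrapper around a fetch-compatible function. This is an illustration under stated assumptions, not the package's actual implementation: `FetchLike`, `CachedEntry`, and `withCache` are hypothetical names, the store is a plain `Map` rather than a cacheable instance, and the cache key is simply the method, URL, and body serialized together.

```typescript
// Minimal shape of a fetch-compatible function (hypothetical alias for this sketch).
type FetchLike = (
  url: string,
  init?: { method?: string; body?: string },
) => Promise<Response>;

// What is kept per cache entry: enough to rebuild an equivalent Response.
interface CachedEntry {
  status: number;
  body: string;
}

// Wrap `upstream` so that identical requests are answered from `store`.
function withCache(upstream: FetchLike, store: Map<string, CachedEntry>): FetchLike {
  return async (url, init) => {
    // Identical method + URL + body produce the same key.
    const key = JSON.stringify([init?.method ?? "GET", url, init?.body ?? ""]);

    const hit = store.get(key);
    if (hit !== undefined) {
      // Cache hit: rebuild a Response without touching the network.
      return new Response(hit.body, { status: hit.status });
    }

    const response = await upstream(url, init);
    if (response.ok) {
      // Read the body from a clone so the caller can still consume `response`.
      const body = await response.clone().text();
      store.set(key, { status: response.status, body });
    }
    return response;
  };
}
```

A real layer would likely hash the key and handle binary and streamed bodies as well; the sketch only covers text responses.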

It supports text responses, binary responses (e.g. images), and streaming (SSE).

It is based on the cacheable library, which provides a simple interface for caching data with support for various storage backends (like in-memory, Redis, SQLite, etc). This allows you to easily integrate caching into your OpenAI API usage without having to manage the caching logic yourself.

Why cache the OpenAI API?

Cut costs — OpenAI bills per token. Repeated prompts during development, tests, or recurring user flows pay full price every time; cache hits cost nothing.
Cut latency — A local SQLite/Redis hit returns in milliseconds vs. seconds round-tripping to OpenAI. Especially impactful for long completions, image generation, and TTS.
Determinism & resilience — Re-running the same script gives the same answer (no flaky tests), and cached responses keep working during API outages, rate limits, or quota exhaustion.

You can use any Keyv storage backend (like Redis, filesystem, etc) to store the cached responses. See the Keyv documentation for more details on available storage options and how to set them up. In the example below, we use a SQLite database to persist the cache.

Installation

npm install @jeromeetienne/openai-cache

If you want to use the SQLite storage backend, you also need to install the @keyv/sqlite package:

npm install @keyv/sqlite

Usage

import OpenAI from "openai";
import OpenAICache from "@jeromeetienne/openai-cache";
import KeyvSqlite from "@keyv/sqlite";
import { Cacheable } from "cacheable";

// init a cacheable instance
// - here it is backed by a sqlite database, but you can use any Keyv storage backend (redis, filesystem, etc)
// - note: __dirname only exists in CommonJS; in an ES module, use import.meta.dirname (Node 20.11+) instead
const sqlitePath = `sqlite://${__dirname}/.openai_cache.sqlite`;
const sqliteCache = new Cacheable({ secondary: new KeyvSqlite(sqlitePath) });

// init the OpenAICache with the cacheable instance
const openaiCache = new OpenAICache(sqliteCache);

// init the OpenAI client with the cache's fetch function
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  fetch: openaiCache.getFetchFn(),
});

// now use it normally - responses will be cached in the sqlite database
const response = await client.responses.create({
  model: "gpt-4.1-mini",
  input: "Say hello in one short sentence.",
});

console.log(response.output_text);

Environment Variables

Variable	Values	Description
`OPENAI_CACHE`	`disabled`	Always-live: never read from the cache, always call the OpenAI API. Responses are still written to the cache. Useful for testing/debugging without changing code.
`OPENAI_CACHE`	`offline`	Cache-only: serve hits from the cache, but on a miss throw instead of making a live request. The mirror image of `disabled`. Useful for deterministic, zero-cost replay — a miss fails loudly so you know a request was not pre-recorded, rather than silently paying for it.
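
The two values above, together with the default behavior when the variable is unset, amount to a small decision table. The sketch below is one way to express it. `CacheMode`, `Action`, `parseMode`, and `decide` are hypothetical names, and treating any unrecognized value as the default mode is an assumption rather than documented behavior.

```typescript
type CacheMode = "default" | "disabled" | "offline";
type Action = "serve-from-cache" | "call-api-and-store" | "throw-on-miss";

// Map the raw OPENAI_CACHE value (e.g. process.env.OPENAI_CACHE) to a mode.
// Assumption: anything other than the two documented values means "default".
function parseMode(raw: string | undefined): CacheMode {
  return raw === "disabled" || raw === "offline" ? raw : "default";
}

// Decide what to do for one request, given whether the cache holds an entry.
function decide(mode: CacheMode, hasCachedEntry: boolean): Action {
  // disabled: never read from the cache, but still write the live response.
  if (mode === "disabled") return "call-api-and-store";
  // default and offline both serve hits from the cache.
  if (hasCachedEntry) return "serve-from-cache";
  // On a miss, offline fails loudly; default falls through to the API.
  return mode === "offline" ? "throw-on-miss" : "call-api-and-store";
}
```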

PRO/CON

PRO: Reduces redundant API calls, saving time and costs.
NOTE: When temperature === 0, caching works best because responses are close to deterministic. With temperature > 0, caching reduces variety across calls: identical prompts return the cached result instead of a newly sampled response.
NOTE: Only successful responses (2xx) are cached. Error responses (4xx/5xx) are returned normally but are not persisted.

Possible improvements

Don't cache when temperature > 0 or top_p < 1: caching would freeze the randomness.
- NOTE: make this configurable through the options.
Add a configurable cache policy for errors (for example, cache selected deterministic 4xx while never caching 429/5xx).
Errors from tool requests should not be cached.
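
As a rough illustration of the sampling and error-policy ideas above, the sketch below shows two predicates a cache could consult before storing a response. None of this exists in the package today: `SamplingParams`, `isSamplingDeterministic`, and `shouldCacheStatus` are hypothetical names, and treating an omitted `temperature` or `top_p` as 1 is an assumption about how such a check would handle missing fields.

```typescript
interface SamplingParams {
  temperature?: number;
  top_p?: number;
}

// True when the request asks for no sampling randomness.
// Assumption: omitted fields fall back to 1.
function isSamplingDeterministic(params: SamplingParams): boolean {
  const temperature = params.temperature ?? 1;
  const topP = params.top_p ?? 1;
  // Mirrors the rule "don't cache if temperature > 0 or top_p < 1".
  return !(temperature > 0 || topP < 1);
}

// Error policy: always cache 2xx, never cache 429 or 5xx,
// and cache any other status only when it is explicitly listed.
function shouldCacheStatus(
  status: number,
  cacheableErrors: ReadonlySet<number>,
): boolean {
  if (status >= 200 && status < 300) return true;
  if (status === 429 || status >= 500) return false;
  return cacheableErrors.has(status);
}
```

Keeping both checks as pure functions would make it straightforward to expose them as options later.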

Developer Notes

Q. How to disable the cache?

A. Set the OPENAI_CACHE environment variable to disabled:

OPENAI_CACHE=disabled node your_app.js

The cache is still written to, but cached responses are ignored and every request goes to the OpenAI API. This is useful for testing or debugging when you want to bypass the cache without changing your code.

Q. How to run cache-only (offline)?

A. Set the OPENAI_CACHE environment variable to offline:

OPENAI_CACHE=offline node your_app.js

It serves responses from the cache and throws on a miss instead of making a live request — the mirror image of disabled. This is useful for deterministic, zero-cost replay (CI, tests, demos): a miss fails loudly so you immediately know a request was not pre-recorded, rather than silently paying for a live call. To record a missing request, run once without OPENAI_CACHE=offline.

Q. How to know if a given call was a cache hit or miss?

A. You can enable the markResponseEnabled option when initializing the OpenAICache. When this option is enabled, the cache will add a custom property to the response object to indicate whether it was a cache hit or miss.

const openaiCache = new OpenAICache(sqliteCache, {
    markResponseEnabled: true, // default is false
});

// later, when you make a call, you can check the custom property to see if it was a cache hit or miss
const response = await client.responses.create({
    model: "gpt-4.1-mini",
    input: "Say hello in one short sentence.",
});

if (response.X_FROM_OPENAI_CACHE) {
    console.log("Cache hit!");
} else {
    console.log("Cache miss!");
}
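
In TypeScript, the marker is presumably not part of the OpenAI SDK's response types, so reading it directly may not type-check. A small helper can hide the cast and keep a running tally. This is a sketch: `isCacheHit` and `HitCounter` are hypothetical names, and it assumes only what the snippet above shows, namely that `X_FROM_OPENAI_CACHE` is truthy on a hit.

```typescript
// Read the marker without relying on the SDK's response type.
function isCacheHit(response: unknown): boolean {
  const marked = response as { X_FROM_OPENAI_CACHE?: unknown } | null | undefined;
  return Boolean(marked?.X_FROM_OPENAI_CACHE);
}

// Running tally of hits and misses across calls.
class HitCounter {
  hits = 0;
  misses = 0;

  record(response: unknown): void {
    if (isCacheHit(response)) {
      this.hits += 1;
    } else {
      this.misses += 1;
    }
  }
}
```

Calling `counter.record(response)` after each request gives a quick hit rate for a test run.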

Q. How to publish the package to npm?

A. Run the following command:

npm run version:patch && npm run publish:all

Note: expect trouble with the 2FA (two-factor authentication) system when publishing.

Relevant Documentation:
