Memoization for Performance

Learn when Clojure memoization speeds up repeated work, when the built-in cache is the wrong tool, and how to avoid common traps with recursive and side-effecting functions.

Memoization is one of the simplest ways to trade memory for speed, but it only helps when the function and workload actually fit that trade. Applied well, it can eliminate repeated expensive computation. Applied casually, it can create stale results, unbounded memory growth, or misleading benchmarks.

Memoization: Caching the result of a function call so repeated calls with the same inputs can reuse prior work instead of recomputing it.

In Clojure, the main tool is memoize. It is intentionally small and opinionated: thread-safe, easy to use, and unbounded. That makes it powerful for the right cases and dangerous for the wrong ones.

When memoize Helps

Memoization is a strong fit when all of these are mostly true:

  • the function is pure or effectively pure
  • repeated calls with the same arguments are common
  • recomputation is expensive enough to matter
  • the input space is reasonably bounded
  • cached results can safely live as long as the memoized wrapper lives

Typical examples include:

  • dynamic programming helpers
  • expensive parsing of a small set of stable inputs
  • repeated derivation of metadata from immutable values
  • exploratory data processing where the same expensive lookup is performed many times

If one of those assumptions breaks, plain memoization becomes much less attractive.

What the Built-In Cache Actually Does

memoize wraps a function and stores results keyed by its arguments:

    (def expensive->report
      (memoize
        (fn [account-id]
          (Thread/sleep 500)
          {:account-id account-id
           :score (* 10 account-id)})))
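A quick way to see the cache at work: the second call with the same argument skips the sleep entirely and hands back the stored value itself, so this usage sketch (reusing `expensive->report` from above) holds:

```clojure
;; First call pays the 500 ms cost; the repeat is a cache hit and
;; returns the same stored map, not a freshly built one.
(def r1 (expensive->report 7))   ; ~500 ms
(def r2 (expensive->report 7))   ; effectively instant
(identical? r1 r2)               ;=> true
```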

Design-wise, that means:

  • the cache is thread-safe
  • the cache is unbounded
  • there is no built-in expiration
  • there is no built-in invalidation hook
  • entries stay around for the lifetime of the memoized wrapper

That is perfect for some workloads and a bad fit for many service-layer caching problems.
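To make the unbounded-growth point concrete, here is a toy illustration with a deliberately cheap function: every distinct argument becomes a permanent entry.

```clojure
;; memoize never evicts: after this loop, 100000 results stay pinned
;; for as long as the `keyed` var holds this wrapper.
(def keyed (memoize identity))
(doseq [id (range 100000)]
  (keyed id))
```

With a function whose results are large, the same access pattern turns into a slow memory leak.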

The Recursive Example, Done Correctly

One common teaching example uses Fibonacci. But many introductions accidentally demonstrate the wrong memoization pattern.

This version looks reasonable but does not fix the recursive explosion:

    (defn slow-fib [n]
      (if (< n 2)
        n
        (+ (slow-fib (- n 1))
           (slow-fib (- n 2)))))

    (def fast-fib (memoize slow-fib))

The recursive calls inside slow-fib still call slow-fib, not fast-fib, so they bypass the cache.
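You can observe the problem directly: only the outermost call lands in the cache, so each new argument still triggers the full exponential recursion.

```clojure
(time (fast-fib 30))  ; slow: every inner call recomputes
(time (fast-fib 30))  ; fast: top-level cache hit
(time (fast-fib 31))  ; slow again: the cached result for 30 is never consulted
```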

The correct approach routes recursion back through the memoized var:

    (declare fib)

    (def fib
      (memoize
        (fn [n]
          (if (< n 2)
            n
            (+ (fib (- n 1))
               (fib (- n 2)))))))

That works because overlapping subproblems now reuse cached results instead of recomputing them.
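With the recursion routed through the memoized var, each subproblem is computed exactly once, so inputs that are hopeless for the naive version return immediately:

```clojure
;; Linear work instead of exponential: each n from 0 to 80 is
;; computed once and then served from the cache.
(fib 80)  ;=> 23416728348467685
```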

What You Should Not Memoize

Plain memoization is the wrong tool for:

  • side-effecting functions such as logging, writes, or HTTP calls
  • results that change over time
  • very large results you cannot afford to retain indefinitely
  • functions keyed by a huge or effectively unbounded input space
  • logic whose correctness depends on mutable external state

For example, this is a bad fit:

    (def latest-profile
      (memoize
        (fn [user-id]
          ;; http/get from an HTTP client such as clj-http.client
          (http/get (str "https://api.example.com/users/" user-id)))))

Now cache policy, staleness, and network behavior are all hidden behind one function wrapper with no expiration story.
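If staleness up to some bound is acceptable, a small wrapper with explicit expiry at least makes the policy visible. This is a minimal sketch, not a library API: the name `ttl-memoize` and the single-atom cache are illustrative, and libraries such as clojure.core.memoize offer production-grade equivalents.

```clojure
;; Minimal TTL cache sketch: entries older than ttl-ms are recomputed.
;; Not production-grade: expired-but-unused keys are never evicted, and
;; two threads can compute the same value concurrently under contention.
(defn ttl-memoize [f ttl-ms]
  (let [cache (atom {})]
    (fn [& args]
      (let [now (System/currentTimeMillis)
            {:keys [at val]} (get @cache args)]
        (if (and at (< (- now at) ttl-ms))
          val
          (let [result (apply f args)]
            (swap! cache assoc args {:at now :val result})
            result))))))
```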

Scope Matters More Than Technique

Often the better question is not “Should I memoize this function?” but “At what scope should this value be cached?”

Possible scopes include:

  • inside one local computation
  • for the lifetime of a request
  • for the lifetime of a process
  • in a shared external cache

Plain memoize is usually strongest at the first two. Once you need TTLs, eviction, shared consistency, metrics, or explicit invalidation, you are designing a caching policy, not just wrapping a function.
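For the request-lifetime scope, a local memoize inside the handler is often enough: the cache dies with the request, so nothing can grow unboundedly or go stale across requests. A sketch, assuming a hypothetical `load-profile` fetcher:

```clojure
;; `load-profile` is a stand-in for whatever fetches one profile.
(defn handle-request [user-ids]
  (let [profile (memoize load-profile)]  ; fresh cache per request
    (mapv profile user-ids)))            ; duplicate ids fetched once
```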

A Practical Checklist

Before memoizing a function, ask:

  1. Is the function referentially transparent for the inputs I care about?
  2. Do repeated calls with the same inputs happen often enough?
  3. Is the saved work larger than the lookup overhead?
  4. Can I bound or at least reason about cache growth?
  5. Is stale data acceptable for the lifetime of the wrapper?

If the answer to several of those is “no”, plain memoization is probably not the right design.

Decision Flow

    graph TD;
        A["Function call"] --> B{"Pure and stable for same args?"}
        B -->|No| C["Do not memoize directly"]
        B -->|Yes| D{"Repeated calls with same args?"}
        D -->|No| E["Cache adds little value"]
        D -->|Yes| F{"Input space bounded enough?"}
        F -->|No| G["Need bounded or expiring cache strategy"]
        F -->|Yes| H["`memoize` is a good candidate"]

The diagram above is the right decision model: memoization is not a generic performance flag. It is a very specific bet on repeated pure work and an acceptable cache lifetime.

Revised on Thursday, April 23, 2026