Memoization for Performance

Learn when Clojure memoization speeds up repeated work, when the built-in cache is the wrong tool, and how to avoid common traps with recursive and side-effecting functions.

Memoization is one of the simplest ways to trade memory for speed, but it only helps when the function and workload actually fit that trade. Applied well, it can eliminate repeated expensive computation. Applied casually, it can create stale results, unbounded memory growth, or misleading benchmarks.

Memoization: Caching the result of a function call so repeated calls with the same inputs can reuse prior work instead of recomputing it.

In Clojure, the main tool is memoize. It is intentionally small and opinionated: thread-safe, easy to use, and unbounded. That makes it powerful for the right cases and dangerous for the wrong ones.

When memoize Helps

Memoization is a strong fit when all of these are mostly true:

  • the function is pure or effectively pure
  • repeated calls with the same arguments are common
  • recomputation is expensive enough to matter
  • the input space is reasonably bounded
  • cached results can safely live as long as the memoized wrapper lives

Typical examples include:

  • dynamic programming helpers
  • expensive parsing of a small set of stable inputs
  • repeated derivation of metadata from immutable values
  • exploratory data processing where the same expensive lookup is performed many times

If one of those assumptions breaks, plain memoization becomes much less attractive.

What the Built-In Cache Actually Does

memoize wraps a function and stores results keyed by its arguments:

    (def expensive->report
      (memoize
        (fn [account-id]
          (Thread/sleep 500)
          {:account-id account-id
           :score (* 10 account-id)})))
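A quick way to see the cache at work: the second call with the same argument skips the sleep entirely and hands back the stored value itself, so this usage sketch (reusing `expensive->report` from above) holds:

```clojure
;; First call pays the 500 ms cost; the repeat is a cache hit and
;; returns the same stored map, not a freshly built one.
(def r1 (expensive->report 7))   ; ~500 ms
(def r2 (expensive->report 7))   ; effectively instant
(identical? r1 r2)               ;=> true
```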

Design-wise, that means:

  • the cache is thread-safe
  • the cache is unbounded
  • there is no built-in expiration
  • there is no built-in invalidation hook
  • entries stay around for the lifetime of the memoized wrapper

That is perfect for some workloads and a bad fit for many service-layer caching problems.
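To make the unbounded-growth point concrete, here is a toy illustration with a deliberately cheap function: every distinct argument becomes a permanent entry.

```clojure
;; memoize never evicts: after this loop, 100000 results stay pinned
;; for as long as the `keyed` var holds this wrapper.
(def keyed (memoize identity))
(doseq [id (range 100000)]
  (keyed id))
```

With a function whose results are large, the same access pattern turns into a slow memory leak.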

The Recursive Example, Done Correctly

One common teaching example uses Fibonacci. But many introductions accidentally demonstrate the wrong memoization pattern.

This version looks reasonable but does not fix the recursive explosion:

    (defn slow-fib [n]
      (if (< n 2)
        n
        (+ (slow-fib (- n 1))
           (slow-fib (- n 2)))))

    (def fast-fib (memoize slow-fib))

The recursive calls inside slow-fib still call slow-fib, not fast-fib, so they bypass the cache.
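You can observe the problem directly: only the outermost call lands in the cache, so each new argument still triggers the full exponential recursion.

```clojure
(time (fast-fib 30))  ; slow: every inner call recomputes
(time (fast-fib 30))  ; fast: top-level cache hit
(time (fast-fib 31))  ; slow again: the cached result for 30 is never consulted
```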

The correct approach routes recursion back through the memoized var:

    (declare fib)

    (def fib
      (memoize
        (fn [n]
          (if (< n 2)
            n
            (+ (fib (- n 1))
               (fib (- n 2)))))))

That works because overlapping subproblems now reuse cached results instead of recomputing them.
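With the recursion routed through the memoized var, each subproblem is computed exactly once, so inputs that are hopeless for the naive version return immediately:

```clojure
;; Linear work instead of exponential: each n from 0 to 80 is
;; computed once and then served from the cache.
(fib 80)  ;=> 23416728348467685
```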

What You Should Not Memoize

Plain memoization is the wrong tool for:

  • side-effecting functions such as logging, writes, or HTTP calls
  • results that change over time
  • very large results you cannot afford to retain indefinitely
  • functions keyed by a huge or effectively unbounded input space
  • logic whose correctness depends on mutable external state

For example, this is a bad fit:

    (def latest-profile
      (memoize
        (fn [user-id]
          ;; http/get from an HTTP client such as clj-http.client
          (http/get (str "https://api.example.com/users/" user-id)))))

Now cache policy, staleness, and network behavior are all hidden behind one function wrapper with no expiration story.
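If staleness up to some bound is acceptable, a small wrapper with explicit expiry at least makes the policy visible. This is a minimal sketch, not a library API: the name `ttl-memoize` and the single-atom cache are illustrative, and libraries such as clojure.core.memoize offer production-grade equivalents.

```clojure
;; Minimal TTL cache sketch: entries older than ttl-ms are recomputed.
;; Not production-grade: expired-but-unused keys are never evicted, and
;; two threads can compute the same value concurrently under contention.
(defn ttl-memoize [f ttl-ms]
  (let [cache (atom {})]
    (fn [& args]
      (let [now (System/currentTimeMillis)
            {:keys [at val]} (get @cache args)]
        (if (and at (< (- now at) ttl-ms))
          val
          (let [result (apply f args)]
            (swap! cache assoc args {:at now :val result})
            result))))))
```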

Scope Matters More Than Technique

Often the better question is not “Should I memoize this function?” but “At what scope should this value be cached?”

Possible scopes include:

  • inside one local computation
  • for the lifetime of a request
  • for the lifetime of a process
  • in a shared external cache

Plain memoize is usually strongest at the first two. Once you need TTLs, eviction, shared consistency, metrics, or explicit invalidation, you are designing a caching policy, not just wrapping a function.
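For the request-lifetime scope, a local memoize inside the handler is often enough: the cache dies with the request, so nothing can grow unboundedly or go stale across requests. A sketch, assuming a hypothetical `load-profile` fetcher:

```clojure
;; `load-profile` is a stand-in for whatever fetches one profile.
(defn handle-request [user-ids]
  (let [profile (memoize load-profile)]  ; fresh cache per request
    (mapv profile user-ids)))            ; duplicate ids fetched once
```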

A Practical Checklist

Before memoizing a function, ask:

  1. Is the function referentially transparent for the inputs I care about?
  2. Do repeated calls with the same inputs happen often enough?
  3. Is the saved work larger than the lookup overhead?
  4. Can I bound or at least reason about cache growth?
  5. Is stale data acceptable for the lifetime of the wrapper?

If the answer to several of those is “no”, plain memoization is probably not the right design.

Decision Flow

    graph TD;
        A["Function call"] --> B{"Pure and stable for same args?"}
        B -->|No| C["Do not memoize directly"]
        B -->|Yes| D{"Repeated calls with same args?"}
        D -->|No| E["Cache adds little value"]
        D -->|Yes| F{"Input space bounded enough?"}
        F -->|No| G["Need bounded or expiring cache strategy"]
        F -->|Yes| H["`memoize` is a good candidate"]

The diagram above is the right decision model: memoization is not a generic performance flag. It is a very specific bet on repeated pure work and an acceptable cache lifetime.

Revised on Thursday, April 23, 2026