Learn when `pmap` helps, when `future` is the better tool, and why parallelism only pays off for the right workload shape in Clojure.
- `future`: one asynchronous computation whose result will be available later.
- `pmap`: a parallel, lazy mapping tool for independent, usually CPU-bound work across a collection.
`future` and `pmap` are both simple, but they solve different problems. Confusing them leads to two classic mistakes:
- using `future` to build a whole data-processing topology
- using `pmap` on work that is too small, too lazy, or too effect-heavy to benefit

## `future` Is a One-Task Tool

`future` is a good fit when one unit of work can be launched and joined later:
```clojure
(def parsed
  (future
    (parse-large-file path)))

;; later
@parsed
```
This is useful for one-off background work: start an expensive computation early, then deref it when the result is needed.
It is not a job scheduler, queue, or stream processor. Once you start building coordination around many futures, you usually need a more explicit model.
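The launch-and-join shape can be sketched with a bounded join; the summation here is a stand-in for real expensive work:

```clojure
;; Launch one expensive computation in the background.
(def summed
  (future
    (reduce + (range 10000000))))   ; stand-in for parsing, scoring, etc.

;; ...other work happens here...

;; Join later; the three-argument deref bounds the wait,
;; returning ::timed-out instead of blocking forever.
(deref summed 5000 ::timed-out)
```

The timeout arity of `deref` is often worth using at the join point, since a plain `@` blocks indefinitely if the background work hangs.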
## `pmap` Is Collection Parallelism

`pmap` is best when:

- the per-element work is independent
- the work is CPU-bound rather than blocking
- each element is expensive enough to justify the coordination overhead
```clojure
(def scores
  (doall
    (pmap score-document docs)))
```
The `doall` matters because `pmap` is lazy. If you never realize the sequence, you may think the work is parallelized while results are actually being forced slowly at the consumer edge.
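The laziness is easy to demonstrate with a counter; `slow-inc` is a made-up function for illustration:

```clojure
(def calls (atom 0))

(defn slow-inc [x]
  (swap! calls inc)   ; record that the work actually ran
  (inc x))

;; Building the lazy seq runs nothing yet:
;; @calls is still 0 at this point.
(def results (pmap slow-inc (range 100)))

;; Forcing the whole sequence runs all the work;
;; @calls is now 100.
(doall results)
```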
## `pmap` Often Disappoints

`pmap` is frequently slower than expected because:

- per-element coordination overhead dominates when each unit of work is small
- its laziness ties the parallel work to downstream consumption
- it preserves input order, so one slow element can hold back progress
`pmap` is not a magic speedup switch. It works when the workload is large enough to amortize coordination cost.
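A quick benchmark sketch shows how the workload shape matters (absolute timings vary by machine; `burn` is an artificial CPU-bound stand-in):

```clojure
;; Tiny per-element work: pmap's coordination overhead usually dominates,
;; and plain map is often faster.
(time (doall (map inc (range 100000))))
(time (doall (pmap inc (range 100000))))

;; Expensive per-element work: pmap has room to pay off.
(defn burn [x]
  (reduce + (range 200000))   ; artificial CPU work per element
  x)

(time (doall (map burn (range 64))))
(time (doall (pmap burn (range 64))))
```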
Use `pmap` for independent CPU-bound transformations. For blocking I/O, it is usually better to use:

- `future`
- `thread` (from `core.async`)
- `pipeline-blocking` (from `core.async`)

Blocking work has different saturation behavior from CPU-bound work, so the execution model should reflect that.
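For example, blocking calls can be launched eagerly with one `future` per item; `fetch-url` here is a hypothetical stand-in for a real HTTP request:

```clojure
;; Hypothetical blocking I/O call; stands in for an HTTP request.
(defn fetch-url [url]
  (Thread/sleep 100)   ; simulate network latency
  {:url url :status 200})

(defn fetch-all [urls]
  ;; mapv is eager, so every request starts before we wait on any of them;
  ;; total latency approaches the slowest single call, not the sum.
  (->> urls
       (mapv #(future (fetch-url %)))
       (mapv deref)))

(fetch-all ["a" "b" "c"])
```

This launches an unbounded number of threads, which is fine for a handful of calls; for large batches, `pipeline-blocking` gives explicit control over the degree of parallelism.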
Because `pmap` is lazy, downstream realization determines when work is actually forced. That has two practical consequences:

- no work starts until the sequence is consumed
- the rate of consumption caps how much parallel work is in flight
This is why doall, dorun, or an explicit downstream collection boundary often appears in correct pmap usage. The laziness is not wrong, but it changes how parallelism shows up in the program.
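The difference between the forcing forms, as a sketch: `doall` forces and keeps the results, while `dorun` forces purely for side effects:

```clojure
;; doall forces the lazy seq and returns the realized results.
(def processed (doall (pmap inc (range 5))))

;; dorun forces the seq for its side effects and returns nil.
(def printed (dorun (pmap println (range 5))))
```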
## Choosing Between `pmap` and `future`

Choose `future` when:

- the work is a single unit that can be launched now and joined later
- you do not need a queue, scheduler, or stream of results

Choose `pmap` when:

- the work is an independent, CPU-bound transformation across a collection
- each element is expensive enough to amortize coordination cost

The common mistakes are:

- using `pmap` on tiny units of work where coordination cost dominates
- using `pmap` without realizing the lazy results
- forgetting that `pmap` preserves collection order, which may force a slower shape than an unordered worker design

If the workload is not naturally an expensive parallel map, another abstraction usually explains the system better.
`pmap` often surprises people because the parallel work is still tied to consumption. If downstream code realizes only a few elements at a time, the apparent “parallel map” behaves more like a lazily paced computation than a fully active worker farm.
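A sketch of how consumption paces the work (the exact lookahead window depends on core count and sequence chunking):

```clojure
(def started (atom 0))

(defn tracked [x]
  (swap! started inc)   ; count how many tasks have actually run
  (* x x))

(def squares (pmap tracked (range 1000)))

;; Realizing only two elements does not run all 1000 tasks;
;; pmap stays only a small window ahead of consumption,
;; so @started stays far below 1000 here.
(vec (take 2 squares))
```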
That means review should ask:

- where the `pmap` results are actually realized
- whether a `doall` is actually intended

Use `future` for one background computation. Use `pmap` for independent, expensive collection transformations. If the workload needs queues, buffering, cancellation policy, or blocking-stage control, you have moved beyond both and should use a richer concurrency model.