Learn what reducers are actually for, when fold-based parallel reduction helps, and why reducers are best for large CPU-bound associative workloads rather than general-purpose pipeline code.
Reducer: A reducible transformation pipeline designed to make reduction more efficient, including parallel reduction with fold on suitable collections.
Reducers are not the default answer to collection processing in modern Clojure. For most ordinary transformation pipelines, sequences or transducers are the clearer tool. Reducers matter when you want efficient reduction, especially parallel reduction, over large foldable collections and the combine step is associative.
That narrower definition is important because reducers are often taught too broadly.
Reducers shine when all of these are true:
Vectors are a common fit because they can be partitioned efficiently for folding.
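    (require '[clojure.core.reducers :as r])

    ;; A vector, so r/fold can partition it
    (def data (vec (range 1 1000000)))

    (defn sum-of-squares [xs]
      ;; r/map builds a reducible recipe; r/fold does the (possibly parallel) summing
      (r/fold + (r/map #(* % %) xs)))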
r/fold partitions the collection, reduces the partitions in parallel, then combines the partial results. That is different from lazy sequences, which are fundamentally sequential in evaluation order.
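The partition-then-combine behaviour can be made more visible with the four-argument arity of r/fold, which takes the partition size, the combine function, and the reduce function separately. This is a minimal sketch: the name sum-of-squares-explicit and the partition size of 8192 are illustrative assumptions, not tuning advice (the default partition size is 512).

```clojure
(require '[clojure.core.reducers :as r])

(def data (vec (range 1 1000000)))

(defn sum-of-squares-explicit
  "Hypothetical variant of sum-of-squares that names every fold argument."
  [xs]
  (r/fold
    8192                          ; partition size: keep splitting until chunks are about this big
    +                             ; combinef: (+) => 0 seeds a chunk, (+ a b) merges two partial sums
    (fn [acc x] (+ acc (* x x)))  ; reducef: folds one element into a chunk's running sum
    xs))

(sum-of-squares-explicit [1 2 3]) ; => 14
```

Keeping combinef and reducef separate shows which function has to be associative: only the combine step is applied across partitions.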
These three tools are related but not interchangeable:
If your first instinct is “I need parallel map,” ask a better question: is the work CPU-bound, is the collection big, and can the final result be combined associatively? If any answer is no, reducers are probably the wrong tool.
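To see how the tools relate, here is the same computation written with a lazy sequence, a transducer, and a reducer. It is a sketch for comparison only; the names nums, total-seq, total-xf, and total-fold are made up for this example.

```clojure
(require '[clojure.core.reducers :as r])

(def nums (vec (range 1 100001)))

;; Lazy sequence: each step produces an intermediate lazy seq, evaluated sequentially
(defn total-seq [coll]
  (reduce + (map #(* % %) (filter even? coll))))

;; Transducer: one composed transformation, no intermediate seqs, still sequential
(defn total-xf [coll]
  (transduce (comp (filter even?) (map #(* % %))) + coll))

;; Reducer: same pipeline shape, but r/fold may split the vector and reduce chunks in parallel
(defn total-fold [coll]
  (r/fold + (r/map #(* % %) (r/filter even? coll))))

(= (total-seq nums) (total-xf nums) (total-fold nums)) ; => true
```

All three return the same sum. They differ in how the work is evaluated, which is why the CPU-bound, big-collection, associative-combine questions decide whether the reducer version is worth it.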
Reducers are a poor fit for:
cases where pmap or explicit worker channels communicate intent more clearly

Parallel overhead is real. If the workload is small, reducers can be slower than straightforward sequence code.
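Whether folding pays off on a given machine is easiest to settle by measuring. The sketch below times a sequence version and a fold version on a small and a large vector. The helper elapsed-ms and the sizes are assumptions made for illustration, and the timing is crude (no JIT warm-up control), so treat the numbers as a rough signal rather than a benchmark.

```clojure
(require '[clojure.core.reducers :as r])

(defn sum-seq  [coll] (reduce + (map inc coll)))
(defn sum-fold [coll] (r/fold + (r/map inc coll)))

(defn elapsed-ms
  "Hypothetical helper: calls f `runs` times and returns total wall-clock milliseconds."
  [runs f]
  (let [start (System/nanoTime)]
    (dotimes [_ runs] (f))
    (/ (- (System/nanoTime) start) 1e6)))

(def small (vec (range 1000)))
(def large (vec (range 1000000)))

(defn compare-timings
  "Returns a map of rough timings; results vary by machine and JVM state."
  []
  {:small-seq  (elapsed-ms 1000 #(sum-seq small))
   :small-fold (elapsed-ms 1000 #(sum-fold small))
   :large-seq  (elapsed-ms 10 #(sum-seq large))
   :large-fold (elapsed-ms 10 #(sum-fold large))})
```

Calling (compare-timings) a few times gives a feel for where the crossover sits for this workload. A dedicated benchmarking library would give more trustworthy numbers.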
A good fold has:
This works well:
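    (defn histogram [xs]
      (r/fold
        (fn
          ([] {})                           ; identity: empty histogram for each partition
          ([a b] (merge-with + a b)))       ; merge two histograms by summing counts
        ;; turn each number into a one-entry map keyed by its last digit
        (r/map #(hash-map (mod % 10) 1) xs)))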
The combine step is associative: two partial maps can be merged in any grouping.
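Associativity can be checked directly at the REPL. The sketch below compares two groupings of merge-with +, contrasts them with subtraction (which is not associative), and then shows one possible variant of the histogram that splits the combine and reduce functions. The name histogram-split and the sample maps a, b, and c are illustrative assumptions.

```clojure
(require '[clojure.core.reducers :as r])

(def a {0 1, 1 2})
(def b {1 3, 2 4})
(def c {2 5, 3 6})

;; merge-with + yields the same map whichever pair is merged first
(= (merge-with + (merge-with + a b) c)
   (merge-with + a (merge-with + b c)))  ; => true

;; subtraction depends on grouping, so it would be unsafe as a combine step
(= (- (- 10 4) 3) (- 10 (- 4 3)))        ; => false

(defn histogram-split
  "Hypothetical variant: separate combinef and reducef, no per-element map allocation."
  [xs]
  (r/fold
    (r/monoid (partial merge-with +) hash-map)     ; combinef: (hash-map) => {} is the identity
    (fn [m x] (update m (mod x 10) (fnil inc 0)))  ; reducef: bump the count for x's last digit
    xs))

(histogram-split (vec (range 100)))  ; each digit 0-9 maps to 10
```

r/monoid builds a function whose zero-argument arity supplies the identity value and whose two-argument arity applies the operation, which is the shape r/fold expects from a combine function.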
Use reducers when the performance goal is specifically efficient reduction over a big in-memory collection. If the design problem is streaming, I/O coordination, or general asynchronous work, reducers are the wrong abstraction. In those cases, channels, futures, or transducers usually express the design better.