How to deploy machine learning models from Clojure with clear inference boundaries, container builds, and operational practices that survive production traffic.
Deploying ML models in Clojure applications is mostly an operational design problem. The difficult part is rarely “how do I call a model?” The harder questions are how the model is versioned, how its inputs are validated, how its behavior is observed, and how new versions are rolled out and rolled back.
That is why a good deployment lesson should treat the model as part of a serving system, not as a magical function hidden behind one HTTP route.
There are two common shapes.

The first shape is embedded inference: the model loads inside the same Clojure service that handles business logic.

This works best when:

- the model is small enough to load into the application's JVM
- one team owns both the business logic and the model
- the latency budget leaves no room for an extra network hop

Advantages:

- no per-prediction network or serialization overhead
- a single artifact to build, deploy, and roll back

Risks:

- the model competes with business logic for heap and CPU
- shipping a new model means redeploying the whole service
The second shape is external inference: the Clojure system calls a dedicated model-serving service or internal inference component.

This works best when:

- the model is large, GPU-bound, or served by a non-JVM runtime
- several consumers need the same predictions
- the model team ships on a different cadence than the application team

Advantages:

- the model scales and deploys independently of the application
- the serving runtime can be whatever suits the model

Risks:

- every prediction pays a network and serialization cost
- there is one more service to monitor, secure, and keep available
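For the external shape, the call from Clojure is plain HTTP. A minimal sketch using the JDK's built-in `java.net.http` client follows; the service URL, the `/predict` route, and the payload shape are assumptions, not part of any specific model server's API:

```clojure
(ns myapp.remote-inference
  (:require [clojure.data.json :as json])
  (:import [java.net URI]
           [java.net.http HttpClient HttpRequest
                          HttpRequest$BodyPublishers HttpResponse$BodyHandlers]
           [java.time Duration]))

(def ^HttpClient client (HttpClient/newHttpClient))

(defn predict-request
  "Builds a POST request carrying the feature map as JSON.
  A timeout keeps a slow model from stalling the business request."
  ^HttpRequest [url features]
  (-> (HttpRequest/newBuilder (URI/create url))
      (.timeout (Duration/ofMillis 500))
      (.header "Content-Type" "application/json")
      (.POST (HttpRequest$BodyPublishers/ofString (json/write-str features)))
      (.build)))

(defn remote-predict
  "Sends features to a hypothetical model service and parses the JSON reply."
  [url features]
  (let [resp (.send client (predict-request url features)
                    (HttpResponse$BodyHandlers/ofString))]
    (json/read-str (.body resp) :key-fn keyword)))
```

Separating request construction from sending keeps the timeout and serialization choices testable without a live model service.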
Older examples often spend too much time on which routing library serves the endpoint. That matters less than the contract the endpoint exposes: validated inputs, responses that carry the model version, and behavior you can observe in production.
Any Clojure HTTP stack can expose inference. The stronger design question is whether the service contract makes model behavior traceable and operable.
A minimal embedded example can stay plain:
```clojure
(ns myapp.inference
  (:require [ring.adapter.jetty :refer [run-jetty]]
            [ring.util.response :refer [response content-type]]
            [clojure.data.json :as json]))

;; Explicit, swappable model state: the version travels with every response.
(defonce model-state (atom {:version "2026-03-01"
                            :loaded? true}))

(defn predict [input]
  {:model-version (:version @model-state)
   :prediction "placeholder"
   :input-size (count input)})

(defn handler [request]
  (let [payload (json/read-str (slurp (:body request)) :key-fn keyword)]
    (-> (predict payload)
        json/write-str
        response
        (content-type "application/json"))))

(defn start! []
  (run-jetty handler {:port 3000 :join? false}))
```
The important thing here is not Ring itself. It is that the response carries a model version and that the service has an explicit loaded-model state.
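That loaded-model state can also drive a health endpoint, so a failed model load takes the instance out of rotation instead of letting it serve garbage. A sketch, reusing an atom shaped like the one in the example above (the route wiring is left out):

```clojure
(ns myapp.health
  (:require [clojure.data.json :as json]))

;; Mirrors the model-state atom from the embedded example.
(defonce model-state (atom {:version "2026-03-01" :loaded? true}))

(defn health-response
  "Returns 200 only while a model is actually loaded; 503 otherwise,
  so load-balancer health checks stop routing traffic to this instance."
  []
  (let [{:keys [version loaded?]} @model-state]
    {:status  (if loaded? 200 503)
     :headers {"Content-Type" "application/json"}
     :body    (json/write-str {:model-version version :loaded loaded?})}))
```

Exposing the version here as well gives operators one place to confirm what is actually deployed.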
A production inference system should usually expose:

- a health or readiness endpoint tied to the loaded-model state
- the model version, both in responses and in a status endpoint
- latency, throughput, and error metrics per model version

Without that, incident response becomes guesswork. If predictions suddenly drift or latency spikes, the team needs to know:

- which model version was serving at the time
- when that model was loaded
- whether the shape or distribution of inputs changed
For a modern Clojure application, container builds should not assume an old lein uberjar workflow by default. A current pattern is:
- deps.edn for dependencies
- tools.build for jar or uberjar creation

```dockerfile
FROM clojure:temurin-21-tools-deps AS build
WORKDIR /app
COPY . .
RUN clojure -T:build uber

FROM eclipse-temurin:21-jre
WORKDIR /app
COPY --from=build /app/target/app.jar /app/app.jar
EXPOSE 3000
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```
The exact image choice depends on your runtime constraints. The important point is that the build pipeline reflects the current Clojure toolchain and produces one explicit artifact.
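The `clojure -T:build uber` step assumes a `build.clj` at the project root. A sketch of one is below; the source paths, output path, and main namespace are assumptions that must match your project:

```clojure
(ns build
  (:require [clojure.tools.build.api :as b]))

(def class-dir "target/classes")
(def uber-file "target/app.jar") ; the path the Dockerfile copies
(def basis (delay (b/create-basis {:project "deps.edn"})))

(defn uber
  "Builds a standalone uberjar: clean, copy sources, AOT-compile, package."
  [_]
  (b/delete {:path "target"})
  (b/copy-dir {:src-dirs ["src" "resources"] :target-dir class-dir})
  (b/compile-clj {:basis @basis :src-dirs ["src"] :class-dir class-dir})
  (b/uber {:class-dir class-dir
           :uber-file uber-file
           :basis @basis
           :main 'myapp.inference})) ; assumed entry namespace
```

Pinning the output to a single known path keeps the Dockerfile's `COPY --from=build` step stable across builds.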
Kubernetes becomes useful when you need:

- more than one replica behind a stable service address
- rolling updates and quick rollback between model versions
- health probes that keep broken instances out of traffic
A simple Deployment and Service model is still a good baseline:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-inference
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ml-inference
  template:
    metadata:
      labels:
        app: ml-inference
    spec:
      containers:
        - name: ml-inference
          image: registry.example.com/ml-inference:2026-03-28
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: ml-inference
spec:
  selector:
    app: ml-inference
  ports:
    - port: 80
      targetPort: 3000
```
This is not interesting because it is Kubernetes. It is interesting because it gives the team explicit rollout, scaling, and traffic primitives.
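Those primitives only help if Kubernetes can tell a healthy instance from a broken one. A sketch of probes for the container spec above, assuming the service exposes a health route (the `/health` path is an assumption):

```yaml
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 30
```

Tying the readiness probe to the loaded-model state means a pod that failed to load its model never receives traffic, and a rolling update halts instead of replacing working pods with broken ones.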
The service should tell you:

- request rate, latency percentiles, and error rate
- which model version produced each prediction
- whether inputs are passing validation

If the model is external, also measure:

- latency and error rate of calls to the model service
- timeout and retry behavior under load
That is what turns a model endpoint into a production service rather than a lab demo.
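A lightweight way to get those numbers without committing to a metrics library is to wrap the predict function. A sketch, where the atom-backed counters stand in for a real metrics backend:

```clojure
(ns myapp.metrics)

;; In-process counters; a production service would export these to
;; Prometheus or another metrics backend instead of an atom.
(defonce stats (atom {:requests 0 :errors 0 :total-ms 0}))

(defn instrumented
  "Wraps an inference fn so every call records count, latency, and errors.
  The wrapped fn is a drop-in replacement for the original."
  [predict-fn]
  (fn [input]
    (let [start (System/nanoTime)]
      (try
        (let [result (predict-fn input)
              elapsed-ms (quot (- (System/nanoTime) start) 1000000)]
          (swap! stats #(-> %
                            (update :requests inc)
                            (update :total-ms + elapsed-ms)))
          result)
        (catch Exception e
          (swap! stats update :errors inc)
          (throw e))))))
```

Average latency is then `(/ total-ms requests)`, and the error counter feeds alerting; the same wrapper works whether the model is embedded or behind HTTP.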
```mermaid
flowchart LR
    A["Client or Upstream Service"] --> B["Inference API"]
    B --> C["Model Version + Feature Validation"]
    C --> D["Prediction"]
    D --> E["Response with Model Metadata"]
    B --> F["Metrics, Traces, and Drift Signals"]
```
The key thing to notice is that the model is only one step in the path. Validation, versioning, and observability are part of the serving system.
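The validation step in the diagram can be made concrete with clojure.spec; the feature keys and ranges below are hypothetical, since a real model would pin its exact feature schema:

```clojure
(ns myapp.validate
  (:require [clojure.spec.alpha :as s]))

;; Hypothetical feature schema for illustration only.
(s/def ::age (s/and number? #(<= 0 % 130)))
(s/def ::country string?)
(s/def ::features (s/keys :req-un [::age ::country]))

(defn check-features
  "Returns nil when input matches the schema, otherwise spec's explanation
  data, which can be logged and returned as a 400 instead of letting a
  malformed input reach the model."
  [input]
  (when-not (s/valid? ::features input)
    (s/explain-data ::features input)))
```

Rejecting malformed inputs before prediction both protects the model and gives the drift signals in the diagram a clean baseline: a rising rejection rate is itself an early warning.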