The Joy of Clojure

}

Of course, this describes implementation details and shouldn’t be considered fact in

future version of Clojure. In fact, as shown before, you were able to add implementa-

tions for the parts of the DynaFrame class at the REPL because Clojure generates a stub

that looks up concrete implementations through Vars. But these details are useful for

describing the logical product of :gen-class and compile. The :gen-class directive

with the argument :name joy.gui.DynaFrame creates a class vaguely resembling the

following Java source:

package joy.gui;

public class DynaFrame extends javax.swing.JFrame {

public final Object state;

Download from Wow! eBook <www.wowebook.com>

Clojure gen-class and GUI programming

215

public DynaFrame(String title) {

Object r = clojure.lang.RT.var("joy.gui.DynaFrame", "df-init")

.invoke(title);

Object cargs = clojure.lang.RT.nth(r, 0);

state = clojure.lang.RT.nth(r, 1);

super((String) clojure.lang.RT.nth(cargs, 0));

}

public static String version() { return "1.0"; }

// Delegate to the display function var

public void display(Object the_this, java.awt.Container c) {

return clojure.lang.RT.var("joy.gui.DynaFrame", "df-display")

.invoke(the_this, c);

}

. . .

}

The :gen-class directive creates a class that’s a delegate for the Vars (prefixed as spec-

ified with df-) located in the corresponding namespace, contains the state, and also

holds any static methods. This is a lot of detail to contend with, but understanding it’s

important when arranging your Clojure projects to take advantage of code compilation.

One final important point when using gen-class is the semantics surrounding the

:impl-ns directive. Our example relies on the fact that the gen-class namespace is

the same as the implementation namespace (the :impl-ns ), meaning that the compi-

lation will transitively compile all of the implementation functions. On the other

hand, when your implementation and gen-class namespaces are distinct, you no lon-

ger suffer transitive compilation. This provides the benefit of allowing a mixture of

compiled (class files) and uncompiled (.clj files) Clojure products.

10.2.2 Exploring user interface design and development with Clojure

Before we begin, we’ll devise a simple model (_why 20073) for exploring user inter-

face design. We don’t have to complicate matters, because the goal is only to get a gen-

eral idea of how Clojure makes a typically painful task like Java GUI development a joy.

To achieve this modest goal, we’ll need some simple containers illustrated in figure

10.4: shelves, stacks, and splitters.

Because DynaFrame requires a java.awt.

shelf

stack

splitter

Container as its displayed element, we’ll make

each container a derivative thereof. This allows

the containers to nest, helping to build richer

GUIs. Finally, their forms should mirror their

graphical layout, within reason. These three

containers are implemented in the following

Figure 10.4 Basic GUI containers: using

only a handful of rudimentary containers,

listing.

we can build neato GUI prototypes.

3 Our GUI model in this section is based loosely on the Ruby framework Shoes created by _why. Thank you sir,

wherever you are.

Download from Wow! eBook <www.wowebook.com>

216

CHAPTER 10 Java.next

Listing 10.4 Simple GUI containers

(ns joy.gui.socks

(:import

(joy.gui DynaFrame)

(javax.swing Box BoxLayout JTextField JPanel

JSplitPane JLabel JButton

JOptionPane)

(java.awt BorderLayout Component GridLayout FlowLayout)

(java.awt.event ActionListener)))

(defn shelf [& components]

(let [shelf (JPanel.)]

(.setLayout shelf (FlowLayout.))

(doseq [c components] (.add shelf c))

shelf))

(defn stack [& components]

(let [stack (Box. BoxLayout/PAGE_AXIS)]

(doseq [c components]

(.setAlignmentX c Component/CENTER_ALIGNMENT)

(.add stack c))

stack))

(defn splitter [top bottom]

(doto (JSplitPane.)

(.setOrientation JSplitPane/VERTICAL_SPLIT)

(.setLeftComponent top)

(.setRightComponent bottom)))

These simple GUI elements are built on top of the Java Swing library, where each sub-

widget in the components argument is added to the properly configured Container-

derived parent. These are good as a starting point, but still there’s nothing to display

unless we dive into the Swing API directly. We can do one better than that by providing

a simple base set of widgets: buttons, labels, and text boxes.

Listing 10.5 A set of simple widgets

(defn button [text f]

(doto (JButton. text)

(.addActionListener

(proxy [ActionListener] []

(actionPerformed [_] (f))))))

(defn txt [cols t]

(doto (JTextField.)

(.setColumns cols)

(.setText t)))

(defn label [txt] (JLabel. txt))

The button element takes a function executed on a mouse-click, so we’ll now provide

a JavaScript-like alert function as a simple action:

(defn alert

([msg] (alert nil msg))

([frame msg]

(javax.swing.JOptionPane/showMessageDialog frame msg)))

Download from Wow! eBook <www.wowebook.com>

Clojure gen-class and GUI programming

217

Figure 10.5 DynaFrame alerts: we

Figure 10.6 A much more elaborate DynaFrame GUI:

can create slightly more complex

there’s no limit to the complexity of this simple GUI

GUIs and attach actions on the fly.

model. Go ahead and experiment to your heart’s content.

Having built all of these GUI elements, we’ll describe the first simple GUI as shown in

figure 10.5.

It seems simple, if not pointless. But you might be pleasantly surprised with the con-

cise code used to describe it:

(.display gui

(splitter

(button "Procrastinate" #(alert "Eat Cheetos"))genclass

(button "Move It" #(alert "Couch to 5k"))))

These widgets are adequate enough to create richer user interfaces, and to illustrate

we’ll add one more widget builder for grid-like elements:

(defn grid [x y f]

(let [g (doto (JPanel.)

(.setLayout (GridLayout. x y)))]

(dotimes [i x]

(dotimes [j y]

(.add g (f))))

g))

With a small amount of code, we can build the richer user interface in figure 10.6.

Listing 10.6 A more complex GUI example

(.display gui

(let [g1 (txt 10 "Charlemagne")

g2 (txt 10 "Pippin")

r (txt 3 "10")

d (txt 3 "5")]

(splitter

(stack

(shelf (label "Player 1") g1)

(shelf (label "Player 2") g2)

(shelf (label "Rounds ") r

(label "Delay ") d))

Download from Wow! eBook <www.wowebook.com>

218

CHAPTER 10 Java.next

(stack

(grid 21 11 #(label "-"))

(button "Go!" #(alert (str (.getText g1) " vs. "

(.getText g2) " for "

(.getText r) " rounds, every "

(.getText d) " seconds.")))))))

Though not perfect, it gives you a good idea how to extend these functions to provide

a finer level of control over layout and positioning, as well as ways to provide more

functionality to create richer interfaces. How would you go about creating an agile

environment for incremental GUI development using plain Java? Clojure allows you to

start with a powerful set of primitives and incrementally refine them until they suit

your exact needs.

Though this section started as a description of creating a simple dynamic frame

using the gen-class facility, we felt it was worthwhile to expand into the realm of

dynamic, incremental development. There are times when AOT compilation is abso-

lutely necessary (such as client requirements), but our advice is to avoid it if at all pos-

sible. Instead, leverage the dynamic nature of Clojure to its fullest, designing your

system to fit into that model.

10.3 Clojure’s relationship to Java arrays

In general, the need to delve into arrays should be limited, but such casual dismissal

isn’t always apropos. In this section, we’ll cover some of the uses for Java arrays in Clo-

jure, including but not limited to arrays as multimethod dispatch, primitive versus ref-

erence arrays, calling variadic functions and constructors, and multi-dimensional

arrays.

10.3.1 Types of arrays: primitive and reference

As mentioned in section 4.1, Clojure numbers are of the boxed variety, but in many

cases the Clojure compiler can resolve the correct call for primitive interoperability

calls. But it can never resolve the need to pass a primitive array when a reference array

is provided instead.

CREATING PRIMITIVE ARRAYS

The Java class java.lang.StringBuilder provides4 a method .append(char[]) that

appends the primitive chars in the passed array to its end. But our first instinct for

making this happen in Clojure won’t bear fruit:

(doto (StringBuilder. "abc")

(.append (into-array [\x \y \z])))

;=> #<StringBuilder abc[Ljava.lang.Character;@65efb4be>

The problem lies in that Clojure’s into-array function doesn’t return a primitive

array of char[], but instead a reference array of Character[], forcing the Clojure

4 When dealing with and manipulating strings, your best options can almost always be found in the core

clojure.string namespace or the clojure.contrib.string namespace in the Clojure contrib library.

Download from Wow! eBook <www.wowebook.com>

Clojure’s relationship to Java arrays

219

compiler to resolve the call as to the StringBuilder.append(Object) method

instead. That the Array class is a subclass of Object is a constant cause for headache in

Java and clearly can be a problem5 for Clojure as well. What we really want to do is

ensure that a primitive array is used as the argument to .append, which we do here:

(doto (StringBuilder. "abc")

(.append (char-array [\x \y \z])))

;=> #<StringBuilder abcxyz>

Clojure provides a number of primitive array-building functions that work similarly to

char-array, as summarized in the following list.

 boolean-array

 double-array

 long-array

 byte-array

 float-array

 object-array

 char-array

 int-array

 short-array

You could also use the make-array and into-array functions to create primitive

arrays:

(let [ary (make-array Integer/TYPE 3 3)]

(dotimes [i 3]

(dotimes [j 3]

(aset ary i j (+ i j))))

(map seq ary))

;=> ((0 1 2) (1 2 3) (2 3 4))

(into-array Integer/TYPE [1 2 3])

;=> #<int[] [I@391be9d4>

Populating arrays can often be an iterative affair, as seen in the previous snippet, but

there are often more concise ways to do so when creating reference arrays.

CREATING REFERENCE ARRAYS

To intentionally create an array of a particular reference type, or of compatible types,

use the into-array function, passing in a sequence of objects:

(into-array ["a" "b" "c"])

;=> #<String[] [Ljava.lang.String;@3c3ac93e>

(into-array [(java.util.Date.) (java.sql.Time. 0)])

;=> #<Date[] [Ljava.util.Date;@178aab40>

(into-array ["a" "b" 1M])

; java.lang.IllegalArgumentException: array element type mismatch

(into-array Number [1 2.0 3M 4/5])

;=> #<Number[] [Ljava.lang.Number;@140b6e46>

The function into-array determines the type of the resulting array based on the first

element of the sequence, and each subsequent element type must be compatible (a

In this example, it’s preferred that a “java.lang.IllegalArgumentException: No matching method found”

exception be thrown, because StringBuilder doesn’t have a method matching .append(Character[]) or

even .append(Object[]).

Download from Wow! eBook <www.wowebook.com>

220

CHAPTER 10 Java.next

subclass). To create a heterogeneous array of java.lang.Object, use the to-array or

to-array-2d function:

(to-array-2d [[1 2 3]

[4 5 6]])

;=> #<Object[][] [[Ljava.lang.Object;@bdccedd>

(to-array ["a" 1M #(%) (proxy [Object] [])])

;=> #<Object[] [Ljava.lang.Object;@18987a33>

(to-array [1 (int 2)])

;=> #<Object[] [Ljava.lang.Object;@6ad3c65d>

Be wary: primitives will be autoboxed when using either to-array or to-array-2d.

10.3.2 Array mutability

Because JVM arrays are mutable, you need to be aware that their contents can change

at any point. For example:

(def ary (into-array [1 2 3]))

(def sary (seq ary))

sary

;=> (1 2 3)

What happens to sary if we change the contents of ary?

(aset ary 0 42)

sary

;=> (42 2 3)

The seq view of an array is that of the live array and therefore subject to concurrent

modification. Be cautious when sharing arrays from one function to the next, and

especially across threads. Note that this can be especially disastrous should an array

change in the middle of a sequential operation, such as the use of the higher-order

array functions amap and areduce, as might be used to define a sum-of-squares func-

tion6 for arrays:

(defn asum-sq [xs]

(let [dbl (amap xs i ret

(* (aget xs i)

(aget xs i)))]

(areduce dbl i ret 0

(+ ret (aget dbl i)))))

(asum-sq (float-array [1 2 3 4 5]))

;=> 55.0

At any point during the processing of asum-sq, the underlying array could change,

causing inaccurate results or worse. You should take great care when using Java’s

mutable arrays, though sharing only the seq of an array is perfectly safe because

there’s no way to get at the array when you only have a reference to the seq.

6 This function is fairly clear but slower than it should be. We’ll make it faster in sections 12.1 and 12.5.

Download from Wow! eBook <www.wowebook.com>

Clojure’s relationship to Java arrays

221

10.3.3 That unfortunate naming convention

You might’ve noticed (how could you miss?) the ugly names printed by the Clojure

REPL whenever an array is evaluated. There’s logic to this madness, as part of the jum-

ble is the legal name of the class corresponding to the array—the part formed as

[Ljava.lang.String;. For example, the previous name corresponded to a 1D array

of strings. The representation for a 2D array of strings is then [[Ljava.lang.String;,

and it therefore follows that [[[Ljava.lang.String; is a 3D array of strings. Are you

sensing a pattern here? Table 10.1 lays it out.

Using what you know about arrays, the class representation names can be used to

do things such as multimethod dispatch:

(what-is (into-array ["a" "b"]))

;=> "1d String"

(what-is (to-array-2d [[1 2][3 4]]))

;=> "2d Object"

(what-is (make-array Integer/TYPE 2 2 2 2))

;=> "Primitive 4d int"

You can create methods for identifying arrays and returning a descriptive string using

the <indexterm><primary>java.lang.Class/forName</primary></indexterm>Class/

forName method as shown:

(defmulti what-is class)

(defmethod what-is (Class/forName "[Ljava.lang.String;") [a] "1d String")

(defmethod what-is (Class/forName "[[Ljava.lang.Object;") [a] "2d Object")

(defmethod what-is (Class/forName "[[[[I") [a] "Primitive 4d int")

Though not the most beautiful task to perform in Clojure, it’s easy to understand

once you’ve grasped how the array class names are constructed.

Representation

Array type

[Ljava.lang.Object;

Reference array

Primitive byte array

Primitive int array

Primitive char array

Primitive short array

Primitive float array

Primitive double array

Primitive long array

Primitive boolean array

Representation

Dimension

[

[[

Table 10.1 Array type class

...

and so on...

names and dimensions

Download from Wow! eBook <www.wowebook.com>

222

CHAPTER 10 Java.next

10.3.4 Multidimensional arrays

Observe what happens when the following call is tried:

(what-is (into-array [[1.0] [2.0]]))

; java.lang.IllegalArgumentException: No method in multimethod

; 'what-is' for dispatch value: class [Lclojure.lang.PersistentVector;

The problem is that the into-array function builds a 1D array of persistent vectors,

but we wanted a 2D array of doubles. In order to do this, the array would have to be

built differently:

(defmethod what-is (Class/forName "[[D") [a] "Primitive 2d double")

(defmethod what-is (Class/forName "[Lclojure.lang.PersistentVector;") [a]

"1d Persistent Vector")

(what-is (into-array (map double-array [[1.0] [2.0]])))

;=> "Primitive 2d double"

(what-is (into-array [[1.0] [2.0]]))

;=> "1d Persistent Vector"

We had to use the map function with double-array on the inner arrays in order to

build the properly typed outer array. When working with multidimensional arrays, be

sure that you know what your inner elements should be on creation and create them

accordingly.

10.3.5 Variadic method/constructor calls

There’s no such thing as a variadic constructor or method at the bytecode level,

although Java provides syntactic sugar at the language level. Instead, variadic methods

expect an array as their final argument, and this is how they should be accessed in Clo-

jure interop scenarios. Take, for example, the call to the String/format function:

(String/format "An int %d and a String %s" (to-array [99, "luftballons"]))

;=> "An int 99 and a String luftballons"

That covers most of the high points regarding arrays in Clojure interoperability. We’ll

touch on them briefly when we talk about performance considerations in chapter 12,

but for now we’ll move on to a more interesting topic: the interoperability underpin-

nings relating to Clojure’s implementation.

10.4

All Clojure functions implement...

Clojure functions are highly amenable to interoperability. Their underlying classes

implement a number of useful interfaces that you can investigate by running

(ancestors (class #())). Most of the resulting classes are only applicable to the

internals of Clojure itself, but a few interfaces are useful in interop scenarios:

java.util.concurrent.Callable, java.util.Comparator, and java.lang.Runnable.

In this section, we’ll talk briefly about each and also provide simple examples.

Download from Wow! eBook <www.wowebook.com>

All Clojure functions implement...

223

10.4.1 java.util.Comparator

Simply put, the java.util.Comparator interface defines the signature for a single

method .compare that takes two objects l and r and returns -1 if l < r, 0 if l == r, and

> 0 if l > r. The static Java method Collections/sort provides an implementation

that takes a derivative of java.util.List and a Comparator and destructively sorts the

list provided. Using this knowledge, we can provide some basic infrastructure for the

remainder of this subsection:

(import '[java.util Comparator Collections ArrayList])

(defn gimme [] (ArrayList. [1 3 4 8 2]))

(doto (gimme)

(Collections/sort (Collections/reverseOrder)))

;=> #<ArrayList [8, 4, 3, 2, 1]>

In order to write a simple comparator that provides a reverse-sort Comparator, we

might naively do so:

(doto (gimme)

(Collections/sort

(reify Comparator

(compare [this l r]

(cond

(> l r) -1

(= l r) 0

:else 1)))))

;=> #<ArrayList [8, 4, 3, 2, 1]>

Though this works, Clojure provides a better way by allowing the use of a function as

the Comparator directly. You can couple this knowledge with the fact that Clojure

already provides numerous functions useful for comparison, as shown next.

Listing 10.7 Useful comparison functions

(doto (gimme) (Collections/sort #(compare %2 %1)))

compare function

;=> #<ArrayList [8, 4, 3, 2, 1]>

(doto (gimme) (Collections/sort >))

Less-than function

;=> #<ArrayList [8, 4, 3, 2, 1]>

(doto (gimme) (Collections/sort <))

Greater-than function

;=> #<ArrayList [1, 2, 3, 4, 8]>

(doto (gimme) (Collections/sort (complement <)))

complement function

;=> #<ArrayList [8, 4, 3, 2, 1]>

When presented with numerous possible implementation strategies, often the best

one in Clojure is the simplest.

10.4.2 java.lang.Runnable

Java threads expect an object implementing the java.lang.Runnable interface meant

for computations returning no value. We won’t get into the specifics of threaded

Download from Wow! eBook <www.wowebook.com>

224

CHAPTER 10 Java.next

computation until the next chapter, but the next two examples are simple enough to

require little a priori knowledge on the matter. If you wish to pass a function to

another Java thread, it’s as simple as providing it as an argument to the Thread

constructor:

(doto (Thread. #(do (Thread/sleep 5000)

(println "haikeeba!")))

.start)

; => #<Thread Thread[Thread-3,5,main]>

; ... 5 seconds later

; haikeeba!

This scenario is unlikely to occur often, because Clojure’s core concurrency features

are often sufficient for most needs. But that’s not always the case, and therefore it’s nice

to know that raw Clojure functions can be used seamlessly in the JVM’s concurrency API.

10.4.3 java.util.concurrent.Callable

The Java interface java.util.concurrent.Callable is specifically meant to be used

in a threaded context for computations returning a value. You can use a Clojure func-

tion using Java’s java.util.concurrentFutureTask class representing a “computa-

tion to occur later”:

(import '[java.util.concurrent FutureTask])

(let [f (FutureTask. #(do (Thread/sleep 5000) 42))]

(.start (Thread. #(.run f)))

(.get f))

; ... 5 seconds later

;=> 42

The call to FutureTask.get as the last expression will stop execution (a behavior

known as blocking) until the function passed to the constructor completes. Because the

function in question sleeps for 5 seconds, the call to .get must wait.

Clojure’s interoperability mechanisms are a two-way street. Not only do they allow

Java APIs to work seamlessly within Clojure, but they also provide ways for Clojure

functions to work in Java APIs. In the next section, we’ll continue on this theme of

bidirectional interop with a discussion on the ways that Clojure’s collection types can

also be used in traditional Java APIs.

10.5

Using Clojure data structures in Java APIs

Clojure functions are ready to use in many Java APIs, and as it turns out, so are its col-

lection types. Just as the Clojure collections are separated along three distinct equality

partitions7 (maps, sequences, and sets), so too are its levels of Java collection interop-

erability support. The Java Collections Framework has a nice high-level design philos-

ophy centered around working against interfaces. These interfaces are additionally

cognizant of immutability, in that the mutable parts are optional and the immutable

7 A refresher on equality partitions can be found in section 5.1.2 and throughout the remainder of chapter 5.

Download from Wow! eBook <www.wowebook.com>

Using Clojure data structures in Java APIs

225

parts are clearly demarcated. In this section, we’ll give a brief rundown of possible

ways that Clojure collections can be used within traditional Java APIs adhering to the

immutable collection protocols.

10.5.1 java.util.List

Clojure sequential collections conform to the immutable parts of the java.util.List

interface, which in turn extends the java.util.Collection and java.lang.Iterable

interfaces. You can see this conformance in action in the following listing.

Listing 10.8 java.util.List conformance for sequences and seqs

(.get '[a b c] 1)

Vectors

;=> b

(.get (repeat :a) 138)

Lazy seqs

;=> :a

(.containsAll '[a b c] '[b c])

Vectors also collections

;=> true

(.add '[a b c] 'd)

Sequences not mutable

; java.lang.UnsupportedOperationException

That Clojure sequences and seqs don’t provide the mutable API of typical Java collec-

tions is obvious. But the implications are that you can’t use them in all Java APIs, such

as you might attempt when requiring that a vector be sorted destructively with a Java

API call:

(java.util.Collections/sort [3 4 2 1])

; java.lang.UnsupportedOperationException

A better approach is to either use the method used in the previous section using a Clo-

jure function, or even better to use the Clojure’s sort function instead.

10.5.2 java.lang.Comparable

The interface java.lang.Comparable is the cousin of the Comparator interface.

Comparator refers to objects that can compare two other objects, whereas Comparable

refers to an object that can compare itself to another object:

(.compareTo [:a] [:a])

;=> 0

(.compareTo [:a :b] [:a])

;=> 1

(.compareTo [:a :b] [:a :b :c])

;=> -1

(sort [[:a :b :c] [:a] [:a :b]])

;=> ([:a] [:a :b] [:a :b :c])

One thing to note is that Clojure’s vector implementation is currently the only collec-

tion type that implements the java.lang.Comparable interface providing the

Download from Wow! eBook <www.wowebook.com>

226

CHAPTER 10 Java.next

.compareTo method. As a result, attempting to compare a different collection type to

a vector leads to a confusing error message:

(.compareTo [1 2 3] '(1 2 3))

; java.lang.ClassCastException: clojure.lang.PersistentList

; cannot be cast to clojure.lang.IPersistentVector

Pay no attention to that class-cast exception behind the curtain.

10.5.3 java.util.RandomAccess

In general, the java.util.RandomAccess interface is used to indicate that the data

type provides constant time indexed access to its elements. This allows for algorithms

to follow optimized paths accordingly. This optimization is generally performed by

using the .get method for access rather than an iterator:

(.get '[a b c] 2)

;=> c

Vectors are currently the only Clojure collection type that can make such guarantees.

10.5.4 java.util.Collection

The java.util.Collection interface lies at the heart of the Java Collections Frame-

work, and classes implementing it can play in many of Java’s core collections APIs. A

useful idiom taking advantage of this fact is the use of a Clojure sequence as a model

to build a mutable sequence for use in the Java Collections API, as shown:

(defn shuffle [coll]

(seq (doto (java.util.ArrayList. coll)

java.util.Collections/shuffle)))

(shuffle (range 10))

;=> (3 9 2 5 4 7 8 6 1 0)

It’s difficult to write a proper sequence-shuffling function, so the shuffle function

takes full advantage of an existing Java API that has been tested and used extensively

for years. As an added bonus, shuffle is mostly8 functional, idiomatic, and fast. Clo-

jure favors immutability but doesn’t trap you into it when there are practical solutions

to be leveraged.

JAVA.UTIL.MAP

Like most of the Clojure collections, its maps are analogous to Java maps in that they

can be used in nonmutating contexts. But immutable maps have the added advantage

of never requiring defensive copies and will act exactly the same as unmodifiable Java

maps:

(java.util.Collections/unmodifiableMap

(doto (java.util.HashMap.) (.put :a 1)))

;=> #<UnmodifiableMap {:a=1}>

8 shuffle isn’t referentially transparent. Can you see why?

Download from Wow! eBook <www.wowebook.com>

definterface

227

(into {} (doto (java.util.HashMap.) (.put :a 1)))

;=> {:a 1}

In both cases, any attempt to modify the map entry classes of the maps will throw an

exception.

10.5.5 java.util.Set

In the case of Java and Clojure sets, the use of mutable objects9 as elements is highly

frowned upon:

(def x (java.awt.Point. 0 0))

(def y (java.awt.Point. 0 42))

(def points #{x y})

points

;=> #{#<Point java.awt.Point[x=0,y=0]> #<Point java.awt.Point[x=0,y=42]>}

Everything looks peachy at this point, but introducing mutability into the equation

has devastating costs:

(.setLocation y 0 0)

points

;=> #{#<Point java.awt.Point[x=0,y=0]> #<Point java.awt.Point[x=0,y=0]>}

Oh boy. Not only have we confused the set points by modifying its entries out from

underneath it, but we’ve also circumvented Clojure’s value-based semantics and the

nature of set-ness. Dealing with mutable objects is extremely difficult to reason about,

especially when dealing with collections of them. The gates of a mutable class are wide

open, and at any point during the execution of your programs this fact can be

exploited, willingly or not. But you can’t always avoid dealing with mutable nasties in

Clojure code because of a strict adherence to fostering interoperability.

We’ve covered the two-way interop for functions and now collection types, but we

have one final path to traverse: the use and benefits of Clojure’s definterface macro.

10.6

definterface

As we mentioned in section 9.3, Clojure was built on abstractions in the host platform

Java. Types and protocols help to provide a foundation for defining your own abstrac-

tions in Clojure itself, for use within a Clojure context. But when interoperating with

Java code, protocols and types won’t always suffice. Therefore, you need to be able to

generate interfaces in some interop scenarios, and also for performance in cases

involving primitive argument and return types. In this section, we’ll talk briefly about

generating Java interfaces as the syntax, use cases, and purposes are likely familiar.

10.6.1 Generating interfaces on the fly

When you AOT-compile a protocol, you generate a public interface by the same name,

with the methods defined. The code in listing 10.9 uses definterface to define an

9 Clojure’s mutable reference types used to represent a logical identity are perfectly safe to use in sets. We’ll

explore the reference types in exquisite detail in the next chapter.

Download from Wow! eBook <www.wowebook.com>

228

CHAPTER 10 Java.next

interface ISliceable. This interface is used to define an abstract thing that has the

ability to be sliced using a method slice, which takes start and end indices of type int.

Likewise, the interface defines a method sliceCount that returns an int representing

the number of possible slices.

Listing 10.9 An interface defining a sliceable object

(definterface ISliceable

(slice [^int s ^int e])

(^int sliceCount []))

;=> user.ISliceable

You’ll notice the inclusion of the type decoration ^int on the arguments to slice and

the return type of sliceCount. For now you can assume that they operate the same as

a type declaration in most languages providing them. They look similar to type hints

discussed in section 12.1, except that only in definterface are primitive hints sup-

ported. Now we can create an instance implementing the user.ISliceable interface,

as shown next.

Listing 10.10 A dummy reified ISliceable

(def dumb

(reify user.ISliceable

(slice [_ s e] [:empty])

(sliceCount [_] 42)))

(.slice dumb 1 2)

;=> [:empty]

(.sliceCount dumb)

;=> 42

There’s nothing terribly surprising about dumb, but you can instead implement it via

deftype, proxy, gen-class, or even a Java class. Note that definterface works even

without AOT compilation.

We can now take definterface to the next logical step and extend the ISliceable

interface to other types using a well-placed protocol.

Listing 10.11 Using a protocol to extend ISliceable

(defprotocol Sliceable

(slice [this s e])

(sliceCount [this]))

(extend user.ISliceable

Sliceable

{:slice (fn [this s e] (.slice this s e))

:sliceCount (fn [this] (.sliceCount this))})

(sliceCount dumb)

;=> 42

(slice dumb 0 0)

;=> [:empty]

Download from Wow! eBook <www.wowebook.com>

Be wary of exceptions

229

By extending the ISliceable interface along Sliceable, ISliceable is able to partic-

ipate in the protocol, meaning that you have the possibility for extending other types,

even final types such as String, as shown next.

Listing 10.12 Extending strings along the Sliceable protocol

(defn calc-slice-count [thing]

"Calculates the number of possible slices using the formula:

(n + r - 1)!

------------

r!(n - 1)!

where n is (count thing) and r is 2"

(let [! #(reduce * (take % (iterate inc 1)))

n (count thing)]

(/ (! (- (+ n 2) 1))

(* (! 2) (! (- n 1))))))

(extend-type String

Sliceable

(slice [this s e] (.substring this s (inc e)))

(sliceCount [this] (calc-slice-count this)))

(slice "abc" 0 1)

;=> "ab"

(sliceCount "abc")

;=> 6

The advantages of using definterface over defprotocol are restricted entirely to the

fact that the former allows primitive types for arguments and returns. At some point in

the future, the same advantages will likely be extended to the interfaces generated, so

use definterface sparingly and prefer protocols unless absolutely necessary.

10.7

Be wary of exceptions

There’s been much debate on the virtues of checked exceptions in Java, so we won’t

cover that here. Instead, we’ll stick to the facts regarding the nuances the JVM imposes

on Clojure’s error-handling facilities. Before we begin, consider the following view on

the use of exceptions in Clojure source:

When writing Clojure code, use errors to mean can’t continue and exceptions to mean

can or might continue .

We’ll attempt to constrain ourselves to the generalities of exception handling in this

section. If you desire information on deciphering exception messages, we talked about

that in section 3.4. If you’re curious about the effects of exceptions on continuation-

passing style, then refer back to section 7.3.4. We discussed the behavior of Clojure to

attempt to supplant numerical inaccuracies by throwing exceptions in section 4.1.3. If

you instead want to learn about the interplay between exceptions and Clojure’s refer-

ence types, then such matters can be found throughout chapter 11. Finally, if you have

no idea what an exception is, then we discuss the basics in section 1.5.8.

Download from Wow! eBook <www.wowebook.com>

230

CHAPTER 10 Java.next

10.7.1 A bit of background regarding exceptions

The behavior of Clojure’s exception features directly spawns from the JVM enforcing

the promulgation of checked exceptions. Virtuous or not in the context of Java devel-

opment, checked exceptions are antithetical to closures and higher-order functions.

Checked exceptions require that not only should the thrower and the party responsi-

ble for handling them declare interest, but every intermediary is also forced to partic-

ipate. These intermediaries don’t have to actively throw or handle exceptions

occurring within, but they must declare that they’ll be “passing through.” Therefore,

by including the call to a Java method throwing a checked exception within a closure,

Clojure has two possible alternatives:

 Provide a cumbersome exception declaration mechanism on every single func-

tion, including closures.

 By default, declare that all functions throw the root Exception or Runtime-

Exception.

And as you can probably guess, Clojure takes the second approach, which leads to a

condition of multilevel wrapping of exceptions as they pass back up the call stack. This

is why you see, in almost any (.printStackTrace *e) invocation, the point of origin of

an error offset by some number of layers of java.lang.RuntimeException. Because

Java interfaces and classes get to decide what types of problems potential derivative

classes and even callers can have, Clojure needs to handle the base

java.lang.Exception at every level, because it has to preserve dynamism in the face

of a closed system. Unless you’re directly calling something that throws typed excep-

tions, your best bet is to catch Exception and then see what you have in context.

10.7.2 Runtime versus compile-time exceptions

There are two contexts in Clojure where exceptions can be thrown: runtime and com-

pile time. In this section we’ll touch on both, explaining how and when to use them.

RUNTIME EXCEPTIONS

The case of runtime exceptions might be the most familiar, because it’s likely to have

been encountered and utilized in your own code. There are two types of runtime

exceptions: errors and exceptions. We can illustrate the difference between the two by

showing you the following:

(defn explode [] (explode))

(try (explode) (catch Exception e "Stack is blown"))

; java.lang.StackOverflowError

So why were we unable to catch the java.lang.StackOverflowError? The reason lies

in Java’s exception class hierarchy and the fact that StackOverflowError isn’t a deriv-

ative of the Exception class, but instead of the Error class:

(try (explode) (catch StackOverflowError e "Stack is blown"))

;=> "Stack is blown"

Download from Wow! eBook <www.wowebook.com>

Be wary of exceptions

231

(try (explode) (catch Error e "Stack is blown"))

;=> "Stack is blown"

(try (explode) (catch Throwable e "Stack is blown"))

;=> "Stack is blown"

(try (throw (RuntimeException.))

(catch Throwable e "Catching Throwable is Bad"))

;=> "Catching Throwable is Bad"

We started with a catch of the most specific exception type StackOverflowError and

gradually decreased specificity until catching Throwable, which as you’ll notice also

catches a RuntimeException. In Java, catching exceptions at the level of Throwable is

considered bad form, and it should generally be viewed the same in Clojure. There-

fore, we suggest that you follow the advice stated in the opening to this section and

reserve those deriving from Errors for conditions that can’t be continued from and

those from Exception indicating possible continuation.

COMPILE-TIME EXCEPTIONS

There are a few ways that you might come across compile-time exceptions, the most

obvious occurring within the body of a macro:

(defmacro do-something [x] `(~x))

(do-something 1)

; java.lang.ClassCastException:

; java.lang.Integer cannot be cast to clojure.lang.IFn

Though the type of the exception is a java.lang.ClassCastException, it was indeed

thrown by the compiler, which you’d see if you were to trace the stack using some-

thing like (for [e (.getStackTrace *e)] (.getClassName e)).10 It’s perfectly accept-

able (and even encouraged) to throw exceptions within your own macros, but it’s

important to make a distinction between a compile-time and runtime exception.

COMPILE-TIME EXCEPTIONS Why delay until runtime the reporting of an error

that at compile time you know exists?

The way to throw a compile-time exception is to make sure your throw doesn’t occur

within a syntax-quoted form, as we show in the following listing.

Listing 10.13 Throwing a compile-time exception

(defmacro pairs [& args]

(if (even? (count args))

`(partition 2 '~args)

(throw (Exception. (str "pairs requires an even number of args")))))

(pairs 1 2 3)

; java.lang.Exception: pairs requires an even number of args

(pairs 1 2 3 4)

;=> ((1 2) (3 4))

10 This is a limited analogy to Groovy’s .? operator. Clojure also provides convenience functions for displaying

and handling stack traces in the clojure.stacktrace namespace.

Download from Wow! eBook <www.wowebook.com>

232

CHAPTER 10 Java.next

Nothing is preventing the exception from being thrown at runtime, but because we

know that pairs requires an even number of arguments, we instead prefer to fail as

early as possible—at compilation time. This difference is clearly demonstrated by

repeating the preceding test in a function definition:

(fn [] (pairs 1 2 3))

; java.lang.Exception: pairs requires an even number of args

A runtime exception wouldn’t have been thrown until this function was called, but

because the pairs macro threw an exception at compile time, users are notified of

their error immediately. Though powerful, you should always try to balance the bene-

fits of compile-time error checking with macros and the advantages that implement-

ing as a function provides (the use in higher-order functions, apply, and so on).

10.7.3 Handling exceptions

There are two ways to handle exceptions and errors, each defined by the way in which

the error-handling mechanisms “flow” through the source. Imagine that you want a

macro that provides a limited11 null-safe (Koenig 2007) arrow that catches any occur-

rence of a NullPointerException in a pipeline:

(defmacro -?> [& forms]

exceptions

`(try (-> ~@forms)

(catch NullPointerException _# nil)))

(try

(-?> 25 Math/sqrt (+ 100))

(Math/sqrt 25)

;=> 105.0

100)

(catch NPE e nil))

(-?> 25 Math/sqrt (and nil) (+ 100))

;=> nil

binding

The flow of any occurrence of NullPointer-

Exception happens from the inner functions of the

(binding [handle prn]

stitched forms. Conceptually, this flow can be viewed

(try

as in figure 10.7, which describes the way that errors

(Math/sqrt 25)

can be caught depending on the direction in which

100)

(catch NPE e

data is moving along the stack.

(handle e))))

The typical (try ... (catch ...)) form would

therefore be used for the case where the handler

Figure 10.7 Outside-in and inside-

out error handling. There are two

catches errors bubbling outward from inner func-

ways to handle errors in Clojure. The

tions and forms, as seen in the -?> macro. But if you

typical way is to let exceptions flow

want to catch errors at their point of origin, you’ll

from the inner forms to the outer.

need a way to pass handlers up the stack. Fortunately,

The other way, discussed in section

13.4, uses dynamic bindings to

Clojure provides a way to do this via its dynamic Var

“reach into” the inner forms to

feature, which will be discussed in section 13.5.

handle them immediately.

11 There are much more comprehensive -?> and .?. macros found in the clojure.contrib.core

namespace, and those are recommended above the one in this section.

Download from Wow! eBook <www.wowebook.com>

Summary

233

10.7.4 Custom exceptions

If you’re inclined to write your own exception and error types, then you’ll need to do

so using the gen-class feature described in section 10.2. JVM exceptions again are a

closed system, and it might be better to explore other possibilities (Houser EK) for

reporting and handling errors in your Clojure code. But, should you wish to ignore

this advice, then bear in mind that it’s rare for Clojure core functions to throw excep-

tions, and even more rarely are they checked exceptions. The idiom is for Clojure to

throw derivatives of RuntimeException or Error, and thus your code should also strive

for this when appropriate.

10.8

Summary

Clojure provides an extensive set of data abstractions via its types and protocols. It

also provides an extensive interoperability facility through proxy, gen-class,

definterface, exception handling, and the implementation of core Java collection

interfaces. Though we stress that types and protocols will give you the performant

abstractions needed for solving most problems, we realize that not all interop scenar-

ios are solved this way. In these circumstances, you should use the features listed in

this chapter to push you the remainder of the way toward your solution. Clojure

embraces Java interoperability, but it does so in specific ways, and with a specific set

of tools.

In the next chapter, we move on to a rather complex topic, and one that Clojure

helps to simplify—shared state concurrency and mutation.

Download from Wow! eBook <www.wowebook.com>

Mutation

This chapter covers

 Software transactional memory with multiversion

concurrency control and snapshot isolation

 When to use Refs

 When to use Agents

 When to use Atoms

 When to use locks

 When to use futures

 When to use promises

 Parallelism

 Vars and dynamic binding

Clojure’s main tenet isn’t the facilitation of concurrency. Instead, Clojure at its core

is concerned with the sane management of state, and facilitating concurrent pro-

gramming naturally falls out of that. The JVM operates on a shared-state concur-

rency model built around juggling fine-grained locks that protect access to shared

data. Even if you can keep all of your locks in order, rarely does such a strategy scale

well, and even less frequently does it foster reusability. But Clojure’s state manage-

ment is simpler to reason about and promotes reusability.

234

Download from Wow! eBook <www.wowebook.com>

Software transactional memory

235

CLOJURE APHORISM A tangled web of mutation means that any change to

your code potentially occurs in the large.

In this chapter, we’ll take the grand tour of

the mutation primitives and see how Clojure

Concurrency 

makes concurrent programming not only

vs. parallelism

possible, but fun. Our journey will take us

Concurrency refers to the execu-

through Clojure’s four major mutable refer-

tion of disparate tasks at roughly

ences: Refs, Agents, Atoms, and Vars. When

the same time, each sharing a

common resource. The results of

possible and appropriate, we’ll also point out

concurrent tasks often affect the

the Java facilities for concurrent program-

behavior of other concurrent

ming (including locking) and provide infor-

tasks, and therefore contain an

mation on the trade-offs involved in

element of nondeterminism. Paral-

choosing them. We’ll also explore parallel-

lelism refers to partitioning a task

ism support in Clojure using futures, prom-

into multiple parts, each run at the

same time. Typically, parallel

ises, and a trio of functions pmap, pvalues,

tasks work toward an aggregate

and pcalls.

goal and the result of one doesn’t

Before we dive into the details of Clo-

affect the behavior of any other

jure’s reference types, let’s start with a high-

parallel task, thus maintaining

level overview of Clojure’s software transac-

determinacy.

tional memory ( STM ).

11.1 Software transactional memory with multiversion

concurrency control and snapshot isolation

Software transactional memory

A faster program that doesn’t work right is useless.

—Simon Peyton-Jones

in “Beautiful Concurrency”

In chapter 1, we defined three important terms:

 Time —The relative moments when events occur

 State —A snapshot of an entity’s properties at a moment in time

 Identity —The logical entity identified by a common stream of states occurring

over time

These terms form the foundation for Clojure’s model of state management and muta-

tion. In Clojure’s model, a program must accommodate the fact that when dealing

with identities, it’s receiving a snapshot of its properties at a moment in time, not nec-

essarily the most recent. Therefore, all decisions must be made in a continuum. This

model is a natural one, as humans and animals alike make all decisions based on their

current knowledge of an ever-shifting world. Clojure provides the tools for dealing

with identity semantics via its Ref reference type, the change semantics of which are

governed by Clojure’s software transactional memory; this ensures state consistency

throughout the application timeline, delineated by a transaction.

Download from Wow! eBook <www.wowebook.com>

236

CHAPTER 11 Mutation

11.1.1 Transactions

Within the first few moments of using Clojure’s STM, you’ll notice something different

than you may be accustomed to: no locks. Consequently, because there’s no need for

ad-hoc locking schemes when using STM, there’s no chance of deadlock. Likewise, Clo-

jure’s STM doesn’t require the use of monitors and as a result is free from lost wakeup

conditions. Behind the scenes, Clojure’s STM uses multiversion concurrency control

( MVCC ) to ensure snapshot isolation. In simpler terms, snapshot isolation means that

each transaction gets its own view of the data that it’s interested in. This snapshot is

made up of in-transaction reference values, forming the foundation of MVCC (Ullman

1988). As illustrated in figure 11.1, each transaction merrily chugs along making

changes to in-transaction values only, oblivious to and ambivalent about other transac-

tions. At the conclusion of the transaction, the local values are examined against the

modification target for conflicts. An example of a simple possible conflict is if another

transaction B committed a change to a target reference during the time that transac-

tion A was working, thus causing A to retry. If no conflicts are found, then the in-trans-

action values are committed and the target references are modified with their updated

values. Another advantage that STM provides is that in the case of an exception during

a transaction, its in-transaction values are thrown away and the exception propagated

outward. In the case of lock-based schemes, exceptions can complicate matters ever

more, because in most cases locks need to be released (and in some cases, in the cor-

rect order) before an exception can be safely propagated up the call stack.

Because each transaction has its own isolated snapshot, there’s no danger in retry-

ing—the data is never modified until a successful commit occurs. STM transactions

can easily nest without taking additional measures to facilitate composition. In lan-

guages providing explicit locking for concurrency, matters of composability are often

difficult, if not impossible. The reasons for this are far-reaching and the mitigating

forces (Goetz 2006) complex, but the primary reasons tend to be that lock-based

concurrency schemes often hinge on a secret incantation not explicitly understand-

able through the source itself: for example, the order in which to take and release a

set of locks.

11.1.2 Embedded transactions

In systems providing embedded transactions, it’s often common for transactions to be

nested, thus limiting the scope of restarts (Gray 1992). Embedding transactions within

Clojure operates differently, as summarized in figure 11.2.

In some database systems, transactions can be used to limit the scope of a restart as

shown when transaction embedded.b restarts only as far back as its own scope. Clojure

has but one transaction per thread, thus causing all subtransactions to be subsumed

into the larger transaction. Therefore, when a restart occurs in the (conceptual) sub-

transaction clojure.b, it causes a restart of the larger transaction. Though not shown,

some transaction systems provide committal in each subtransaction; in Clojure, com-

mit only occurs at the outermost larger transaction.

Download from Wow! eBook <www.wowebook.com>

Software transactional memory

237

Figure 11.1 Illustrating an STM retry: Clojure’s STM works much like a database.

embedded

clojure

Figure 11.2 Clojure’s embedded transactions: a restart in

restart

any of Clojure’s embedded transactions A, B, b, and C causes

a restart in the whole subsuming transaction. This is unlike a

fully embedded transaction system where the subtransactions

commit!

can be used to restrain the scope of restarts.

11.1.3 The things that STM makes easy

The phrase TANSTAAFL, meaning “There ain’t no such thing as a free lunch,” was

popularized in the excellent sci-fi novel The Moon Is a Harsh Mistress (Heinlein 1966)

and is an apt response to the view that STM is a panacea for concurrency complexities.

Download from Wow! eBook <www.wowebook.com>

238

CHAPTER 11 Mutation

As you proceed through this chapter, we urge you to keep this in the back of your

mind, because it’s important to realize that though Clojure facilitates concurrent pro-

gramming, it doesn’t solve it for you. But there are a few things that Clojure’s STM

implementation simplifies in solving difficult concurrent problems.

CONSISTENT INFORMATION

The STM allows you to perform arbitrary sets of read/write operations on arbitrary

sets of data in a consistent (Papadimitriou 1986) way. By providing these assurances,

the STM allows your programs to make decisions given overlapping subsets of informa-

tion. Likewise, Clojure’s STM helps to solve the reporting problem—the problem of

getting a consistent view of the world in the face of massive concurrent modification

and reading, without stopping (locking).

NO NEED FOR LOCKS

In any sized application, the inclusion of locks for managing concurrent access to

shared data adds complexity. There are many factors adding to this complexity, but

chief among them are the following:

 You can’t use locks without supplying extensive error handling. This is critical

in avoiding orphaned locks (locks held by a thread that has died).

 Every application requires that you reinvent a whole new locking scheme.

 Locking schemes often require that you impose a total ordering that’s difficult

to enforce in client code, frequently leading to a priority inversion scenario.

Locking schemes are difficult to design correctly and become increasingly so as the

number of locks grows. Clojure’s STM eliminates the need for locking and as a result

eliminates dreaded deadlock scenarios. Clojure’s STM provides a story for managing

state consistently. Adhering to this story will go a long way toward helping you solve

software problems effectively. This is true even when concurrent programming isn’t a

factor in your design.

ACI

In the verbiage of database transactions is a well-known acronym ACID, which refers to

the properties ensuring transactional reliability. Clojure’s STM provides the first three

properties: atomicity, consistency, and isolation. The other, durability, is missing due to the fact that Clojure’s STM resides in-memory and is therefore subject to data loss in the

face of catastrophic system failure. Clojure relegates the problem of maintaining dura-

bility to the application developer instead of supplying common strategies by default:

database persistence, external application logs, serialization, and so on.

11.1.4 Potential downsides

There are two potential problems inherent in STMs in general, which we’ll only touch

on briefly here.

WRITE SKEW

For the most part, you can write correct programs simply by putting all access and

changes to references in appropriately scoped transactions. The one exception to this

Download from Wow! eBook <www.wowebook.com>

Software transactional memory

239

is write skew, which occurs in MVCC systems such as Clojure’s. Write skew can occur

when one transaction uses the value of a reference to regulate its behavior but doesn’t

write to that reference. At the same time, another transaction updates the value for

that same reference. One way to avoid this would be to do a “dummy write” in the first

transaction, but Clojure provides a less costly solution: the ensure function. This sce-

nario is rare in Clojure applications, but possible.

LIVE-LOCK

Live-lock refers to a set of transaction(s) that repeatedly restart one another. Clojure

combats live-lock in a couple of ways. First, there are transaction restart limits that will

raise an error when breached. Generally this occurs when the units of work within

some number of transactions is too large. The second way that Clojure combats live-

lock is called barging. Barging refers to some careful logic in the STM implementation

allowing an older transaction to continue running while younger transactions retry.

11.1.5 The things that make STM unhappy

Certain things can rarely (if ever) be safely performed within a transaction, and in this

section we’ll talk briefly about each.

I/O

Any I/O operation in the body of a transaction is highly discouraged. Due to restarts,

the embedded I/O could at best be rendered useless, and cause great harm at worst.

It’s advised that you employ the io! macro whenever performing I/O operations:

(io! (.println System/out "Haikeeba!"))

; Haikeeba!

When this same statement is used in a transaction, an exception is thrown:

(dosync (io! (.println System/out "Haikeeba!")))

; java.lang.IllegalStateException: I/O in transaction

Though it may not be feasible to use io! in every circumstance, it’s a good idea to do

so whenever possible.

CLASS INSTANCE MUTATION

Unrestrained instance mutation is often not idempotent, meaning that running a set of

mutating operations multiple times often displays different results.

LARGE TRANSACTIONS

Though the size of transactions is highly subjective, the general rule of thumb when

partitioning units of work should always be get in and get out as quickly as possible.

Though it’s important to understand that transactions will help to simplify the

management of state, you should strive to minimize their footprint in your code. The

use of I/O and instance mutation is often an essential part of many applications; it’s

important to work to separate your programs into logical partitions, keeping I/O and

its ilk on one side, and transaction processing and mutation on the other. Fortunately

for us, Clojure provides a powerful toolset for making the management of mutability

Download from Wow! eBook <www.wowebook.com>

240

CHAPTER 11 Mutation

sane, but none of the tools provide a shortcut to thinking. Multithreaded program-

ming is a difficult problem, independent of specifics, and Clojure’s state-management

tools won’t solve this problem magically. We’ll help to guide you through the proper

use of these tools starting with Clojure’s Ref type.

11.2

When to use Refs

Clojure currently provides four different reference types to aide in concurrent pro-

gramming: Refs, Agents, Atoms, and Vars. All but Vars are considered shared refer-

ences and allow for changes to be seen across threads of execution. The most

important point to remember about choosing between reference types is that

although their features sometimes overlap, each has an ideal use. All the reference

types and their primary characteristics are shown in figure 11.3.

Ref Agent Atom Var

Figure 11.3 Clojure’s four reference types are listed across the

Coordinated

top, with their features listed down the left. Atoms are for lone

Asynchronous

synchronous objects. Agents are for asynchronous actions. Vars

Retriable

are for thread-local storage. Refs are for synchronously

Thread-local

coordinating multiple objects.

The unique feature of Refs is that they’re coordinated. This means that reads and writes

to multiple refs can be made in a way that guarantees no race conditions. Asynchronous

means that the request to update is queued to happen in another thread some time

later, while the thread that made the request continues immediately. Retriable indicates

that the work done to update a reference’s value is speculative and may have to be

repeated. Finally, thread-local means that thread safety is achieved by isolating changes

to state to a single thread.

Value access via the @ reader feature or the deref function provide a uniform client

interface, regardless of the reference type used. On the other hand, the write mecha-

nism associated with each reference type is unique by name and specific behavior, but

dothreads

To illustrate some major points, we’ll use a function dothreads! that launches a

given number of threads each running a function a number of times:

(import '(java.util.concurrent Executors))

(def *pool* (Executors/newFixedThreadPool

(+ 2 (.availableProcessors (Runtime/getRuntime)))))

(defn dothreads! [f & {thread-count :threads

exec-count :times

:or {thread-count 1 exec-count 1}}]

(dotimes [t thread-count]

(.submit *pool* #(dotimes [_ exec-count] (f)))))

The dothreads! function is of limited utility—throwing a bunch of threads at a func-

tion to see if it breaks.

Download from Wow! eBook <www.wowebook.com>

When to use Refs

241

similar in structure. Each referenced value is changed through the application1 of a

pure function. The result of this function will become the new referenced value.

Finally, all reference types allow the association of a validator function via set-

validator that will be used as the final gatekeeper on any value change.

11.2.1 Coordinated, synchronous change using alter

A Ref is a reference type allowing synchronous, coordinated change to its contained

value. What does this mean? By enforcing that any change to a Ref’s value occurs in a

transaction, Clojure can guarantee that change happens in a way that maintains a con-

sistent view of the referenced value in all threads. But there’s a question as to what

constitutes coordination. We’ll construct a simple vector of Refs to represent a 3 x 3

chess board:

(def initial-board

[[:- :k :-]

[:- :- :-]

[:- :K :-]])

(defn board-map [f bd]

(vec (map #(vec (for [s %] (f s))) bd)))

Just as in section 2.4, the lowercase keyword represents a dark king piece and the

uppercase a light king piece. We’ve chosen to represent the board as a 2D vector of

Refs (which are created by the board-map function). There are other ways to repre-

sent our board, but we’ve chosen this because it’s nicely illustrative—the act of moving

a piece would require a coordinated change in two reference squares, or else a change

to one square in one thread could lead to another thread observing that square as

occupied. Likewise, this problem requires synchronous change, because it would be

no good for pieces of the same color to move consecutively. Refs are the only game in

town to ensure that the necessary coordinated change occurs synchronously. Before

you see Refs in action, we need to define auxiliary functions:

(defn reset!

"Resets the board state. Generally these types of functions are a

bad idea, but matters of page count force our hand."

[]

(def board (board-map ref initial-board))

(def to-move (ref [[:K [2 1]] [:k [0 1]]]))

(def num-moves (ref 0)))

(def king-moves (partial neighbors

[[-1 -1] [-1 0] [-1 1] [0 -1] [0 1] [1 -1] [1 0] [1 1]] 3))

(defn good-move? [to enemy-sq]

(when (not= to enemy-sq) to))

(defn choose-move [[[mover mpos][_ enemy-pos]]]

[mover (some #(good-move? % enemy-pos)

(shuffle (king-moves mpos)))])

1 Except for ref-set on Refs, reset! on Atoms, and set! on Vars.

Download from Wow! eBook <www.wowebook.com>

242

CHAPTER 11 Mutation

The to-move structure describes the order of moves, so in the base case, it states that

the light king :K at y=2,x=1 moves before the dark king :k at y=0,x=1. We reuse the

neighbors function from section 7.4 to build a legal-move generator for chess king

pieces. We do this by using partial supplied with the kingly position deltas and the

board size. The good-move? function states that a move to a square is legal only if the

enemy isn’t already located there. The function choose-move destructures the to-

move vector and chooses a good move from a shuffled sequence of legal moves. The

choose-move function can be tested in isolation:

(reset!)

(take 5 (repeatedly #(choose-move @to-move)))

;=> ([:K [1 0]] [:K [1 1]] [:K [1 1]] [:K [1 0]] [:K [2 0]])

And now we’ll create a function to make a random move for the piece at the front of

to-move, shown next.

Listing 11.1 Using alter to update a Ref

(defn place [from to] to)

(defn move-piece [[piece dest] [[_ src] _]]

(alter (get-in board dest) place piece)

Place moving piece

(alter (get-in board src ) place :-)

Empty from square

(alter num-moves inc))

(defn update-to-move [move]

(alter to-move #(vector (second %) move)))

Swap

(defn make-move []

(dosync

(let [move (choose-move @to-move)]

(move-piece move @to-move)

(update-to-move move))))

The alter function appears four times within the dosync, so that the from and to posi-

tions, as well as the to-move Refs, are updated in a coordinated fashion. We’re using

the place function as the alter function, which states “given a to piece and a from

piece, always return the to piece.” Observe what occurs when make-move is run once:

(make-move)

;=> [[:k [0 1]] [:K [2 0]]]

(board-map deref board)

;=> [[:- :k :-] [:- :- :-] [:K :- :-]]

@num-moves

;=> 1

We’ve successfully made a change to two board squares, the to-move structure, and

num-moves using the uniform state change model. By itself, this model of state change

is compelling. The semantics are simple to understand: give a reference a function

that determines how the value changes. This is the model of sane state change that

Clojure preaches. But we can now throw a bunch of threads at this solution and still

maintain consistency:

Download from Wow! eBook <www.wowebook.com>

When to use Refs

243

(alter num-moves inc)

-- in-transaction --

(apply inc @num-moves)

(apply inc 9)

. . .

Figure 11.4 Alter path: the in-transaction value 9 for

the Ref num-moves is retrieved in the body of the

-- commit time --

transaction and manipulated with the alter function

(commit num-moves)

inc. This resulting value 10 is eventually used for the

commit-time value, unless a retry is required.

(defn go [move-fn threads times]

(dothreads! move-fn :threads threads :times times))

(go make-move 100 100)

(board-map #(dosync (deref %)) board)

;=> [[:k :- :-] [:- :- :-] [:K :- :-]]

@to-move

;=> [[:k [0 0]] [:K [2 0]]]

@num-moves

;=> 10001

Figure 11.4 shows that at the time of the transaction, the in-transaction value of the to

square is set to (apply place @SQUARE-REF PIECE). At the end of the transaction, the

STM uses this in-transaction value as the commit value. If any other transaction had

updated any other coordinated Ref before commit time, then the whole transaction

would be retried.

Clojure’s retry mechanism guarantees that the Refs in a transaction are always

coordinated upon commit because all other transactions line up waiting their turn to

commit their coordinated values. Look at what happens should the Ref updates hap-

pen in separate transactions:

(defn bad-make-move []

(let [move (choose-move @to-move)]

(dosync (move-piece move @to-move))

(dosync (update-to-move move))))

(go bad-make-move 100 100)

(board-map #(dosync (deref %)) board)

;=> [[:- :K :-] [:- :- :-] [:- :K :-]]

Clearly something has gone awry, and as we mentioned, the reason lies in splitting the

updates of the to and from Refs into different transactions. Being separated into two

transactions means that they’re (potentially) running on different timelines. Because

board and to-move are dependent, their states must be coordinated, but we’ve broken

that necessity with bad-make-move. Therefore, somewhere along the line board was

updated from two subsequent timelines where it was :K’s turn to move!

As shown in figure 11.5, either transaction can commit or be restarted; but because

the two Refs are no longer in the same transaction, the occurrences of these condi-

tions become staggered over time, leading to inconsistent values.

Download from Wow! eBook <www.wowebook.com>

244

CHAPTER 11 Mutation

Transaction for A

Transaction for B

-- in-transaction --

(apply f @A)

(apply f @B)

(apply f a)

(apply f b)

. . .

-- commit time --

(commit A)

RESTART

Figure 11.5 Splitting coordinated Refs: if Refs A and B should be coordinated,

then splitting their updates across different transactions is dangerous. Value a?

is eventually committed to A, but the update for B never commits due to retry and

coordination is lost. Another error occurs if B’s change depends on A’s value and

A and B are split across transactions. There are no guarantees that the dependent

values refer to the same timeline.

11.2.2 Commutative change with commute

Figure 11.4 showed that using alter can cause a transaction to retry if a Ref it

depends on is modified and committed while it’s running. But there may be circum-

stances where the value of a Ref within a given transaction isn’t important to its com-

pletion semantics. For example, the num-moves Ref is a simple counter, and surely its

value at any given time is irrelevant for determining how it should be incremented. To

handle these loose dependency circumstances, Clojure offers an operation named

commute. What if we were to change the make-move function to use the commute func-

tion instead of alter?

(defn move-piece [[piece dest] [[_ src] _]]

(commute (get-in board dest) place piece)

(commute (get-in board src ) place :-)

(commute num-moves inc))

(reset!)

(go make-move 100 100)

(board-map deref board)

;=> [[:K :- :-] [:- :- :-] [:- :- :k]]

@to-move

;=> [[:K [0 0]] [:k [2 2]]]

Everything looks great! But you can’t assume that the same will work for update-to-

move:

(defn update-to-move [move]

(commute to-move #(vector (second %) move)))

(go make-move 100 100)

(board-map #(dosync (deref %)) board)

;=> [[:- :- :-] [:- :K :-] [:- :- :K]]

@to-move

[[:K [2 2]] [:K [1 1]]]

Download from Wow! eBook <www.wowebook.com>

When to use Refs

245

(commute num-moves inc)

-- in-transaction --

(apply inc @num-moves)

(apply inc 9)

. . .

-- commit time --

Figure 11.6 Commute path: the in-transaction value 9 in the

(apply inc @num-moves)

13 num-moves Ref is retrieved in the body of the transaction and

manipulated with the commute function. But the commute

(apply inc 13)

function inc is again run at commit time with the current value

(commit num-moves)

14 13 contained in the Ref. The result of this action serves as the

committed value 14.

Thanks to our rash decision, we’ve once again introduced inconsistency into the system.

But why? The reason lies in the fact that the new update-to-move isn’t amenable to the

semantics of the commute function. commute allows for more concurrency in the STM by

devaluing in-transaction value disparity resulting from another transaction’s commit. In

other words, figure 11.6 shows that the in-transaction value of a Ref is initially set as

when using alter, but the commit time value is reset just before commute commits.

By retrieving the most current value for a Ref at the time of commit, the values

committed might not be those corresponding to the in-transaction state. This leads to

a condition of update reordering that your application must accommodate. Of course,

this new function isn’t commutative because vector doesn’t give the same answer if its

argument order is switched.

Using commute is useful as long as the following conditions aren’t problematic:

 The value you see in-transaction may not be the value that gets committed at

commit time.

 The function you give to commute will be run at least twice—once to compute

the in-transaction value, and again to compute the commit value. It might be

run any number of times.

11.2.3 Vulgar change with ref-set

The function ref-set is different from alter and commute in that instead of changing

a Ref based on a function of its value, it does so given a raw value:

(dosync (ref-set to-move '[[:K [2 1]] [:k [0 1]]]))

;=> [[:K [2 1]] [:k [0 1]]]

In general, this sort of vulgar change should be avoided. But because the Refs have

become out of sync, perhaps you could be forgiven in using ref-set to fix it—just this

once.

11.2.4 Fixing write-skew with ensure

Snapshot isolation means that within a transaction, all enclosed Ref states represent the

same moment in time. Any Ref value that you see inside a transaction will never change

unless you change it within that transaction. Your algorithms should be devised so that

all you care about is that the values of the references haven’t changed before commit

Download from Wow! eBook <www.wowebook.com>

246

CHAPTER 11 Mutation

(unless your change function is commutative, as mentioned previously). If those val-

ues have changed, then the transaction retries, and you try again. Earlier, we talked

about write skew, a condition occurring when you make decisions based on the in-

transaction value of a Ref that’s never written to, which is also changed at the same

time. Avoiding write skew is accomplished using Clojure’s ensure function, which

guarantees a read-only Ref isn’t modified by another thread. The make-move function

isn’t subject to write skew because it has no invariants on read data and in fact never

reads a Ref that it doesn’t eventually write. This design is ideal because it allows other

threads to calculate moves without having to stop them, while any given transaction

does the same. But in your own applications, you may be confronted with a true read

invariant scenario, and it’s in such a scenario that ensure will help.

11.2.5 Refs under stress

After you’ve created your Refs and written your transactions, and simple isolated tests

are passing, you may yet run into difficulties in larger integration tests because of how

Refs behave under stress from multiple transactions. As a rule of thumb, it’s best to

avoid having both short- and long-running transactions interacting with the same Ref.

Clojure’s STM implementation will usually compensate eventually regardless, but

you’ll soon see some less-than-ideal consequences of ignoring this rule.

To demonstrate this problem, listing 11.2 shows a function designed specifically to

over-stress a Ref. It does this by starting a long-running or slow transaction in another

thread, where work is simulated by a 200ms sleep, but all it’s really doing is reading

the Ref in a transaction. This requires the STM to know of a stable value for the Ref for

the full 200ms. Meanwhile, the main thread runs quick transactions 500 times in a

row, each one incrementing the value in the Ref and thereby frustrating the slow

transaction’s attempts to see a stable value. The STM works to overcome this frustra-

tion by growing the history of values kept for the Ref. But by default this history is lim-

ited to 10 entries, and our perverse function can easily saturate that:

(stress-ref (ref 0))

;=> :done

; r is: 500, history: 10, after: 26 tries

You may see a slightly different number of tries, but the important detail is that the

slow transaction is unable to successfully commit and print the value of r until the

main thread has finished its frantic looping and returned :done. The Ref’s history

started at a default of 0 and grew to 10, but this was still insufficient.

Listing 11.2 How to make a Ref squirm

(defn stress-ref [r]

(let [slow-tries (atom 0)]

(future

Long-running

(dosync

transaction

(swap! slow-tries inc)

(Thread/sleep 200)

@r)

Download from Wow! eBook <www.wowebook.com>

When to use Agents

247

(println (format "r is: %s, history: %d, after: %d tries"

@r (ref-history-count r) @slow-tries)))

(dotimes [i 500]

500 very quick

(Thread/sleep 10)

transactions

(dosync (alter r inc)))

:done))

Remember that our real problem here is mixing short- and long-running transactions

on the same Ref. But if this is truly unavoidable, Clojure allows us to create a Ref with

a more generous cap on the history size:

(stress-ref (ref 0 :max-history 30))

; r is: 410, history: 20, after: 21 tries

;=> :done

Again, your numbers may be different, but this time the Ref’s history grew sufficiently

(reaching 20 in this run) to allow the slow transaction to finish first and report about

r before all 500 quick transactions completed. In this run, only 410 had finished when

the slow transaction committed.

But the slow transaction still had to be retried 20 times, with the history growing

one step large each time, before it was able to complete. If our slow transaction were

doing real work instead of just sleeping, this could represent a lot of wasted comput-

ing effort. If your tests or production environment reveal this type of situation and the

underlying transaction size difference can’t be resolved, one final Ref option can

help. Because you can see that the history will likely need to be 20 anyway, you may as

well start it off closer to its goal:

(stress-ref (ref 0 :min-history 15 :max-history 30))

; r is: 97, history: 19, after: 5 tries

;=> :done

This time the slow transaction finished before even 100 of the quick transactions had

finished; and even though the history grew to roughly the same size, starting it off at

15 meant the slow transaction only retried 4 times before succeeding.

The use of Refs to guarantee coordinated change is generally simple for managing

state in a synchronous fashion, and tuning with :min-history and :max-history is

rarely required. But not all changes in your applications will require coordination,

nor will they need to be synchronous. For these circumstances, Clojure also provides

another reference type, the Agent, that provides independent asynchronous changes,

which we’ll discuss next.

11.3

When to use Agents

Like all Clojure reference types, an Agent represents an identity, a specific thing whose

value can change over time. Each Agent has a queue to hold actions that need to be

performed on its value, and each action will produce a new value for the Agent to

hold and pass to the subsequent action. Thus the state of the Agent advances through

time, action after action, and by their nature only one action at a time can be operat-

ing on a given Agent. Of course, other actions can be operating on other Agents at

the same time, each in its own thread.

Download from Wow! eBook <www.wowebook.com>

248

CHAPTER 11 Mutation

You can queue an action on any Agent by using send or send-off, the minor differ-

ence between which we’ll discuss later. Agents are integrated with STM transactions,

and within a transaction any actions sent are held until the transaction commits or are

thrown away if the transaction retries. Thus send and send-off are not considered side-

effects in the context of a dosync, because they handle retries correctly and gracefully.

11.3.1 In-process versus distributed concurrency models

Both Clojure and Erlang are designed (Armstrong 2007) specifically with concurrent

programming in mind, and Erlang’s process2 model is similar in some ways to Clojure

Agents, so it’s fair to briefly compare how they each approach the problem.

Erlang takes a distributed, share-nothing (Armstrong 2007b) approach; Clojure

instead promotes shared, immutable data. The key to Clojure’s success is the fact that

its composite data structures are immutable, because immutable structures can be

freely shared among disparate threads. Erlang’s composite data structures are also

immutable, but because the communication model is distributed, the underlying

theme is always one of dislocation. The implications of this are that all knowledge of

the world under consideration is provided via messages. But with Clojure’s in-process

model, data structures are always accessible directly, as illustrated in figure 11.7,

whereas Erlang makes copies of the data sent back and forth between processes. This

works well for Erlang and allows it to provide its fault recovery guarantees, but many

application domains can benefit from the shared-memory model provided by Clojure.

The second difference is that Erlang messages block on reception, opening up the

possibility for deadlock. On the other hand, when interacting with Clojure Agents,

both sends and derefs proceed immediately and never block or wait on the Agent.

Clojure Agent

Erlang Actor

Agent

Actor

inc

Agent

Actor

Agent

Actor

Figure 11.7 Clojure agents versus Erlang

deref

ref

get

processes: each Agent and process starts with 

the value 1. Both receive an inc request

Actor

simultaneously but can only process one at a time,

resp

so more are queued. Requests to the process are

queued until a response can be delivered, whereas

Actor

resp

any number of simultaneous derefs can be done 

on an Agent. Despite what this illustration may

suggest, an Agent is not just an actor with a hat on.

It’s interesting that popular opinion has tagged Erlang processes with the “actor” tag although the language

implementers rarely, if ever, use that term. Therefore, because the Erlang elite choose not to use that term,

we’ll avoid doing so also... almost.

Download from Wow! eBook <www.wowebook.com>

When to use Agents

249

Clojure does have an await function that can be used to block a thread until a partic-

ular Agent has processed a message, but this function is specifically disallowed in

Agent threads (and also STM transactions) in order to prevent accidentally creating

this sort of deadlock.

The final difference lies in the fact that Agents allow for arbitrary update func-

tions whereas Erlang processes are bound to static pattern-matched message han-

dling routines. In other words, pattern matching couples the data and update logic,

whereas the former decouples them. Erlang is an excellent language for solving the

extremely difficult problem of distributed computation, but Clojure’s concurrency

mechanisms service the in-process programming model more flexibly than Erlang

allows (Clementson 2008).

11.3.2 Controlling I/O with an Agent

One handy use for Agents is to serialize access to a resource, such as an file or other

I/O stream. For example, imagine we want to provide a way for multiple threads to report

their progress on various tasks, giving each report a unique incrementing number.

Because the state we want to hold is known, we can go ahead and create the Agent:

(def log-agent (agent 0))

Now we’ll supply an action function to send to log-agent. All action functions take as

their first argument the current state of the Agent and can take any number of other

arguments that are sent:

(defn do-log [msg-id message]

(println msg-id ":" message)

(inc msg-id))

Here msg-id is the state—the first time do-log is sent to the Agent, msg-id will be 0.

The return value of the action function will be the new Agent state, incrementing it to

1 after that first action.

Now we need to do some work worth logging about, but for this example we’ll just

pretend:

(defn do-step [channel message]

(Thread/sleep 1)

(send-off log-agent do-log (str channel message)))

(defn three-step [channel]

(do-step channel " ready to begin (step 0)")

(do-step channel " warming up (step 1)")

(do-step channel " really getting going now (step 2)")

(do-step channel " done! (step 3)"))

To see how log-agent will correctly queue and serialize the messages, we need to start

a few threads, each yammering away at the Agent, shown next:

(defn all-together-now []

(dothreads! #(three-step "alpha"))

(dothreads! #(three-step "beta"))

(dothreads! #(three-step "omega")))

Download from Wow! eBook <www.wowebook.com>

250

CHAPTER 11 Mutation

(all-together-now)

; 0 : alpha ready to being (step 0)

; 1 : omega ready to being (step 0)

; 2 : beta ready to being (step 0)

; 3 : alpha warming up (step 1)

; 4 : alpha really getting going now (step 2)

; 5 : omega warming up (step 1)

; 6 : alpha done! (step 3)

; 7 : omega really getting going now (step 2)

; 8 : omega done! (step 3)

; 9 : beta warming up (step 1)

; 10 : beta really getting going now (step 2)

; 11 : beta done! (step 3)

Your output is likely to look different, but one thing that should be exactly the same is

the stable, incrementing IDs assigned by the Agent, even while the alpha, beta, and

omega threads fight for control.

There are several other possible approaches to solving this problem, and it can be

constructive to contrast them. The simplest alternative would be to hold a lock while

printing and incrementing. Besides the general risk of deadlocks when a complex

program has multiple locks, there are some specific drawbacks even if this would be

the only lock in play. For one, each client thread would block anytime there was con-

tention for the lock, and unless some fairness mechanism were used, there’d be at

least a slight possibility of one or more threads being “starved” and never having an

opportunity to print or proceed with their work. Because Agent actions are queued

and don’t block waiting for their action to be processed, neither of these is a concern.

Another option would be to use a blocking queue to hold pending log messages.

Client threads would be able to add messages to the queue without blocking and with

adequate fairness. But you’d generally need to dedicate a thread to popping messages

from the queue and printing them, or write code to handle starting and stopping the

printing thread as needed. Why write such code when Agents do this for you already?

When no actions are queued, the Agent in our example has no thread assigned to it.3

Agents have other features that may or may not be useful in any given situation.

One is that the current state of an Agent can be observed cheaply. In the previous

example, this would allow us to discover the ID of the next message to be written out,

as follows:

@log-agent

;=> 11

Here the Agent is idle—no actions are queued or running, but the same expression

would work equally well if the Agent were running.

Other features include the await and await-for functions, which allow a sending

thread to block until all the actions it’s sent to a given set of Agents have completed.

Using Agents for logging might not be appropriate in all cases. For example, in probing scenarios, the num-

ber of log events could be extremely high. Coupling this volume with serialization could make the Agent

unable to catch its ever-growing queue.

Download from Wow! eBook <www.wowebook.com>

When to use Agents

251

This could be useful in this logging example if we wanted to be sure a particular mes-

sage had been written out before proceeding:

(do-step "important: " "this must go out")

(await log-agent)

The await-for function is similar but allows you to specify a number of milliseconds

after which to time out, even if the queued actions still haven’t completed.

A final feature Agents provide is that the set of actions you can send to an Agent is

open. You can tell an Agent to do something that wasn’t even conceived of at the time

the Agent was designed. For example, we could tell the Agent to skip ahead several

IDs, and this action would be queued up along with all the log-message actions and

executed by the Agent when its turn came:

(send log-agent (fn [_] 1000))

(do-step "epsilon " "near miss")

; 1000 : epsilon near miss

This is another area in which Clojure allows you to extend your design on the fly

instead of requiring recompiling or even restarting your app. If you’re paying atten-

tion, you might wonder why we used send in that last example rather than send-off.

11.3.3 The difference between send and send-off

You can use either send or send-off with any Agent. When you use send-off as we

did in most of the examples so far, only a single action queue is involved: the one man-

aged by the individual Agent. Anytime the Agent has a send-off action queued, it has

a thread assigned to it, working through the queue. With send, there’s a second

queue—actions still go into the Agent’s queue, but then the Agent itself queues up

waiting for a thread from a fixed-sized pool of threads. The size of this fixed pool is

based on the number of processors the JVM is running on, so it’s a bad idea to use

send with any actions that might block, tying up one of these limited number of

threads. These differences are illustrated in figure 11.8.

Idle agent. Empty queue,

no thread assigned.

Action queue

Unbounded

thread

send-off

pool

New agent state becomes

action applied to old state.

Figure 11.8 Agents using send versus send-off. When

Action queue

send

an Agent is idle, no CPU resources are being consumed.

Agent

queue

Each action is sent to an Agent using either send or

send-off, which determines which thread pool will be

The limited number of

used to dequeue and apply the action. Because actions

threads distribute CPU time

fairly among the queued

queued with send are applied by a limited thread pool, the

Bounded

agents.

thread

Agents queue up for access to these threads, a constraint

pool

that doesn’t apply to actions queued with send-off.

Download from Wow! eBook <www.wowebook.com>

252

CHAPTER 11 Mutation

We can make this scenario play out if we make a gaggle of Agents and send them

actions that sleep for a moment. Here’s a little function that does this, using which-

ever send function we specify, and then waits for all the actions to complete:

(defn exercise-agents [send-fn]

(let [agents (map #(agent %) (range 10))]

(doseq [a agents]

(send-fn a (fn [_] (Thread/sleep 1000))))

(doseq [a agents]

(await a))))

If we use send-off, all the agents will begin their one-second wait more or less simul-

taneously, each in its own thread. So the entire sequence of them will complete in

slightly over one second:

(time (exercise-agents send-off))

; "Elapsed time: 1008.771296 msecs"

Now we can demonstrate why it’s a bad idea to mix send with actions that block:

(time (exercise-agents send))

; "Elapsed time: 3001.555086 msecs"

The exact elapsed time you’ll see will depend on the number of processors you have,

but if you have fewer than eight you’ll see this example takes at least two seconds to

complete. The threads in the fixed-size pool are all clogged up waiting for sleep to

finish, so the other Agents queue up waiting for a free thread. Because clearly the

computer could complete all 10 actions in about one second using send-off, using

send is a bad idea.

So that’s it: send is for actions that stay busy using the processor and not blocking

on I/O or other threads, whereas send-off is for actions that might block, sleep, or

otherwise tie up the thread. This is why we used send-off for the threads that printed

log lines and send for the one that did no I/O at all.

11.3.4 Error handling

We’ve been fortunate so far—none of these Agent actions have thrown an exception.

But real life is rarely so kind. Most of the other reference types are synchronous and

so exceptions thrown while updating their state bubble up the call stack in a normal

way, to be caught with a regular try/catch in your application (or not). Because

Agent actions run in other threads after the sending thread has moved on, we need a

different mechanism for handling exceptions that are thrown by Agent actions. As of

Clojure 1.2, you can choose between two different error-handling modes for each

Agent: :continue and :fail.

:FAIL

By default, new Agents start out using the :fail mode, where an exception thrown by

an Agent’s action will be captured by the Agent and held so that you can see it later.

Meanwhile, the Agent will be considered failed or stopped and will stop processing its

Download from Wow! eBook <www.wowebook.com>

When to use Agents

253

action queue—all the queued actions will have to wait patiently until someone clears

up the Agent’s error.

One common mistake when dealing with Agents is to forget that your action func-

tion must take at least one argument for the Agent’s current state. For example, we

might try to reset the log-agent’s current message ID like this:

(send log-agent (fn [] 2000)) ; incorrect

@log-agent

;=> 1001

At first glance it looks like the action we sent had no effect, or perhaps hasn’t been

applied yet. But we’d wait in vain for that Agent to do anything ever again without

intervention, because it’s stopped. One way to determine this is with the agent-error

function:

(agent-error log-agent)

;=> #<IllegalArgumentException java.lang.IllegalArgumentException:

; Wrong number of args passed to: user$eval--509$fn>

This returns the error of a stopped Agent, or nil if it’s still running fine. Another way

to see whether an Agent is stopped is to try to send another action to it:

(send log-agent (fn [_] 3000))

; java.lang.RuntimeException: Agent is failed, needs restart

Even though this action would’ve worked fine, the Agent has failed and so no further

sends are allowed. The state of log-agent remains unchanged:

@log-agent

;=> 1001

In order to get the Agent back into working order, we need to restart it:

(restart-agent log-agent 2500 :clear-actions true)

;=> 2500

This resets the value of log-agent to 2500 and deletes all those actions patiently wait-

ing in their queue. If we hadn’t included the :clear-actionstrue option, those

actions would’ve survived and the Agent would continue processing them. Either way,

the Agent is now in good working order again, and so we can again send and send-

off to it:

(send-off log-agent do-log "The agent, it lives!")

; 2500 : The agent, it lives!

;=> #<Agent@72898540: 2500>

Note that restart-agent only makes sense and thus is only allowed when the Agent

has failed. If it hasn’t failed, any attempt to restart it throws an exception in the thread

making the attempt, and the Agent is left undisturbed:

(restart-agent log-agent 2500 :clear-actions true)

;=> java.lang.RuntimeException: Agent does not need a restart

Download from Wow! eBook <www.wowebook.com>

254

CHAPTER 11 Mutation

This mode is perhaps most appropriate for manual intervention. Agents that normally

don’t have errors but in a running system end up failing can use the :fail mode to

keep from doing anything too bad until a human can take things in hand, check to

see what happened, choose an appropriate new state for the Agent, and restart it just

as we did here.

:CONTINUE

The other error mode that Agents currently support is :continue, where any action

that throws an exception is skipped and the Agent proceeds to the next queued action

if any. This is most useful when combined with an error handler—if you specify an

:error-handler when you create an Agent, that Agent’s error mode defaults to

:continue. The Agent calls the error handler when an action throws an exception

and doesn’t proceed to the next action until the handler returns. This gives the han-

dler a chance to report the error in some appropriate way. For example, we could

have log-agent handle faulty actions by logging the attempt:

(defn handle-log-error [the-agent the-err]

(println "An action sent to the log-agent threw " the-err))

(set-error-handler! log-agent handle-log-error)

(set-error-mode! log-agent :continue)

With the error mode and handler set up, sending faulty actions does cause reports to

be printed as we wanted:

(send log-agent (fn [x] (/ x 0))) ; incorrect

; An action sent to the log-

agent threw java.lang.ArithmeticException: Divide by zero

;=> #<Agent@66200db9: 2501>

(send log-agent (fn [] 0)) ; also incorrect

; An action sent to the log-agent threw java.lang.IllegalArgumentException:

; Wrong number of args passed to: user$eval--820$fn

;=> #<Agent@66200db9: 2501>

And the Agent stays in good shape, always ready for new actions to be sent:

(send-off log-agent do-log "Stayin' alive, stayin' alive...")

; 2501 : Stayin' alive, stayin' alive...

Note that error handlers can’t change the state of the Agent (ours keeps its current

message id of 2501 throughout the preceding tests). Error handlers are also sup-

ported in the :fail error mode, but handlers can’t call restart-agent so they’re less

often useful for :fail than they are for the :continue error mode.

11.3.5 When not to use Agents

It can be tempting to repurpose Agents for any situation requiring the spawning of

new threads. Their succinct syntax and “Clojurey” feel often make this temptation

strong. But though Agents perform beautifully when each one is representing a real

identity in your application, they start to show weaknesses when used a sort of “green

thread” abstraction. In cases where you just need a bunch of worker threads banging

Download from Wow! eBook <www.wowebook.com>

When to use Atoms

255

away on some work, or you have a specific long-running thread polling or blocking on

events, or any other kind of situation where it doesn’t seem useful that the Agent

maintain a value, you’ll usually be able to find a better mechanism than Agents. In

these cases, there’s every reason to consider using a Java Thread directly, or a Java

executor (as we did with dothreads!) to manage a pool of threads, or in some cases

perhaps a Clojure future.

Another common temptation is to use Agents when you need state held but you

don’t actually want the sending thread to proceed until the Agent action you sent is

complete. This can be done by using await, but it’s another form of abuse that should

be avoided. For one, because you’re not allowed to use await in an Agent’s action, as

you try to use this technique in more and more contexts you’re likely to run into a sit-

uation where it won’t work. But in general, there’s probably a reference type that will

do a better job of behaving the way you want. Because this is essentially an attempt to

use Agents as if they were synchronous, you may have more success with one of the

other shared synchronous types. In particular, Atoms are shared and uncoordinated

just like Agents, but they’re synchronous and so may fit better.

11.4

When to use Atoms

Atoms are like Refs in that they’re synchronous but are like Agents in that they’re inde-

pendent (uncoordinated). An Atom may seem at first glance similar to a variable, but

as we proceed you’ll see that any similarities are at best superficial. The use cases for

Atoms are similar to those of compare-and-swap ( CAS ) spinning operations. Anywhere

you might want to atomically compute a value given an existing value and swap in the

new value, an Atom will suffice. Atom updates occur locally to the calling thread, and

execution continues after the Atom value has been changed. If another thread B

changes the value in an Atom before thread A is successful, then A retries. But these

retries are spin-loop and don’t occur within the STM, and thus Atom changes can’t be

coordinated with changes to other reference types. You should take care when embed-

ding changes to Atoms within Clojure’s transactions because as you know, transactions

can potentially be retried numerous times. Once an Atom’s value is set, it’s set, and it

doesn’t roll back when a transaction is retried, so in effect this should be viewed as a

side effect. Therefore, use Atoms in transactions only when you’re certain that an

attempt to update its value, performed numerous times, is idempotent.

Aside from the normal use of @ and deref to query an Atom’s value, you can also

use the mutating functions swap!, compare-and-set!, and reset!.

11.4.1 Sharing across threads

As we mentioned, Atoms are thread safe and can be used when you require a light-

weight mutable reference to be shared across threads. A simple case is one of a glob-

ally accessible incrementing timer created using the atom function:

(def *time* (atom 0))

(defn tick [] (swap! *time* inc))

(dothreads! tick :threads 1000 :times 100)

Download from Wow! eBook <www.wowebook.com>

256

CHAPTER 11 Mutation

@*time*

;=> 100000

Though this will work, Java already provides a concurrent class for just such a purpose,

java.util.concurrent.atomic.AtomicInteger, which can be used similarly:

(def *time* (java.util.concurrent.atomic.AtomicInteger. 0))

(defn tick [] (.getAndIncrement *time*))

(dothreads! tick :threads 1000 :times 100)

*time*

;=> 100000

Though the use of AtomicInteger is more appropriate in this case, the use of an

Atom works to show that it’s safe to use across threads.

11.4.2 Using Atoms in transactions

Just because we said that Atoms should be used carefully within transactions, that’s not

to say that they can never be used in that way. In fact, the use of an Atom as the refer-

ence holding a function’s memoization cache is idempotent on update.

MEMOIZATION Memoization is a way for a function to store calculated values in

a cache so that multiple calls to the function can retrieve previously calcu-

lated results from the cache, instead of performing potentially expensive cal-

culations every time. Clojure provides a core function memoize that can be

used on any referentially transparent function.

Individual requirements from memoization are highly personal, and a generic

approach isn’t always the appropriate solution for every problem. We’ll discuss per-

sonalized memoization strategies in section 12.4, but for now we’ll use an illustrative

example appropriate for Atom usage.

ATOMIC MEMOIZATION

The core memoize function is great for creating simple function caches, but it has

some limitations. First, it doesn’t allow for custom caching and expiration strategies.

Additionally, memoize doesn’t allow you to manipulate the cache for the purposes of

clearing it in part or wholesale. Therefore, we’ll create a function manipulable-

memoize that allows us to get at the cache and perform operations on it directly.

Throughout the book, we’ve mentioned Clojure’s metadata facility, and for this exam-

ple it will come in handy. We can take in the function to be memoized and attach

some metadata4 with an Atom containing the cache itself for later manipulation.

Listing 11.3 A resettable memoize function

(defn manipulable-memoize [function]

(let [cache (atom {})]

Store cache

(with-meta

in Atom

(fn [& args]

4 The ability to attach metadata to functions is a recent addition to Clojure version 1.2.

Download from Wow! eBook <www.wowebook.com>

When to use Atoms

257

(or (second (find @cache args))

Check cache first

(let [ret (apply function args)]

Else calculate, 

(swap! cache assoc args ret)

store, and return

ret)))

{:cache cache})))

Attach metadata

As shown in listing 11.3, we’ve slightly modified the core memoize function to attach

the Atom to the function being memoized. You can now observe manipulable-

memoize in action:

(def slowly (fn [x] (Thread/sleep 3000) x))

(time [(slowly 9) (slowly 9)])

; "Elapsed time: 6000.63 msecs"

;=> [9 9]

(def sometimes-slowly (manipulable-memoize slowly))

(time [(sometimes-slowly 108) (sometimes-slowly 108)])

; "Elapsed time: 3000.409 msecs"

;=> [108 108]

The call to slowly is always... well... slow, as you’d expect. But the call to sometimes-

slowly is only slow on the first call given a certain argument. This too is just as you’d

expect. Now we can inspect sometimes-slowly’s cache and perform some operations

on it:

(meta sometimes-slowly)

;=> {:cache #<Atom@e4245: {(108) 108}>}

(let [cache (:cache (meta sometimes-slowly))]

(swap! cache dissoc '(108)))

;=> {}

You may wonder why we used swap! to dissoc the cached argument 108 instead of

using (reset! cache {}). There are certainly valid use cases for the wholesale reset

of an Atom’s value, and this case is arguably one. But it’s good practice to set your ref-

erence values via the application of a function rather than the in-place value setting.

In this way, you can be more selective about the value manipulations being per-

formed. Having said that, here are the consequences our actions had:

(meta sometimes-slowly)

;=> {:cache #<Atom@e4245: {}>}

(time (sometimes-slowly 108))

; "Elapsed time: 3000.3 msecs"

;=> 108

And yes, you can see that we were able to remove the cached argument value 108

using the metadata map attached to the function sometimes-slowly. There are better

ways to allow for pointed cache removal than this, but for now you can take heart in

that using an Atom, we’ve allowed for the local mutation of a reference in a thread-

safe way. Additionally, because of the nature of memoization, you can use these

memoized functions in a transaction without ill effect. Bear in mind that if you do use

this in a transaction, then any attempt to remove values from the cache may not be

Download from Wow! eBook <www.wowebook.com>

258

CHAPTER 11 Mutation

met with the results expected. Depending on the interleaving of your removal and any

restarts, the value(s) you remove might be reinserted on the next time through the

restart. But even this condition is agreeable if your only concern is reducing total

cache size.

11.5

When to use locks

Clojure’s reference types and parallel primitives cover a vast array of use cases. Addi-

tionally, Java’s rich set of concurrency classes found in the java.util.concurrent

package are readily available. But even with this arsenal of tools at your disposal, there

still may be circumstances where explicit locking is the only option available, the com-

mon case being the modification of arrays concurrently. We’ll start with a simple pro-

tocol to describe a concurrent, mutable, safe array that holds an internal array

instance, allowing you to access it or mutate it safely. A naive implementation can be

seen in the following listing.

Listing 11.4 A simple SafeArray protocol

(ns joy.locks

(:refer-clojure :exclude [aget aset count seq])

(:require [clojure.core :as clj]))

(defprotocol SafeArray

Small set 

(aset [this i f])

of functions

(aget [this i])

(count [this])

(seq [this]))

(defn make-dumb-array [t sz]

(let [a (make-array t sz)]

(reify

SafeArray

(count [_] (clj/count a))

(seq [_] (clj/seq a))

aget and aset

(aget [_ i] (clj/aget a i))

are unguarded

(aset [this i f]

(clj/aset a i (f (aget this i)))))))