On Mini-languages and Clojure

Raju Gandhi
  • December 2010
  • Clojure
  • JVM

Clojure is a relatively new, dynamic Lisp that runs on the JVM. Being a Lisp, Clojure is extremely malleable and extensible, which gives both the language itself and the programmer the ability to create powerful yet consistent abstractions. Clojure comes out-of-the-box with a set of these "mini-languages" and gives the programmer the ability to create new ones easily. In this article we will discuss some of these mini-languages, and how you can use them to write idiomatic Clojure code.

This article by Michael Fogus attempts to list several technical mini-languages that Clojure provides. Fogus is a well known Clojure hacker and enthusiast and author of The Joy of Clojure. We will take some of the more interesting of these and explore them in detail - I will provide some background and attempt to illuminate their idiomatic usage with some examples.

On a side note, if you have been exploring Clojure for a while, I strongly urge you to pick up The Joy of Clojure by Michael Fogus and Chris Houser. Most other books about Clojure on the market (perhaps with the exception of Clojure In Action by Amit Rathore) work at explaining the "what" and "how" of Clojure, whereas The Joy of Clojure attempts to teach the "why". I recognize this is a bold statement, but I have certainly found that this book does a very good job of explaining "the Clojure way" and is a treat to read.

Mini-languages

A mini-language is much like an internal DSL, or an embedded DSL. A mini-language allows you, the developer, to create higher-level abstractions to express a problem space or a specific domain. This allows you to build the language up toward the problem space rather than tear down the problem to fit within the confines of the language. Paul Graham, a well known entrepreneur, venture capitalist, founder of the Y Combinator incubator, and a Lisp proponent, has a very nice essay explaining this premise, if you are so inclined.

It should be noted that a mini-language is not meant to be a complete language. In fact, most mini-languages are very context-specific, and target a specific spot within the language or domain. We explored one such use-case in one of my earlier articles with NFJS, The Magazine. In that article we wrote a DSL that allowed for the manipulation of a Finite State Machine. We used Groovy's malleable syntax to define the various states that the state machine is allowed to be in, and the events that cause it to transition from one state to another. Listing GAN-1 demonstrates this.

class GitFileStateMachine extends StateMachine {
  {
    gsmInitialState "unmodified"

    gsmState "unmodified"
    gsmState "modified"
    gsmState "staged"

    gsmEvent "edit", {
      transitions from:"unmodified", to:"modified"
    }

    gsmEvent "add", {
      transitions from:"modified", to:"staged"
    }

    gsmEvent "commit", {
      transitions from:"staged", to:"unmodified"
    }
  }
}

Listing GAN-1

On the other hand, consider the enhanced for loop that was introduced in Java 5. This too is an example of a mini-language - one that lets you succinctly express the act of iterating over the items in an Iterable.

You might have already concluded that these two examples are slightly different. In one case we are using the language to build an abstraction to work with a particular domain more expressively, while in the second case, the mini-language is "baked" into its host. Regardless, the intent of the mini-language remains the same: it gives us the ability to convey our intent unambiguously.

So let's get on with it, shall we? Fire up your trusted Clojure repl so that you can follow along…

Destructuring

As Michael notes in his article, destructuring is one of the more comprehensive and powerful mini-languages that Clojure provides. This is an example of a mini-language that is "baked" into the language. Destructuring allows you to extract pieces and parts of a collection and bind them to local symbols. We will start with a simple case [See Listing GAN-2], and proceed to different use-cases. Note that although some of the examples use the let form, you can use destructuring anywhere you have a binding form.

; define a collection
(def breakfast ["scrambled eggs" "bacon" "coffee"])
          
; use destructuring to extract the pieces
(let [[main side caffeinated_drink] breakfast]
  (println "I had" main "with" side "and a" caffeinated_drink))
          
; output at the repl
;> I had scrambled eggs with bacon and a coffee
;> nil

Listing GAN-2

Notice that we name each item in the collection by listing their respective symbols within a vector inside the let's binding vector. The assignment happens sequentially across the items in the collection that you are destructuring. If there are fewer items in the collection than the number of bindings you provide, then the extra symbols will be nil, and if there are more, then they simply won't get bound.
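Both edge cases are easy to check at the repl. The following is a minimal sketch; the names short-order and long-order (and the results bound from them) are made up for illustration.

```clojure
; fewer items than symbols: the unmatched symbols are bound to nil
(def short-order ["toast"])
(def short-result
  (let [[main side drink] short-order]
    [main side drink]))
;> ["toast" nil nil]

; more items than symbols: the extra items are simply ignored
(def long-order ["toast" "jam" "tea" "juice" "fruit"])
(def long-result
  (let [[main side] long-order]
    [main side]))
;> ["toast" "jam"]
```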

Naturally this approach works well when you know exactly how many items are to be in a collection. More often than not, that is not the case. Furthermore, you are usually interested in the first item, or n number of items in the collection, as well as the rest of the items in the collection, or the whole collection. No worries, we’ve got you covered :) [See Listing GAN-3]

; define a collection
(def breakfast ["scrambled eggs" "bacon" "coffee" "orange juice"])
          
; use destructuring for a few pieces
; and capture the rest using the & 
(let [[main side & drinks] breakfast]
  (println "I had" main "with" side "and" (count drinks) "drinks"))
          
; output at the repl
;> I had scrambled eggs with bacon and 2 drinks
;> nil
          
; capture the entire collection as well using the :as keyword
(let [[main side & drinks :as breakfast-food] breakfast]
  (println "I had" main "with" side "and" (count drinks) "drinks")
  (println "A breakfast with" (count breakfast-food) "items is very filling"))
          
;> I had scrambled eggs with bacon and 2 drinks
;> A breakfast with 4 items is very filling
;> nil

Listing GAN-3

I should point out two things of note here. First, although we are using a vector to define the items in our breakfast (which is the idiomatic approach to defining a list of items), this form of destructuring will work with any sequential construct, like lists, or even Strings! The second is a little more subtle. Although & and :as appear to do somewhat similar things (that is, capture a portion or the whole of the collection, as opposed to capturing distinct items), they do so in different ways. The & returns a seq view of the remaining items, while the :as keyword leaves the type of the collection being destructured untouched. For those who like to write code by manipulating the magnetic bits on their hard-drives, & uses the clojure.core/nthnext function to capture the remaining items. You can find its documentation here. See Listing GAN-4, where we explore this just a little bit more.

(let [[main side & drinks :as breakfast-food] breakfast]
  (println "class of the original collection:" (class breakfast))
  (println "class of the & binding:" (class drinks))
  (println "class of the :as binding:" (class breakfast-food)))
          
;> class of the original collection: clojure.lang.PersistentVector
;> class of the & binding: clojure.lang.PersistentVector$ChunkedSeq
;> class of the :as binding: clojure.lang.PersistentVector
;> nil

Listing GAN-4
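The claim that sequential destructuring is not limited to vectors can be tried out as well. This is only a sketch; from-list and from-string are names I made up to hold the results.

```clojure
; a list destructures the same way a vector does
(def from-list
  (let [[a b & more] '(1 2 3 4)]
    [a b more]))
;> [1 2 (3 4)]

; a String is treated as a sequence of characters,
; while :as still hands back the original String
(def from-string
  (let [[c1 c2 & others :as whole] "tea"]
    [c1 c2 others whole]))
;> [\t \e (\a) "tea"]
```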

That takes care of sequential artifacts. What about associative constructs, like maps? It just so happens that you can destructure maps using their keys. The syntax looks a little contrived at first, but you will soon see why this is actually a useful feature. Let's start with Listing GAN-5:

; define a map of breakfast-items
(def breakfast-items {:main "scrambled eggs", :side "bacon", :drinks ["coffee" "orange juice"]})

; notice we are using a map inside the binding vector
(let [{main :main side :side drinks :drinks} breakfast-items]
  (println "I had" main "with" side "and" (count drinks) "drinks"))

;> I had scrambled eggs with bacon and 2 drinks
;> nil

Listing GAN-5

The thing to note here is that the local symbol is to the left, and the key that you are looking up is to the right, which is consistent with the usual binding forms that you see everywhere in Clojure: the symbol being assigned is to the left of the value it's being assigned to. But there is another reason which we will see in the following sections (See "Putting It All Together" if you just can't wait).

If the verbosity of the code in Listing GAN-5 bothers you, look no further than Listing GAN-6. Clojure provides a :keys keyword to eliminate some of that.

; define a map of breakfast-items
(def breakfast-items {:main "scrambled eggs", 
                      :side "bacon", 
                      :drinks ["coffee" "orange juice"]})
          
; notice that we have the :keys keyword followed by a vector inside
; the binding vector
(let [{:keys [main side drinks]} breakfast-items]
  (println "I had" main "with" side "and" (count drinks) "drinks"))
          
;> I had scrambled eggs with bacon and 2 drinks
;> nil

Listing GAN-6

In this case, rather than using a vector for destructuring, we are using a map. The :keys keyword tells Clojure to look up the keys in the map that have the same names as the symbols listed in the vector that follows it, and to bind their values to local symbols with those names. So in the case of [{:keys [main]}] Clojure will look up the key :main in the map, create a new symbol named main within the let scope, and bind the value of :main from the map to this new symbol. This works well for the usual case when you know which keys should be supplied.

The associative destructuring also supports the :as keyword, which gives you a handle to the entire map. Use it to seek out key-value pairs for which you did not provide explicit destructuring. Furthermore, associative destructuring gives you another keyword - :or which lets you define a default value for a binding in case one or more keys do not exist in the map provided (without this the binding for that key would be nil). See Listing GAN-7 for a few examples.

; define a map of breakfast-items
(def breakfast-items {:main "scrambled eggs", 
                      :side "bacon", 
                      :drinks ["coffee" "orange juice"]})
          
; notice that we have the :or keyword followed by another map inside
; the binding map
; I realize I have a sweet tooth :) 
(let [{:keys [main side drinks dessert]
       :or {dessert "strawberry danish"}}
      breakfast-items]
  (println "I had" main "with" side "and" (count drinks) "drinks")
  (println "For dessert it was" dessert))
          
;> I had scrambled eggs with bacon and 2 drinks
;> For dessert it was strawberry danish
;> nil
          
; let us define a map with dessert included
(def breakfast-items-with-dessert {:main "scrambled eggs", 
                                   :side "bacon", 
                                   :drinks ["coffee" "orange juice"], 
                                   :dessert "pecan pie"})
          
; running the same let again
(let [{:keys [main side drinks dessert]
       :or {dessert "strawberry danish"}}
      breakfast-items-with-dessert]
  (println "I had" main "with" side "and" (count drinks) "drinks")
  (println "For dessert it was" dessert))
          
; we get
;> I had scrambled eggs with bacon and 2 drinks
;> For dessert it was pecan pie
;> nil
          
; throwing in the :as operator in the mix
(let [{:keys [main side drinks dessert]
       :or {dessert "strawberry danish"}
       :as items}
      breakfast-items-with-dessert]
  (println "I had" main "with" side "and" (count drinks) "drinks")
  (println "For dessert it was" dessert)
  (println "In all" (count items) "items including"
           (count drinks) "drinks"))
          
;> I had scrambled eggs with bacon and 2 drinks
;> For dessert it was pecan pie
;> In all 4 items including 2 drinks
;> nil

Listing GAN-7

Once again, there are a few points to note. The destructuring for drinks should not surprise you - the drinks binding within the let is the vector of drinks (thus we can do a count drinks on it). Also, the :as works in a similar fashion to sequential destructuring, giving you the entire map within scope. For those who have had experience with other Lisps this seems like a way to get named parameters in Clojure (albeit the poor man's version :D). Clojure has no support for named parameters, but you can achieve a similar effect with associative destructuring (a technique that Rails uses with much success since Ruby too does not support named parameters).
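To make the named-parameters idea concrete, here is a minimal sketch. The function order-breakfast and its defaults are hypothetical; the only technique in play is associative destructuring, with :or, placed directly in the argument vector.

```clojure
; hypothetical function: callers pass a map, and the argument vector
; destructures it, supplying defaults through :or
(defn order-breakfast [{:keys [main side drink]
                        :or {side "toast" drink "coffee"}}]
  (str main " with " side " and " drink))

; the "named parameters" can be given in any order, or left out
(def my-order (order-breakfast {:drink "tea" :main "pancakes"}))
;> "pancakes with toast and tea"
```

The caller names each argument by its key, and anything left out falls back to the default in the :or map.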

Putting It All Together

I mentioned earlier that the associative destructuring syntax seems a little contrived. But there is a hidden gem in there; this becomes apparent when you start to mix and match the various kinds of destructuring available to you. In listings GAN-6 and GAN-7 we had a vector of drinks being mapped to the :drinks key inside the map. What if you wanted to destructure that vector along with the key-value pairs themselves? See Listing GAN-8 to see how to do this.

; define a map of breakfast-items
(def breakfast-items {:main "scrambled eggs", 
                      :side "bacon", 
                      :drinks ["coffee" "orange juice"]})
          
; we are back to the verbose associative destructuring form
; notice we are further destructuring drinks as a sequential destructuring
(let [{main :main, side :side, [first_drink second_drink] :drinks}
      breakfast-items]
  (println "I had" main "with" side)
  (println "My first drink was" first_drink
           "and my second was" second_drink))
          
;> I had scrambled eggs with bacon
;> My first drink was coffee and my second was orange juice
;> nil

Listing GAN-8

Perhaps now you see why this is a feature. Clojure can safely assume that anything to the left of the key name in the destructuring form is the symbol that is to be assigned. This could be the :keys keyword, or a symbol, or yet another destructuring form! The syntax remains consistent regardless of how you are using the destructuring form. (Hats off to The Joy of Clojure, which explains this in great detail.)

There is a lot more to destructuring, but before we move on to the next mini-language, I should put out another disclaimer. Remember that you can use destructuring everywhere you can bind locals. This includes function signatures. Here, you have a choice - you could just choose to accept the collection as an argument and then destructure the argument inside the function body using a binding form, or you could expose the destructuring as part of the function signature. Consider the examples in Listing GAN-9.

; generic signature hiding how the coll is being used
(defn some-fn [coll]
  (let [[first second & rest] coll]
    ( ;body
    )))
          
; exposing the destructuring in the function signature
(defn some-fn [[first second & rest]]
  ( ;body
  ))

Listing GAN-9

You should consider how much of the internal working of the function you wish to expose to the end user, and this is something you will need to do on a case-by-case basis. The first option buys you a lot more flexibility but you need to ensure that you have documented the function well, and the second one can make it easier for callers of your API to see what is expected.
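One way to see what "exposing" the destructuring means in practice is to look at the argument list Clojure records for the function. This is a sketch only; describe-breakfast is a hypothetical function written for this illustration.

```clojure
; hypothetical function whose signature shows the expected shape
(defn describe-breakfast [[main side & drinks]]
  (str main " with " side " and " (count drinks) " drinks"))

(def description
  (describe-breakfast ["scrambled eggs" "bacon" "coffee" "orange juice"]))
;> "scrambled eggs with bacon and 2 drinks"

; the destructuring form is stored in the function's metadata,
; so tools such as doc can show it to callers
(def recorded-arglists (:arglists (meta #'describe-breakfast)))
;> ([[main side & drinks]])
```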

List Comprehensions

If you have been playing with Clojure for a while, you might have noticed that it does not provide you with a for loop. For those new to Clojure, this might be a little bit surprising. But Clojure gives you something more powerful - list comprehensions. Essentially, list comprehensions are a construct that let you create new lists, or rather sequences, based off existing ones. If you deem yourself math savvy, then this article from Wikipedia may just be up your alley. Otherwise, keep reading …

List comprehensions can be a little hard to wrap your head around but are incredibly powerful. To use list comprehensions, you need one or more sequences that you are operating on and potentially some predicates, or conditionals that items in the newly created sequences must conform to. Clojure's list comprehensions support the :when and :while keywords to specify your predicates. The :when keyword, true to its name, filters out elements from the final sequence that do not meet a specific criterion. The :while keyword, on the other hand, is more of a go, no-go situation: iteration over the sequence it is attached to is halted as soon as an element fails the :while predicate. That's it. Armed with this knowledge, consider Listing GAN-10 for a few examples to get started.

; simple list comprehension
; I realize it is very contrived :)
(for [x (range 10)] x)
          
;> (0 1 2 3 4 5 6 7 8 9)
          
; using the :when to filter out odd items
(for [x (range 10) :when (even? x)] x)
          
;> (0 2 4 6 8)
          
; using the :while keyword - Note that the evaluation
; stops when the first item, in this case 1 fails the predicate
(for [x (range 10) :while (even? x)] x)
          
;> (0)

Listing GAN-10

Notice that, every time, you get a sequence as the result of the evaluation. I mentioned earlier that you could have one or more sequences that you are operating on. See Listing GAN-11 for some examples.

; let us list all the highest powered cards in a deck
; start by defining the highest cards and all the suits
(def high-cards ["A" "K" "Q" "J"])
(def suits ["clubs" "diamonds" "hearts" "spades"])
          
(for [c high-cards s suits] (list c s))
          
; truncated for brevity
;> (("A" "clubs") ("A" "diamonds") ("A" "hearts") ("A" "spades") ("K" "clubs")
;> ... ("J" "spades"))

Listing GAN-11

If you look carefully at the output in the previous listing, you will notice that the first four items in the resultant sequence are all "Aces" (4 in total) followed by the "Kings" and so on and so forth. When working with multiple sequences, the for list comprehension starts with the rightmost sequence and works its way left. Another way to look at this is to think of the rightmost sequence as the "inner" loop. Once the inner loop is exhausted, Clojure moves to the next item in the outer (left) sequence and repeats.
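The "inner loop" reading can be checked with a smaller pair of sequences. In this sketch the names pairs and nested are mine; the mapcat version is there only for comparison, to spell out the nesting by hand.

```clojure
; two bindings: y (the rightmost) cycles fully for every x
(def pairs (for [x [1 2] y [:a :b :c]] [x y]))
;> ([1 :a] [1 :b] [1 :c] [2 :a] [2 :b] [2 :c])

; the same result written as an explicit outer and inner step
(def nested
  (mapcat (fn [x]
            (map (fn [y] [x y]) [:a :b :c]))
          [1 2]))
```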

The list comprehension supports one more keyword, :let which lets you bind other symbols within its context (as shown in Listing GAN-12).

Listing GAN-12
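A minimal sketch of :let inside for follows. The problem chosen here (pairs of small numbers together with their sum, keeping only sums below 3) is an assumption on my part, picked to mirror the doseq example in Listing GAN-13; the name small-sums is likewise made up.

```clojure
; bind the sum once with :let, then reuse it in :when and in the body
(def small-sums
  (for [a (range 3)
        b (range 3)
        :let [c (+ a b)]
        :when (< c 3)]
    (list a b c)))
;> ((0 0 0) (0 1 1) (0 2 2) (1 0 1) (1 1 2) (2 0 2))
```

Without :let, the expression (+ a b) would have to be repeated in both the :when predicate and the body.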

Using the :let keyword in this case makes the code, and our intent, clearer. Before I end this section on list comprehensions, I would like to point out a few things. One, you might be surprised to hear that list comprehensions in Clojure are not baked in; rather, the for form is a macro! Two, Clojure strives to be consistent, which reduces the amount of context and the number of rules that you need to keep in your head. You will notice that the :let syntax is similar to other places where you might have used let: it's the let followed by a vector of bindings. You will find another great example if you compare the doseq construct with the for construct (note that doseq is not a list comprehension). See Listing GAN-13 for an example.

; list all the numbers and their sums where the sum is less than 10
(doseq [a (range 10)
        b (range 10)
        :let [c (+ a b)]
        :when (> 10 c)]
  (println (list a b c)))
          
; truncated for brevity
;> (0 0 0)
;> (0 1 1)
;> (0 2 2)
;> (0 3 3)
;> (0 4 4)
;> (0 5 5)
;> ,,,

Listing GAN-13

doseq can be deemed equivalent to the "for each" that you may be used to in languages like Java. doseq runs immediately (which is why we println each list), while for creates a new sequence containing each of the new items (in line with the definition of a list comprehension). Furthermore, the do in doseq tells you that, if you are to have side-effects, doseq is the idiomatic way of doing it.
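The practical difference shows up once side-effects are involved, because the sequence that for returns is lazy. The sketch below uses an atom (named calls, my own choice) purely to count how often the body runs.

```clojure
; an atom used only to count how often a body runs
(def calls (atom 0))

; for builds a lazy sequence: defining it runs nothing
(def lazy-squares
  (for [x (range 3)]
    (do (swap! calls inc)
        (* x x))))
(def calls-before @calls)
;> 0

; forcing the sequence with doall runs the body for every item
(def squares (doall lazy-squares))
(def calls-after @calls)
;> 3

; doseq runs its body right away and returns nil
(def doseq-result (doseq [x (range 3)] (swap! calls inc)))
(def calls-final @calls)
;> 6
```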

Pre and Post Conditions

Pre and Post conditions are a relatively new feature in Clojure and, as Michael correctly points out in his article, form the basis for contract programming. The pre and post conditions act as assert statements for your function, validating the input to and the return value from the function to verify that they conform to certain criteria. Let's start with an example (See Listing GAN-14).

; a function that calculates the square root of a number
(defn sq-root [x]
  {:pre [(pos? x)]
   :post [(pos? %)]}
  (Math/sqrt x))

Listing GAN-14

You can see that the pre and post conditions follow the argument vector of the function definition, and are declared as a map of keywords to vectors. (You can have more than one conditional check for each; just tack on more entries in the vectors). In Listing GAN-14 we are checking to see if the input value and the return value are positive numbers. Notice that we can capture the return value of the function using the % symbol. This is yet another example of how Clojure strives to be consistent in its syntax. The % sign, for those of you who have written anonymous functions (using the #() syntax), performs the same role - a means to capture an argument.

If a pre or a post condition were to fail, you would get an AssertionError, and the evaluation of the function would be aborted. Pah!, you say, there isn't anything here that a faithful assert can't do for me. Don't be quick to judge! We need to consider that the conditions are defined in a map, and this is one of those situations where Clojure's homoiconicity reveals its power. Think about it this way: what if you were to separate the creation of the conditional map from the function that it was applied to? It's simply a Clojure map. We could do that easily, right? Consider the examples in Listing GAN-15.

; constraints on positive numbers
(defn positives-only [f x]
  {:pre [(pos? x)]
   :post [(pos? %)]}
  (f x))
          
; apply those constraints to sq-root defined in Listing GAN-14
(positives-only sq-root 4)
;> 2.0
          
(positives-only sq-root -4)
;> AssertionError

Listing GAN-15

Notice how we decoupled the constraint itself from the function that we use it on. The constraints are no longer invasive on the function that is actually doing the heavy lifting, namely sq-root. You could define other constraints and use them with your sq-root function, or define other mathematical functions and apply any of the constraints on them - mix and match, plug and play. Take it a step further: you could have a function that actually creates the constraint map based on certain arguments and invokes the delegate on your behalf! Can you say “aspects”?
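One possible shape for that last step is sketched below. The helper with-constraints is hypothetical (it is not part of Clojure); it relies only on the fact that fn accepts the same condition map that defn does.

```clojure
; hypothetical helper: returns a new function that checks its argument
; with in-ok? and its result with out-ok? around the call to f
(defn with-constraints [in-ok? out-ok? f]
  (fn [x]
    {:pre [(in-ok? x)]
     :post [(out-ok? %)]}
    (f x)))

; an unconstrained square root, wrapped after the fact
(def safe-sqrt (with-constraints pos? pos? (fn [x] (Math/sqrt x))))

(def ok (safe-sqrt 4))
;> 2.0

; a failing precondition raises an AssertionError
(def failure
  (try (safe-sqrt -4)
       (catch AssertionError e :rejected)))
;> :rejected
```

The predicates are plain functions, so different callers can wrap the same delegate with different contracts.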

The Thrush Twins...Kinda...

The thrush operators -> and ->>, in my opinion, fill a very sweet spot within Clojure. Most Lisps inherently have code written "inside-out". See Listing GAN-16 for an example.

; calculating final amount due with simple interest
; for a principal of $1000 at 20% for 2 years
(def principal 1000)
(def rate 20)
(def years 2)
          
(+ principal (* principal (* (/ rate 100) years)))
;> 1400N

Listing GAN-16

In order to read the calculation in Listing GAN-16, you start with the inner-most nested form, and work your way outward. You first divide the rate by 100, multiply by the number of years, multiply by the principal, and then finally add it to the principal to get the amount you owe after 2 years. This is where the thrush operators can be handy. Thrush operators allow you to turn the code "outside-in", laying it out in a fashion that makes it easier to interpret the problem. See Listing GAN-17 to see how this works.

; calculating final amount due with simple interest
; for a principal of $1000 at 20% for 2 years
(def principal 1000)
(def rate 20)
(def years 2)

(->
  (/ rate 100)
  (* years)
  (* principal)
  (+ principal))

Listing GAN-17

The thrush operator threads the output of evaluating the first argument as the first argument of the second form, then takes the result of that evaluation and continues. Listing GAN-18 attempts to explain this by highlighting where values will be threaded using ",,,".

(->
  (/ rate 100)       ;              => 1/5
  (* ,,, years)      ;(* 1/5 2)     => 2/5
  (* ,,, principal)  ;(* 2/5 1000)  => 400N
  (+ ,,, principal)) ;(+ 400N 1000) => 1400N

Listing GAN-18

The double-thrush, that is ->>, does something similar (See Listing GAN-19), except rather than threading the evaluation of the previous form as the first argument to the next, it threads it as the last. In our chosen example it does not change the output (since the last three forms are commutative mathematical operations). It goes without saying that this may not always be the case, for example if you had a division operation.

(->>
  (/ rate 100)       ;              => 1/5
  (* years ,,,)      ;(* 2 1/5)     => 2/5
  (* principal ,,,)  ;(* 1000 2/5)  => 400N
  (+ principal ,,,)) ;(+ 1000 400N) => 1400N

Listing GAN-19
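Division makes the difference between the two operators visible. A minimal sketch, with threaded-first and threaded-last as made-up names for the two results:

```clojure
; -> places 10 as the first argument: (/ 10 2)
(def threaded-first (-> 10 (/ 2)))
;> 5

; ->> places 10 as the last argument: (/ 2 10)
(def threaded-last (->> 10 (/ 2)))
;> 1/5
```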

A couple of things to note. The thrush operators, again, are just macros. Consequently, this has inspired other similar mini-languages that do similar but slightly different things (like allowing for nils). For example, check out this library currently in Clojure Contrib. Finally, there is a caveat: although thrush operators are really cool, they don't always do what you think they might do. See this article (again by Michael Fogus) that discusses this and gives you an alternative form that you can consider.
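As an illustration of the nil-tolerant flavor, here is a sketch that assumes Clojure 1.5 or later, where a some-> macro is part of clojure.core. That is newer than the Contrib library mentioned above, so treat it as a stand-in for the idea rather than the library itself.

```clojure
; plain -> would end up calling inc on nil and throw;
; some-> stops threading as soon as a step yields nil
(def found (some-> {:order {:drinks 2}} :order :drinks inc))
;> 3

(def missing (some-> {:order nil} :order :drinks inc))
;> nil
```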

Thrush operators can make your Clojure code less Lispy (gasp!). In all seriousness thrush operators can make your code more readable and easier to follow.

So Many More, So Little Time

Michael's blog post mentions several other mini-languages within Clojure. I, humbly, propose one more: Clojure's Java interop operators. Of course, there is the grand-daddy of them all, the macro, and its very own mini-language, which makes a lot of the other mini-languages possible. Macros are a fundamental construct to all Lisps and Lisp's consistent syntax gives them their awe-inspiring and mind-bending prowess. Macros deserve a discussion of their own, but unfortunately, we have run out of time. Perhaps some other time?
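For a taste of why the interop operators might qualify, here is a brief sketch of the common forms. The var names (shouted, root, sb, length) are mine, chosen for illustration.

```clojure
; instance method call: (.method object args)
(def shouted (.toUpperCase "bacon"))
;> "BACON"

; static method call: (Class/method args)
(def root (Math/sqrt 16))
;> 4.0

; constructor call: (Class. args)
(def sb (StringBuilder. "eggs"))

; .. chains calls left to right, much like the thrush operators
(def length (.. sb (append " and bacon") toString length))
;> 14
```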

Conclusion

Clojure's mini-languages aim to solve very specific concerns. Their use in Clojure is widespread, even in places where you would not expect them. Using these mini-languages makes your life as a Clojure developer easier and your code consistent and easier to read and maintain – all while being idiomatic Clojure code. Clojure encourages the creation of other mini-languages with the use of macros. I will leave you with one question: can you think of any other places in Clojure where a mini-language would be useful? Till next time …
