On Writing a Groovy DSL

Raju Gandhi
  • July 2010
  • Groovy

Today, Groovy is a mature language on the JVM that gives Java developers a dynamic, flexible, and expressive medium in which to be highly productive, while integrating seamlessly with existing Java applications and libraries. One task that Groovy lends itself to very well is the creation of DSLs. A DSL (Domain Specific Language) is a language built to express a specific domain, with a rich vocabulary that can be shared by programmers and business experts. A DSL offers a higher level of abstraction than the host language that it's written in. Groovy's rich yet liberal syntax makes it a strong candidate for such an endeavor.

In this article we will attempt to port a small part of a Ruby library called 'Acts As State Machine' (AASM) to Groovy. AASM is a library that allows for the creation and manipulation of a finite state machine while providing a very nice DSL. This will allow us to explore some of the steps and pitfalls that you might encounter when writing your very own DSL. You can find the source code for the Groovy State Machine library on the Integrallis GitHub account.

A Gentle Introduction to Finite State Machines

If you attended a Computer Science class and the mention of Finite State Machine (FSM) has you conjuring up images of complex flow diagrams and mathematical terminology, don't fret. Conceptually, a Finite State Machine consists of 3 basic constructs:

  • "states" that the machine can be in
  • "transitions" that cause the system to go from one state to another
  • "events" that fire the rules and cause the system to transition

That's it. Consider the simple example of an FSM modeling the stages that a file goes through when tracked by Git (a distributed source-code version control system). Git commits in two steps: a file that has been modified or created is first added to a "staging area", which represents what the next commit will look like. A commit then records all the changes in the staging area on the current branch, typically "master", the main trunk. Each file has an initial state of "unmodified". When the user "edits" a file, this event causes the file to transition to "modified". When the user "add"s the file, it transitions to "staged", and on "commit" it transitions back to "unmodified" (see Figure GAN-1). Keep this example in mind as you read on, because we will be using it in the code samples.

Figure GAN-1
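
To make the three constructs concrete before looking at the DSL, here is a minimal sketch of the Git file machine as plain Groovy data. It is only an illustration: the names states, transitions and fire are assumptions made for this example and are not part of the Groovy State Machine library.

```groovy
// Hypothetical plain-data model of the Git file FSM (not the library's API)
def states = ['unmodified', 'modified', 'staged'] as Set

// event name -> the single transition it triggers
def transitions = [
    edit  : [from: 'unmodified', to: 'modified'],
    add   : [from: 'modified',   to: 'staged'],
    commit: [from: 'staged',     to: 'unmodified']
]

// Firing an event moves the machine only if it is in the event's "from" state
def fire = { String current, String event ->
    def rule = transitions[event]
    if (rule == null || rule.from != current) {
        throw new IllegalStateException("cannot '${event}' from '${current}'")
    }
    rule.to
}

// Every transition target must be one of the declared states
assert states.containsAll(transitions.values().collect { it.to })

// Walk the full cycle: edit, add, commit
def state = 'unmodified'
state = fire(state, 'edit')
state = fire(state, 'add')
state = fire(state, 'commit')
assert state == 'unmodified'
```

Everything the machine knows lives in two small collections, and that is the information a DSL has to capture in a friendlier form.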

A Sneak-peek at the DSL

AASM provides a very nice DSL in Ruby to interact with an FSM, and we will attempt to reproduce the same in Groovy, albeit with a few differences. These differences stem from the slightly different object systems of Ruby and Groovy, but for the most part the DSLs should look and behave in a similar fashion. Listing GAN-1 demonstrates how we can use the Groovy State Machine library to create a class that models the states a file transitions through when managed by Git (as described in the previous section).

class GitFileStateMachine extends StateMachine {
  // note we are doing this inside an instance-initializer block
  // this block, as in Java, runs every time an instance is created,
  // after the superclass constructor and before this class's constructor body
  {
    gsmInitialState "unmodified" // initial state of the FSM
          
    // all the other states the FSM is allowed to be
    // these make the universal set of allowed states
    // for this state machine
    gsmState "unmodified"
    gsmState "modified"
    gsmState "staged"
          
    // define the events -
    // notice the comma after the name of the event
    gsmEvent "edit", {
      transitions from:"unmodified", to:"modified"
    }
          
    gsmEvent "add", {
      transitions from:"modified", to:"staged"
    }
          
    gsmEvent "commit", {
      transitions from:"staged", to:"unmodified"
    }
  } // instance-initializer block ends here
}

Listing GAN-1

As you can see, the DSL provides a nice intuitive interface to declare states, events and transitions. But there is more to it! The DSL will also introduce additional methods allowing the user to interact, manipulate and query the state-machine. Listing GAN-2 shows off some of the additional functionality that the corresponding objects will expose.

// first create an instance
gitModel = new GitFileStateMachine()
// a method injected to test for initial state
assert gitModel.isUnmodified()
// fire edit event
gitModel.doEdit() 
// using Java's getter syntax
assert "modified" == gitModel.getCurrentState() 
// applying Groovy syntactic sugar
assert "modified" == gitModel.currentState 
// another method injected to test for current state
assert gitModel.isModified()
// all three declared states are registered
assert 3 == gitModel.states.size()

Listing GAN-2
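
One way a Groovy class can answer calls such as isModified() or doEdit() without declaring them is the methodMissing hook. The sketch below is an assumption about how such methods could be provided, not a description of how the Groovy State Machine library actually does it. The class TinyMachine and its fields are hypothetical.

```groovy
// Hypothetical sketch: NOT the library's actual implementation
class TinyMachine {
    String currentState = 'unmodified'
    List<String> states = ['unmodified', 'modified', 'staged']
    Map<String, Map> events = [edit: [from: 'unmodified', to: 'modified']]

    // Groovy calls methodMissing when no declared method matches the call
    def methodMissing(String name, args) {
        if (name.startsWith('is')) {
            // isModified() -> "modified"
            def state = name.substring(2).toLowerCase()
            if (state in states) return currentState == state
        }
        if (name.startsWith('do')) {
            // doEdit() -> "edit"
            def event = events[name.substring(2).toLowerCase()]
            if (event && event.from == currentState) {
                currentState = event.to
                return currentState
            }
        }
        throw new MissingMethodException(name, TinyMachine, args as Object[])
    }
}

def machine = new TinyMachine()
assert machine.isUnmodified()
machine.doEdit()
assert machine.currentState == 'modified'
assert machine.isModified()
```

The trade-off of this approach is that the method names exist only at runtime, so IDE completion and compile-time checks will not see them.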

Now that you have had a sneak peek at the usage of the library, let’s sit back and look at some of the design and implementation steps you might want to consider when writing your own DSL.

Knowing Where You are Going

When writing a DSL, it often helps to know what the final result will look like. In the case of porting an existing library, it’s much simpler since you already have a good starting point, but this is usually not the case.

A DSL's role is to present an interface at a different level of abstraction (higher or lower, depending on your perspective) than the programming language that most developers are used to. The objective in writing a DSL is to work towards a more "natural" language, usually one that closely maps to the domain that you are working with. The end objective may be to establish a consistent vocabulary that everybody speaks and understands, or to have business experts and/or testers more involved in the development/testing life-cycle. Furthermore, a DSL is often specifically designed to be both expressive and succinct while doing everything that the end user needs it to do.

As you can see from our state machine example in Listing GAN-1, the intent there is to make the language feel less Groovy-like and more English-like. But there lies a huge divide between the strict rules that govern programming languages and a context-dependent natural language like English. If the intended end result of the DSL is fuzzy, you may end up biting off more than you can chew. Furthermore, you may end up exposing more than you originally intended (see "The Law of Leaky Abstractions"). Occasionally it might help to sit down with a piece of paper and pen and hash out what the final DSL will look like.

There is another benefit to doing this. Sometimes, the tool that you are leveraging to write your DSL may not be capable of doing what you want it to do. Consider the following snippet of code in Listing GAN-3.

// define the events - notice the comma after the name of the event,
// which in this case is "edit"
gsmEvent "edit", {
  transitions from:"unmodified", to:"modified"
}

Listing GAN-3

The comma after "edit" is there so that Groovy treats the String and the Closure as two arguments to the gsmEvent method. This is a consequence of how Groovy parses the code. An alternative would be to write the code as in Listing GAN-4.

// define the events - no comma, but the name has to be enclosed in parentheses
gsmEvent("edit") {
  transitions from:"unmodified", to:"modified"
}

Listing GAN-4

I recognize that this is a trivial example, but scenarios like these may present themselves. If none of the solutions (like the ones above) are acceptable to you, then perhaps Groovy isn't the tool to use in this case. Alternatively, if you prefer to use Groovy, you know your options early on (in this case Listing GAN-3 or GAN-4) without having to invest a lot of time and effort.
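
The two spellings can be checked side by side in the Groovy Console. The sketch below uses a hypothetical EventRecorder class whose gsmEvent method only records what it receives. It is a stand-in for illustration, not the library's method.

```groovy
class EventRecorder {
    List<String> seen = []

    // Hypothetical stand-in for gsmEvent: records the name and the closure's result
    void gsmEvent(String name, Closure body) {
        seen << "${name} -> ${body()}".toString()
    }
}

def recorder = new EventRecorder()

// Comma form (Listing GAN-3): no parentheses, so a comma separates the arguments
recorder.gsmEvent "edit", { "modified" }

// Parentheses form (Listing GAN-4): the closure trails the argument list
recorder.gsmEvent("add") { "staged" }

// Fully explicit form: both arguments inside the parentheses
recorder.gsmEvent("commit", { "unmodified" })

// All three forms reached the same two-argument method
assert recorder.seen == ["edit -> modified", "add -> staged", "commit -> unmodified"]
```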

Exploring the Landscape

Languages like Groovy offer a very rich syntax and great pliability, while lending themselves to a very quick try-evaluate feedback loop. For that feedback loop you have two tools, and you should exercise both of them: the Groovy Console (or shell) and testing. Attempting to see or understand how a feature of the language works is a matter of firing up the interactive shell and running some quick and dirty code. This applies not only to exploring a single feature, but also to seeing how different features of the language work together (e.g. having a closure supplied as an argument to a method in conjunction with named arguments, as in Listing GAN-8). It might be tempting to squeeze something like that into a growing code-base and hope that your integration tests catch any errors, but this proves to be both difficult and unreliable. You would be attempting to understand how the feature interacts with everything else while keeping an eye on all the moving parts at the same time. Writing the simplest code possible, with the least amount of environmental baggage, to investigate how something works will save you lots of grief. Once you know how a feature works and you have plugged it into your code, be sure to take the code samples you just punched out in the interactive shell and make appropriate tests out of them. Two for the price of one, yes?
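
As an example of such an experiment, the following sketch probes how named arguments and a closure reach a method together. The Probe class and the guard option are made up for this illustration. The method only mirrors the shape of gsmEvent, with an optional leading Map.

```groovy
// Hypothetical experiment: how do named arguments and a closure reach a method?
class Probe {
    def lastCall

    // The optional Map comes first, mirroring the shape of the gsmEvent signature
    def gsmEvent(Map options = [:], String name, Closure body) {
        lastCall = [options: options, name: name, hasBody: body != null]
    }
}

def probe = new Probe()

// No named arguments: the default empty map is used
probe.gsmEvent "edit", { "body" }
assert probe.lastCall == [options: [:], name: "edit", hasBody: true]

// Named arguments are gathered into one Map and passed as the first
// argument, regardless of where they appear in the call
probe.gsmEvent "edit", guard: "clean", { "body" }
assert probe.lastCall.options == [guard: "clean"]
assert probe.lastCall.name == "edit"
```

Assertions like these can be lifted into a test case unchanged once the experiment has answered the question.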

Occasionally, you may come across a (new) feature in a language that does not jump out at you as being very useful. Consider the with (also known as identity) method, which Groovy's GDK adds to Object as an extension method. Its signature is shown in Listing GAN-5.

public Object with(Closure closure)

Listing GAN-5

Most code samples on the web demonstrate the use of with as shown in Listing GAN-6.

def calendar = Calendar.instance
calendar.with {
  clear()
  set MONTH, JULY
  set DATE, 4
  set YEAR, 1776
}

Listing GAN-6

Unfortunately, it's not clear from this example how it works. Essentially, the Calendar object becomes the delegate of the closure, so all the method invocations within the closure are made on that object. It's the same as if you were to call each of those methods on it explicitly, as in Listing GAN-7.

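// outside of with, the Calendar constants need a static import to resolve
import static java.util.Calendar.*

def calendar = Calendar.instance
calendar.clear()
calendar.set(MONTH, JULY)
calendar.set(DATE, 4)
calendar.set(YEAR, 1776)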

Listing GAN-7
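
The mechanism behind with can be approximated by hand using a closure's delegate. The helper below, named applyTo, is a hypothetical function written for this illustration. It is a rough sketch of the idea, assuming with works by cloning the closure and resolving names against the target first, and it is not the GDK source.

```groovy
// A hand-rolled approximation of with (applyTo is a hypothetical helper)
def applyTo(Object target, Closure body) {
    def c = body.clone()                        // leave the caller's closure untouched
    c.delegate = target                         // unresolved names are looked up on target
    c.resolveStrategy = Closure.DELEGATE_FIRST  // try the delegate before the owner
    c.call(target)
}

def manual = new StringBuilder()
applyTo(manual) {
    append "Hello"
    append ", DSL"
}

// The built-in with produces the same result
def builtIn = new StringBuilder()
builtIn.with {
    append "Hello"
    append ", DSL"
}

assert manual.toString() == "Hello, DSL"
assert manual.toString() == builtIn.toString()
```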

Listing GAN-8 shows how we leverage with to implement the declaration of events in the state machine library.

class Event {
  Event(args = [:], String name, Closure c) {
    // some code here 
    if(c) this.with(c)  
  }
                 
  private void transitions(args) {
    // some code here
  }
}

Listing GAN-8

Listing GAN-9 shows how the class in Listing GAN-8 is used.

// this method is declared inside StateMachine.groovy
             
def gsmEvent(options = [:], name, Closure transitions) {
  if(!events[name]) {
    events[name] = new Event(options, name, transitions)
  }
}

Listing GAN-9

Listing GAN-10 demonstrates the invocation of the method in Listing GAN-9.

gsmEvent "stage", { 
  transitions from:"modified", to:"staged"
}

Listing GAN-10

Notice that the last argument passed to the Event constructor from within the gsmEvent method is a Closure that internally references a method named transitions. Inside the Event constructor, passing that closure to with causes it to be executed within the context of the newly created Event object, so the call resolves to the private transitions method.
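
The whole round trip can be tried in isolation with a trimmed-down version of the class. MiniEvent below is a hypothetical stand-in that stores each transition in a list. The transitions method is public here to keep the sketch simple, and the storage is an assumption rather than the library's behavior.

```groovy
// Hypothetical, trimmed-down Event used only to illustrate the mechanism
class MiniEvent {
    String name
    List<Map> rules = []

    MiniEvent(String name, Closure c) {
        this.name = name
        if (c) this.with(c)   // run the closure with this event as its delegate
    }

    // Named arguments (from:, to:) arrive as a single Map
    void transitions(Map args) {
        rules << args
    }
}

// The closure is passed inside the parentheses: a trailing block after
// "new MiniEvent(...)" would be parsed as an anonymous inner class body
def stage = new MiniEvent("stage", {
    transitions from: "modified", to: "staged"
})

assert stage.name == "stage"
assert stage.rules == [[from: "modified", to: "staged"]]
```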

In conclusion, learning what a language has to offer, and how it can be leveraged is the result of some experimentation, and lightweight tools like an interactive environment and tests are your friends. Furthermore, they need not be your tests! The Groovy source code is full of good examples, along with a whole test suite. Dive in! There truly is a pot of gold at the end of the green bar.

The Glow of Inspiration May Be Coming From Behind You

Granted, once you write a little bit of Groovy code, Java begins to feel like a strait-jacket, where the compiler only seems to tighten its stranglehold over you with the red squiggly lines emitted by your IDE of choice. But there is a glimmer of hope. Like I said earlier, you can find inspiration in the form of good code samples in the Groovy source code, but you might find it in places that you seemingly have left behind.

Consider the snippet of code in Listing GAN-11.

class GitFileStateMachine extends StateMachine {
  // note we are doing this inside an instance-initializer block
  // this block, as in Java, runs every time an instance is created,
  // after the superclass constructor and before this class's constructor body
  {
    // declare your states and events here
  }
  // instance-initializer block ends here
}

Listing GAN-11

The block of code in Listing GAN-11 that initializes the states and the events is surrounded by curly brackets. For those familiar with Java's static block, this is its per-instance counterpart, usually referred to as an instance initializer: a block of code that runs every time an object is constructed, after the superclass constructor returns and before the body of the class's own constructor. You could achieve the same effect by putting these lines of code inside your constructor, but it would detract from the English-like flow of the DSL.
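
The order of execution is easy to confirm with a small experiment. The Base and Child classes below are hypothetical and exist only to log the order in which the pieces run.

```groovy
class Base {
    List<String> log = []
    Base() { log << "Base constructor" }
}

class Child extends Base {
    // instance initializer: runs for every new Child
    { log << "Child initializer" }

    Child() { log << "Child constructor" }
}

// The superclass constructor runs first, then the initializer block,
// then the body of the subclass constructor
assert new Child().log == ["Base constructor", "Child initializer", "Child constructor"]
```

This ordering is why the DSL declarations in the initializer can rely on whatever the StateMachine superclass sets up in its own constructor.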

It's prudent to remember that Groovy attempts to be an extension of the Java language, and most of the features that Java offers are available to you within Groovy. Taking a step back and leveraging the entire ecosystem may prove beneficial.

Conclusion

Groovy's flexible structure, along with its syntactic sugar, can help create very intuitive DSLs that make working with libraries and applications easier. Try to chart your path, get to know your tools and attempt to synthesize a solution using both Groovy's language features and its development kit. Whether it is to make a task more fun, your fellow programmers more productive, or to allow for an easier feedback loop between technical and non-technical staff, DSLs certainly have their place, and Groovy is certainly an ally. Ready, set, go DSL'ing!
