First week at Runa

During my last week at ThoughtWorks, my colleagues were asking me what I would be doing at my new job, and what technologies I might be working on.

I told them that the job involved Rails for the consumer-focused piece, and a combination of other technologies at the backend – Ruby, Java (along with Hadoop), Erlang, and possibly others depending on the task at hand. I wasn’t sure if they were impressed or not, but they did joke about how I would actually end up working on PHP and ColdFusion. I indignantly denied any such thing.

So of course, my first week at the new job saw me working on PHP and ColdFusion.
😐

Basically, since a part of Runa is a set of services for merchants, the consumers of these services could be using PHP, ColdFusion, or any other platform. Hence the need for us to ensure our code works with all of them, and indeed, for us to develop libraries for every major e-commerce platform out there. Anyway, that’s what I worked on through Week 1.

Heh, just thought this was funny!

Easy external DSLs for Java applications

Or JRuby for fun and profit

I’ve been developing software for some time now, and have recently found myself applying ideas from various esoteric areas of computer science to the everyday tasks of building commonplace applications (often these concepts are quite old; indeed, some were thought of in 1958).

One powerful idea that I’ve been playing with recently is that of embedding a DSL (domain specific language) into your basic application framework – and writing most of the features of the application in that DSL.

This is simply an implementation of the concept of raising the level of abstraction. The point, of course, being that when writing code in a DSL implemented in such a fashion, one can express ideas in terms of high-level abstractions that represent actual concepts from the problem domain. In other words, it is like using a programming language that has primitives rooted in the domain.

A lot of people have been writing about this kind of software design – and most implement these ideas in a dynamic language of their choice. How does one go about doing the same in a language like Java? That is what this article is about. And I cheat in my answer. Consider the following design stack –

Creating DSLs in JRuby

I propose that you only implement basic and absolutely required pieces of functionality in Java – the things that rely on, say, external systems that expose a Java interface, or some EJB-type resource, or some other reason that requires the use of Java. The functionality you develop here is then exposed through an interface to the layers above. You can also add APIs for other support services you might need.

The layer above is a bunch of JRuby code that behaves as a facade to the Java API underneath. This leaves you with a Ruby API to that underlying Java (and whatever else, really!) stuff – and makes it possible to code against that functionality in pure Ruby. The JRuby interpreter runs as part of the deployable and simply executes all that Ruby code transparently. Your Ruby code doesn’t even care that some of the calls are internally implemented in Java. Sweet!
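To make the facade idea a little more concrete, here is a rough sketch. Everything in it is made up for illustration: FakeInventoryService stands in for the Java layer so that the example runs on any Ruby, and Inventory is a hypothetical facade. Under JRuby, the service would instead be a real Java object (loaded after require 'java' and java_import), and the facade would look much the same.

```ruby
# Stand-in for the Java layer so this sketch runs on plain Ruby.
# Under JRuby this would be a real Java object; the class and its
# methods are hypothetical.
class FakeInventoryService
  def initialize
    @stock = Hash.new(0)
  end

  # Java-style camelCase names, as a Java API would expose them.
  def addToStock(sku, quantity)
    @stock[sku] += quantity
  end

  def getStockLevel(sku)
    @stock[sku]
  end
end

# The Ruby facade: the only place that knows about the Java-style API.
class Inventory
  def initialize(service)
    @service = service
  end

  def add(sku, quantity: 1)
    @service.addToStock(sku, quantity)
    self # return self so that calls can be chained
  end

  def level(sku)
    @service.getStockLevel(sku)
  end
end

inventory = Inventory.new(FakeInventoryService.new)
inventory.add("widget", quantity: 3).add("widget")
puts inventory.level("widget") # prints 4
```

The rest of the application only ever talks to Inventory, so swapping the stand-in for the real Java object should not ripple upwards.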

We can stop here. At this point, we are in a position to write the rest of our application in a nice dynamic language like Ruby. For some people, a nice fluent interface in Ruby suffices – and keeps all developers happy. This is depicted on the top-left part of the diagram.

However, we can go one step further, and implement a DSL in Ruby that raises the level of abstraction even more – as referred to earlier. So on top of the Ruby layer, you’d implement a set of classes that allow you to write code in simple domain-like terms, which then get translated into a form executable by the Ruby interpreter. This is shown in the top-right part of the diagram. Ultimately, potentially anyone (developers, QA or business analysts) could express their intent in something that looked very much like English.
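As a taste of what such a layer might look like, here is a minimal sketch of a DSL built on plain Ruby blocks. The domain (discount rules) and every name in it (rule, when_order_total_exceeds, give_discount_of, discount_for) are invented for this example; a real DSL would grow out of the vocabulary of your own domain.

```ruby
# Hypothetical domain: discount rules for a store.
Rule = Struct.new(:name, :min_total, :percent_off)

class RuleBuilder
  def initialize(name)
    @rule = Rule.new(name, 0, 0)
  end

  # Domain words; each one just records a value on the rule.
  def when_order_total_exceeds(amount)
    @rule.min_total = amount
  end

  def give_discount_of(percent)
    @rule.percent_off = percent
  end

  def build
    @rule
  end
end

RULES = []

# Entry point of the DSL: runs the block with the builder as self,
# so the domain words can be written without a receiver.
def rule(name, &block)
  builder = RuleBuilder.new(name)
  builder.instance_eval(&block)
  RULES << builder.build
end

# Applies the best matching rule, or no discount if none match.
def discount_for(total)
  best = RULES.select { |r| total > r.min_total }.map(&:percent_off).max || 0
  total * best / 100.0
end

# This part is what someone writing in the DSL would see.
rule "big spender" do
  when_order_total_exceeds 100
  give_discount_of 10
end

puts discount_for(200) # prints 20.0
puts discount_for(50)  # prints 0.0
```

The block at the bottom is the only part a rule author needs to read or write; the classes above it are the plumbing that translates those words into ordinary Ruby calls.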

So where to write what?

How much to put in your Java layer depends on the situation – some people (like me) prefer to write as little as possible in such a static language. Others like the static typing and the associated tool support, and prefer to put more code here. When merely shooting for a little bit of dynamism through the DSL engine in the layers above, most of the code could be written in Java, and a fluent API in the dynamic language could be enough. When shooting for rapid feature turn-around and a lot more flexibility, most of the code could be in the DSL or in the external dynamic language.

The answer to this question really depends on things like the requirements, team structure, skill-sets, and other such situational factors.

OK, so where’s the code?

My intention with this post was to stay at a high level – and talk of how one could structure an application to make it possible to embed a scripting engine into it, and to give an overview of the possibilities this creates. In subsequent posts, I will talk about how actual DSLs can be created, tested, and also how a team might be structured around it.

From freedom languages to Java (and back again?)

Recently, I started working with Java again. I had little choice in the matter, really, since it’s for an upcoming product in the mobile application development tools space, and I’m focusing on the Java Micro Edition area. I’ll have more to say on this skunkworks initiative another time. (Watch this space, and all that).

I’ve been using mostly other languages in the recent past: Ruby, a little Python, Common Lisp, a little Haskell. But mostly Ruby. And it seems that having stayed Java-free for about two years has made me really rusty. That apart, this time around Java started out feeling annoying, and morphed into being mostly amusing. In an annoying way. The question I constantly have to tell myself to refrain from asking (out loud, and to the world in general) is – “Why can’t the bloody runtime figure this out for itself? Why do I have to type this extra (vestigial) code?”

In any case, working on the new Java Micro Edition platform again is nice – reminds me of a project I did at college – and of simpler times… 🙂

Ruby, managing global variables, and dynamic scope

Everyone knows that global variables are bad. However, they are quite unavoidable. Java’s System.out is an example.

Globals sometimes offer a certain kind of flexibility. They offer a way to tweak the behavior of the entire system (or a subset of it). For instance, one can redirect System.out to a different stream (into a file, say) and capture all the messages a program spits out.

This kind of stuff works nicely. Except when someone else goes and changes the same variable to something else in another part of the code, and you’ve no idea where.

What is needed is something like optional dynamic scope, so that the globals can be used when needed, but can be managed better. After all, from the above example, what seems to be needed is a way for a piece of code to say – for my purposes, and for all code that runs when I’m called, I want the value of these globals to be _something_, and when I return, these globals should be reset.

This can be done manually, by saving the existing value of a global before setting it to something else, and then resetting it back when the code block completes running. Perl and Common Lisp have had a mechanism to do this type of stuff for a long time, built into the language.
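The manual version is easy enough to sketch in Ruby, using $stdout as the global (Ruby’s rough counterpart to System.out). The helper name with_stdout and the greet method are made up for this example.

```ruby
require 'stringio'

# Runs the block with $stdout temporarily pointing at `stream`,
# then puts the previous value back, even if the block raises.
def with_stdout(stream)
  saved = $stdout
  $stdout = stream
  yield
ensure
  $stdout = saved
end

def greet
  puts "hello" # writes to whatever $stdout currently is
end

captured = StringIO.new
with_stdout(captured) { greet }
puts "captured: #{captured.string.inspect}" # prints captured: "hello\n"
```

Note that greet knows nothing about the redirection; it just picks up whatever value the global has while it runs, which is exactly the dynamic-scope behavior described above.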

Here’s a hacky (and probably naive) way to implement this in Ruby, to illustrate how this might work –


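module Let
  # Temporarily rebinds the named instance variables for the duration
  # of the block, then restores the previous values.
  def let(bindings, &block)
    old_bindings = capture_existing_bindings_for(bindings)
    block.call
  ensure
    # Restore the old values even if the block raises.
    rehydrate_old_bindings_with(old_bindings) if old_bindings
  end

  # Remembers the current values, then installs the new ones.
  def capture_existing_bindings_for(bindings)
    old_bindings = {}
    bindings.each do |k, v|
      old_bindings[k] = instance_variable_get("@#{k}")
      create_binding k, v
    end
    old_bindings
  end

  def create_binding(var_name, value)
    instance_variable_set("@#{var_name}", value)
  end

  def rehydrate_old_bindings_with(old_bindings)
    old_bindings.each do |k, v|
      create_binding k, v
    end
  end
end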

And the way you would use this, would look something like this –


# Assumes the module above is saved as let_module.rb next to this file.
require_relative 'let_module'
include Let

@num_var = 1
@char_var = 'a'

class Car
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

@obj_var = Car.new("hyundai")

def do_something
  puts "num_var is " + @num_var.to_s + ", char_var is '" + @char_var.to_s + "', obj_var is " + @obj_var.name
  puts "returning"
end

do_something
puts "changing num_var to 2, char_var to b, obj_var to 'kia'"
let :num_var => 2, :char_var => 'b', :obj_var => Car.new("kia") do
  do_something
  puts "changing num_var to 3, char_var to c, obj_var to 'toyota'"
  let :num_var => 3, :char_var => 'c', :obj_var => Car.new("toyota") do
    do_something
  end
  do_something
end
do_something

And when run, would produce this –

num_var is 1, char_var is 'a', obj_var is hyundai
returning
changing num_var to 2, char_var to b, obj_var to 'kia'
num_var is 2, char_var is 'b', obj_var is kia
returning
changing num_var to 3, char_var to c, obj_var to 'toyota'
num_var is 3, char_var is 'c', obj_var is toyota
returning
num_var is 2, char_var is 'b', obj_var is kia
returning
num_var is 1, char_var is 'a', obj_var is hyundai
returning

This could be one way to control unruly global variables in Ruby.

Turing complete languages and productivity

I was having a conversation with a colleague about varying levels of programmer productivity – specifically around language choice. As usual, one of the things that came up was the idea of Turing completeness. The important point to note is that the idea of Turing equivalence is completely separate from expressibility, efficiency, convenience, or well – productivity.

My take on it is this – for simple systems and programs – this idea of languages being equivalent approaches triviality. For non-trivial systems, however, the story changes quite a bit. If the system is sufficiently complex (and needs to be indistinguishable from magic), the best way to do it in a lower-level but “equivalent” language like C or Java is to build abstractions like crazy. Including building an interpreter for a higher level language.

I think that is the only way one can truly claim that languages are equal. Java is “equal” to Ruby because you can use JRuby to write all the code of consequence. C is infinitely superior to C++ when you embed Lua in it. Or Blub is better than Smalltalk because you implemented a Lisp in the former.

After all, what does it mean to write a system in a particular language when you can throw in a completely different layer of abstraction through a library like this? When these lines are blurring so much, and programming models and paradigms are becoming incestuous, is it not time to stop having silly debates about languages and start building systems with what works best?

P.S. – Of course, to be able to do the right thing, you need to have a toolbox with the right tools. Start collecting them, now!

More on Lisp syntax, and language extensions

Following my recent post on the topic, I thought of one more thing that the syntax of Lisp allows you to do. Because Lisp is homoiconic, and because code manipulation is so simple (it’s all lists), layering on “language extensions” becomes possible. For example, if Betty Programmer realizes that OO is a great way to design and write code but that Lisp by itself doesn’t provide an OO facility (there are no “class” constructs, no inheritance etc.) – she doesn’t need to despair.

She can write code to add an OOP system to the language. Yes, this means Lisp really blurs the distinction between the language designer and the programmer. In other words, while it’s fairly obvious that Lisp is very well suited to writing DSLs, it is also possible to fundamentally extend the language itself – like adding an OO system, or pattern-matching, or logic-programming (à la Prolog).

Now, obviously, I’m not proficient enough yet to do anything of this sort. But, as I said before, it is my intention to learn 🙂

Lisp. A language where being meta is something worth thinking about.

Lisp syntax, and when code is data

Like I said earlier, my friend Ravi introduced me to Lisp several years ago, but it has taken me many years to really want to learn it well enough. I’ll write about my reasons in another post. In any event, at the beginning of this year, I started to pick it up again, promising myself that I’d be serious. This time. So far so good.

I think I’ve started to grok one of the core ideas of Lisp. I had always read that the syntax of Lisp was one of its strengths. And I had always struggled with that idea, knowing it was important, yet quite unable to really put my finger on it. I think I’m closer to it today.

If you had to create a programming language to write programs that wrote programs (as in, say DSLs) – what design choices would you make?

For one thing, you’d have to be able to generate and manipulate (walk parse-trees, compare and transform nodes etc.) code as though it were just another data-structure. Right? OK, so the code that was being generated would look like and behave like data.

You would then create an EVAL function that could run the generated code. Maybe your generated code would in turn produce generated code, so to keep things easy and simple, your language syntax would be the same as that of the generated language. In other words, you’d end up with a homoiconic programming language. Finally, you would bootstrap your language processor and arrive at your final metacircular evaluator.

To recap, this language would have syntax that looked and behaved like data and because of it could generate and manipulate that data, which itself could be code. What would this data structure look like? One obvious choice for this is a tree (because of parse-trees). If you think about it, XML is just like a tree. But it’s kludgey. What we want is something like XML but without all the cruft. For example –


<program>

  <function name="add_to_stock">
    <param name="counter"/>
    <call_function name="increment">
      <argument value="counter"/>
    </call_function>
  </function>

  <function name="remove_from_stock">
    <param name="item"/>
    <call_function name="decrement_from_stock_file">
      <argument value="item"/>
    </call_function>
  </function>
</program>

The syntax is truly disgusting, but useful – especially if you need to programmatically generate it. Let’s now try to make it easier for humans, too. I’m going to remove the ‘program’ tag, because all this stuff is code. I’m going to then change from XML tags to simple ‘(’ and ‘)’ without the names – and make an assumption – the first word that appears is always a function call. Except for define – which I’ll use to denote a definition for a function. I’ll also lose the XML attribute names, assuming that words that follow the function name are always parameters (unless it’s a code block itself – which would get evaluated first). So, we’re left with –




(define (add_to_stock counter)
    (increment counter))

(define (remove_from_stock item)
    (decrement_from_stock_file item))



Where does this leave us?

It’s essentially the same structure as the XML, just modified a bit and with a few rules thrown in. Importantly, it’s still as easy to generate as XML. It’s just a list of lists of words. As in, a unit of code in this format always starts and ends with parentheses, and they enclose zero or more symbols or other lists.
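To show how little work the generation side takes, here is a small sketch in Ruby, with nested Arrays of Symbols standing in for the lists. The helper name to_sexp is my own invention for this example.

```ruby
# Code as plain data: a nested Array of Symbols.
program = [
  [:define, [:add_to_stock, :counter],
    [:increment, :counter]],
  [:define, [:remove_from_stock, :item],
    [:decrement_from_stock_file, :item]]
]

# Turns a nested Array into parenthesised text: an Array becomes a
# list, and anything else is printed as a word.
def to_sexp(node)
  if node.is_a?(Array)
    "(" + node.map { |child| to_sexp(child) }.join(" ") + ")"
  else
    node.to_s
  end
end

program.each { |form| puts to_sexp(form) }
# prints:
# (define (add_to_stock counter) (increment counter))
# (define (remove_from_stock item) (decrement_from_stock_file item))
```

Because the program is just an Array, the usual tools (map, select, recursion) are all that is needed to walk or transform it before printing.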

In fact, a language that was good at list processing and had an eval function would probably do a really good job with this stuff!
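Just to convince myself that the eval side is not magic either, here is a toy evaluator for the same list-of-lists format, again sketched in Ruby with nested Arrays. It is nowhere near a real Lisp: it only understands numbers, variables, define, and function calls, and the names evaluate and env are simply what I chose for this example.

```ruby
# A toy evaluator for code written as nested Arrays.
# Supports numbers, variables, :define for functions, and calls.
def evaluate(node, env)
  case node
  when Numeric
    node
  when Symbol
    env.fetch(node) # variable lookup; raises KeyError if unbound
  when Array
    head, *rest = node
    if head == :define
      (name, *params), body = rest
      # The function keeps a reference to env, so it can see
      # definitions added later (including itself).
      env[name] = lambda do |*args|
        evaluate(body, env.merge(params.zip(args).to_h))
      end
      name
    else
      # The first word is always a function call.
      fn = evaluate(head, env)
      fn.call(*rest.map { |arg| evaluate(arg, env) })
    end
  else
    raise ArgumentError, "cannot evaluate #{node.inspect}"
  end
end

# The primitives come from the host language.
env = {
  :+ => ->(a, b) { a + b },
  :* => ->(a, b) { a * b }
}

evaluate([:define, [:double, :x], [:*, :x, 2]], env)
puts evaluate([:double, [:+, 1, 2]], env) # prints 6
```

The whole thing is a recursive walk over a list, which is rather the point: once code is a list, evaluating it is just more list processing.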