s-expressions

Amit Rathore blogs about software development

Posts Tagged ‘code’

Pretty-printing in Clojure logs

Posted by Amit Rathore on January 15, 2013

Cross-posted from Zolo Labs.

Logging is an obvious requirement when it comes to being able to debug non-trivial systems. We’ve been thinking a lot about logging, thanks to the large-scale, distributed nature of the Zolodeck architecture. Unfortunately, when logging larger Clojure data-structures, I often find some kinds of log statements a bit hard to decipher. For instance, consider a map m that looked like this:

When you log things like m (shown here with println for simplicity), you may end up needing to understand this:

Aaugh, look at that second line! Where does the data-structure begin and end? What is nested, and what’s top-level? And this problem gets progressively worse as the size and nested-ness of such data-structures grow. I wrote this following function to help alleviate some of the pain:

Remember to include clojure.pprint. And here’s how you use it:

That’s it, really. Not a big deal, not a particularly clever function. But it’s much better to see this structured and formatted log statement when you’re poring over log files in the middle of the night.
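For reference, here is a minimal sketch of what a pp-str-style helper might look like. It assumes the function simply captures the output of clojure.pprint/pprint as a string and puts a newline in front of it; the sample map m and the log message are made up for illustration.

```clojure
(require '[clojure.pprint :refer [pprint]])

(defn pp-str
  "Returns x pretty-printed as a string. The leading newline makes the
  structure start on its own line in the log (an assumption of this sketch)."
  [x]
  (str "\n" (with-out-str (pprint x))))

;; Illustrative data only
(def m {:user   {:name "Ann" :emails ["ann@example.com" "ann@work.example.com"]}
        :status :active
        :tags   #{:friend :colleague}})

(println "Saving contact:" (pp-str m))
```

Because pp-str returns a plain string, it can be passed to whatever logging call you already use.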

Just note that you want to use this sparingly. I first modified things to make ALL log statements automatically wrap everything being logged with pp-str: it immediately halved the performance of everything. pp-str isn’t cheap (actually, pprint isn’t cheap). So use with caution, where you really need it!
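One way to keep that cost under control is to skip the formatting entirely unless it is needed. The sketch below assumes a hypothetical debug-enabled? switch (a real system would ask its logging framework for the current level instead) and a hypothetical debug-pp macro, so the pretty-printing, and even the evaluation of the argument, only happens when debugging is on.

```clojure
(require '[clojure.pprint :refer [pprint]])

;; Hypothetical switch, for illustration only
(def debug-enabled? (atom false))

(defmacro debug-pp
  "Pretty-prints x after label, but only when debugging is enabled.
  When it is disabled, x is not evaluated at all."
  [label x]
  `(when @debug-enabled?
     (println ~label (str "\n" (with-out-str (pprint ~x))))))

;; Prints nothing, and pays nothing for pprint, while the switch is off
(debug-pp "contact:" {:name "Ann" :emails ["ann@example.com"]})

;; Flip the switch to see the formatted output
(reset! debug-enabled? true)
(debug-pp "contact:" {:name "Ann" :emails ["ann@example.com"]})
```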

Now go sign up for Zolodeck!

Why Java programmers have an advantage when learning Clojure

Posted by Amit Rathore on December 20, 2012

Cross-posted from Zolo Labs.

There is a spectrum of productivity when it comes to programming languages. I don’t really care to argue how much more productive dynamic languages are… but for those who buy that premise and want to learn a hyper-productive language, Clojure is a good choice. And for someone who has a Java background, Clojure becomes the best choice. Here’s why:

  • Knowing Java – obviously useful: class-paths, class loaders, constructors, methods, static methods, standard libraries, jar files, etc. etc.
  • Understanding of the JVM – heap, garbage collection, perm-gen space, debugging, profiling, performance tuning, etc.
  • The Java library ecosystem – what logging framework to use? what web-server? database drivers? And on and on….
  • The Maven situation – sometimes you have to know what’s going on underneath lein
  • Understanding of how to structure large code-bases – Clojure codebases also grow
  • OO Analysis and Design – similar to figuring out what functions go where

I’m sure there’s a lot more here, and I’ll elaborate on a few of these in future blog posts.

I’ve not used Java itself in a fairly long time (we’re using Clojure for Zolodeck). Even so, I’m getting a bit tired of some folks looking down on Java devs, when I’ve seen so many Clojure programmers struggle from not understanding the Java landscape.

So, hey Java Devs! There are so many good reasons to learn Clojure (it’s a modern LISP with a full macro system, it’s a functional programming language, it has concurrency semantics, and it sits on the JVM with access to all those libraries) that it makes a lot of sense for you to look at it. And if you’re already looking at something more dynamic than Java itself (say Groovy, or JRuby, or something similar), why not just take that extra step to something truly amazing? Especially when you have such an incredible advantage (your knowledge of the Java ecosystem) on your side already?

Clojure utility functions – part II

Posted by Amit Rathore on December 3, 2012

Cross-posted from Zolo Labs

 

Here’s another useful function I keep around:

Everyone knows what map does, and what concat does. And what mapcat does. 

The pmapcat function does what mapcat does, except that by using pmap underneath, it does so in parallel. The semantics are a bit different: first off, the first parameter is called batches (and not, say, coll, for collection). This means that instead of passing in a simple collection of items, you have to pass in a collection of collections, where each is a batch of items.

Correspondingly, the parameter f is the function that will be applied not to each item, but to each batch of items.

Usage of this might look something like this:

One thing to remember is that pmap uses the Clojure send-off pool to do its thing, so the usual caveats apply with respect to how f should behave.
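As a rough illustration of the description above, here is a minimal sketch of such a function. It assumes pmapcat just runs f over each batch with pmap, concatenates the results, and forces them with doall; the squaring example and the batch size of 2 are made up.

```clojure
(defn pmapcat
  "Like mapcat, but f is applied to each batch in parallel via pmap.
  f receives a whole batch and must return a collection."
  [f batches]
  (->> batches
       (pmap f)
       (apply concat)
       doall))

;; Split the work into batches, then let f process one batch at a time
(def squares
  (pmapcat (fn [batch] (map #(* % %) batch))
           (partition-all 2 [1 2 3 4 5])))
;; squares is (1 4 9 16 25)
```

pmap preserves the order of its input, so the results come back in batch order even though the batches are processed in parallel.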

Clojure utility functions – part I

Posted by Amit Rathore on November 22, 2012

Cross-posted from Zolo Labs.

 

I kept using an extra line of code for this, so I decided to create the following function:

https://gist.github.com/4123530

Another extra line of code can similarly be removed using this function:

https://gist.github.com/4123531

Obviously, the raw forms (i.e. using doseq or map) can be far more powerful when used with more arguments. Still, these simple versions cover 99.9% of my use-cases.

I keep both these (and a few more) in a handy utils.clojure namespace I created for just such functions.
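Going by the description (thin wrappers over doseq and map that each save a line), the two helpers might look roughly like this. The names doeach and domap are assumptions made for this sketch, as are the exact argument orders.

```clojure
(defn doeach
  "Calls f on every item of coll purely for side effects; returns nil."
  [f coll]
  (doseq [x coll]
    (f x)))

(defn domap
  "Like map, but eager: the whole result is realized before returning."
  [f coll]
  (doall (map f coll)))

;; Side effects only
(doeach println ["a" "b" "c"])

;; Eagerly computed results
(def incremented (domap inc [1 2 3]))
;; incremented is (2 3 4)
```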

Make it right, then make it fast

Posted by Amit Rathore on November 5, 2012

Cross-posted to Zolo Labs

Alan Perlis once said: A Lisp programmer knows the value of everything, but the cost of nothing.

I re-discovered this maxim this past week. 

As many of you may know, we’re using Clojure, Datomic, and Storm to build Zolodeck. (I’ve described my ideal tech stack here). I’m quite excited about the leverage these technologies can provide. And I’m a big believer in getting something to work whichever way I can, as fast as I can, and then worrying about performance and so on. I never want to fall under the evil of premature optimization and all that… In fact, on this project, I keep telling my colleague (and everyone else who listens) how awesome (and fast) Datomic is, and how its built-in cache will make us stop worrying about database calls. 

A function I wrote (that does some fairly involved computation involving relationship graphs and so on) was taking 910 seconds to complete. Yes, more than 15 minutes. Of course, I immediately suspected the database calls, thinking my enthusiasm was somehow misplaced or that I didn’t really understand the costs. As it turned out, Datomic is plenty fast. And my algorithm was naive and basically sucked… I had knowingly glossed over a lot of functions that weren’t exactly performant, and when called within an intensive set of tight loops, they added up fast.

After profiling with Yourkit, I was able to bring down the time to about 900 ms. At nearly a second, this is still quite an expensive call, but certainly less so than when it was ~ 1000x slower earlier.

I relearnt that tools are great and can help in many ways, just not in making up for my stupidity :-)

demonic v0.1 – utilities for Datomic

Posted by Amit Rathore on April 18, 2012

Announcing a new blog: blog.zolodeck.com. Just wrote the first post on my work with Datomic. I’ve put some of it into a project called demonic, and hopefully, you’ll find it of some use!

calling recur from catch or finally

Posted by Amit Rathore on August 15, 2010

Clojure doesn’t have automatic tail-call optimization, but it does support explicit tail recursion through the recur form. Let’s take a quick look at how it’s used. Consider a function that sums up a list of numbers to an accumulator:

(defn add-numbers [acc numbers]
  (if (empty? numbers)
    acc
    (add-numbers (+ acc (first numbers)) (rest numbers))))

Let’s ignore all the ways this can be done without the silly implementation above. Here it is in action:

user> (add-numbers 10 (range 10))
55

And here’s the problem with it:

user> (add-numbers 10 (range 10000))
; Evaluation aborted.
No message.
  [Thrown class java.lang.StackOverflowError]

The reason, of course, is that being a self-recursive function that calls itself explicitly, it blows the stack. Clojure has a way to get around this, via the recur form:

(defn add-numbers [acc numbers]
  (if (empty? numbers)
    acc
    (recur (+ acc (first numbers)) (rest numbers))))

And here is proof that it works:

user> (add-numbers 10 (range 10000))
49995010

Now, let’s look at a case where one might want to recurse from inside a catch or finally block. A use-case is a function like connect-to-service, that must retry the connection if the service is unavailable. An easy way to implement it is to catch the exception thrown when the attempt at connecting fails, then wait a few seconds, and try again by recursing. Here’s a contrived example of a function that recurs from catch:

(defn catch-recurse [n i]
  (try
    (if (> n i)
      (/ i 0)
      n)
    (catch Exception e
      (recur n (inc i)))))

The problem, of course, is that Clojure complains:

Cannot recur from catch/finally
  [Thrown class java.lang.UnsupportedOperationException]

So what to do? One way is to make the call explicitly, and hope that it won’t blow the stack:

(defn catch-recurse [n i]
  (try
    (if (> n i)
      (/ i 0)
      n)
    (catch Exception e
      (catch-recurse n (inc i)))))

It could blow the stack, though, depending:

user> (catch-recurse 100 1)
100
user> (catch-recurse 10000 1)
; Evaluation aborted.
No message.
  [Thrown class java.lang.StackOverflowError]

Whether this blows the stack depends on your situation; if you know the recursion stays shallow, the explicit call may be OK. Here’s a way to avoid the problem completely, using trampoline. First, a minor change to catch-recurse:

(defn catch-recurse [n i]
  (try
    (if (> n i)
      (/ i 0)
      n)
    (catch Exception e
      #(catch-recurse n (inc i)))))

Notice that in the case of an exception, we return a thunk. Now, to use our new function:

user> (trampoline catch-recurse 100 1)
100
user> (trampoline catch-recurse 10000 1)
10000

And there you have it. The common use-case of trampoline is to handle mutually recursive functions where recur isn’t useful. It checks to see if the return value of the function it’s passed in is another function. If so, it calls it. It repeats the process until a non-function value is returned, which it then itself returns. Very useful!
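To make the mutual-recursion case concrete, here is a small sketch. The functions my-even? and my-odd? are made up for illustration; each returns a thunk instead of calling the other directly, so trampoline can bounce between them without growing the stack.

```clojure
(declare my-odd?)

(defn my-even? [n]
  (if (zero? n)
    true
    #(my-odd? (dec n))))    ; return a thunk rather than recursing

(defn my-odd? [n]
  (if (zero? n)
    false
    #(my-even? (dec n))))

;; trampoline keeps calling the returned thunks until it gets a non-function
(def result (trampoline my-even? 100000))
;; result is true
```

Note that this trick relies on the final answer not being a function itself, since trampoline would try to call it.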

Medusa 0.1 – a supervised thread-pool for Clojure futures

Posted by Amit Rathore on June 8, 2010

Clojure comes with two kinds of thread-pools – a bounded thread-pool for CPU-bound operations, and one for IO-bound operations that grows as needed. The bounded thread-pool is used every time an action is sent to an agent via the send function. The unbounded thread-pool is used (for instance) every time an action is sent to an agent using the send-off function. Futures also run on this unbounded thread-pool.

Sometimes, however, you might need a third option. This is the case where you don’t want an unbounded pool of threads that grows so much that the system runs out of resources trying to juggle the sheer number of threads. This might happen (say) if you were using send-off to handle incoming requests for IO-bound operations. Under normal circumstances, such a system might perform in an acceptable manner. If the request load were to spike, however, you could quickly create a larger-than-manageable number of threads.

What you need in such a case is a separate thread-pool for IO operations – one that has more threads than the thread-pool for CPU-bound operations, but is still bounded, so that it only grows to a certain size, and then any further requests get queued. Luckily, Clojure allows you to seamlessly use underlying Java libraries.
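As a minimal sketch of that idea, using only java.util.concurrent: a fixed-size pool runs at most a set number of tasks at once and queues the rest. The names io-pool and submit-io are made up for this example, and the 3x multiplier is just an illustrative choice.

```clojure
(import '(java.util.concurrent Executors ExecutorService Future))

;; Illustrative sizing: a few threads per core for IO-bound work
(def io-pool-size (* 3 (.availableProcessors (Runtime/getRuntime))))

;; A fixed pool never grows past io-pool-size; extra tasks wait in its queue
(def ^ExecutorService io-pool (Executors/newFixedThreadPool io-pool-size))

(defn submit-io
  "Submits the zero-argument function f to the bounded pool and returns
  a java.util.concurrent.Future for its result."
  [^Callable f]
  (.submit io-pool f))

;; Submit a handful of tasks, then block for their results
(def results
  (let [futures (doall (for [n (range 5)]
                         (submit-io (fn [] (* n n)))))]
    (mapv (fn [^Future fut] (.get fut)) futures)))
;; results is [0 1 4 9 16]

;; Let the pool's threads exit once the queued work is done
(.shutdown io-pool)
```

This only gives you the bounding and queueing; it does nothing about tasks that run for too long.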

Medusa is a bounded, supervised thread-pool. A supervisor function runs alongside the thread-pool and it monitors the running tasks. If they take more than a specified amount of time, they are evicted. If the thread-pool is fully occupied, Medusa will queue all further tasks submitted and will run each task as soon as a thread becomes available. The Medusa thread-pool size is thrice the number of cores available to the JVM. In future versions, this number will be configurable.

Here it is in action -

(use 'org.rathore.amit.medusa.core)

(start-supervisor)

(defn new-task [id sleep-seconds]
  (println (System/currentTimeMillis) "| Starting task" id "will sleep for" sleep-seconds)
  (Thread/sleep (* 1000 sleep-seconds))
  (println (System/currentTimeMillis) "| Done task" id))

(defn run-tasks [n]
  (println "Will submit" n "jobs")
  (dotimes [i n]
    (medusa-future i #(new-task i (* 5 (inc i))))))

(run-tasks 20)

The output is -

Will submit 20 jobs
1276068494442 | Starting task 0 will sleep for 5
1276068494448 | Starting task 1 will sleep for 10
1276068494449 | Starting task 2 will sleep for 15
1276068494449 | Starting task 3 will sleep for 20
1276068494451 | Starting task 4 will sleep for 25
1276068494451 | Starting task 5 will sleep for 30
1276068499447 | Done task 0
1276068499448 | Starting task 6 will sleep for 35
1276068504448 | Done task 1
1276068504448 | Starting task 7 will sleep for 40
1276068509448 | Done task 2
1276068509448 | Starting task 8 will sleep for 45
1276068514448 | Done task 3
1276068514449 | Starting task 9 will sleep for 50
1276068519450 | Starting task 10 will sleep for 55
1276068523547 | Starting task 11 will sleep for 60
1276068523548 | Starting task 13 will sleep for 70
1276068523547 | Starting task 12 will sleep for 65
1276068523548 | Starting task 14 will sleep for 75
1276068523548 | Starting task 15 will sleep for 80
1276068523549 | Starting task 16 will sleep for 85
1276068533547 | Starting task 17 will sleep for 90
1276068533547 | Starting task 18 will sleep for 95
1276068543547 | Starting task 19 will sleep for 100

Notice that only the first few tasks complete, because the pre-emption time is 20 seconds. The supervisor pre-empts the rest out of the thread-pool, since they have all been coded to take longer than that (simulated above by the sleeps). This pre-emption frees threads so the remaining tasks can start, as the timestamps of the log messages show, and leaves the Medusa thread-pool ready for more work. This fulfills the requirement of a bounded thread-pool with supervised pre-emption of tasks that take too long.

Here’s the thread-usage when the program starts, and the supervisor has started:

[Image: Thread-pool when the program starts]

Here’s the thread-usage when the tasks complete:

[Image: Thread-pool when the tasks complete]

The semantics are still not of the standard Clojure futures – currently, Medusa “futures” only handle side-effects. A next step would be to give them the same future semantics so that they return the result of their computation – that will come in the next version.

The project is hosted on github, as usual – http://github.com/amitrathore/medusa.

conjure – simple mocking and stubbing for Clojure unit-tests

Posted by Amit Rathore on January 24, 2010

Siva and I were pairing on a unit-test that involved writing something to HBase. When Siva said that mocking the call to the save-to-hbase function would make testing easier (a simple thing using JMock, he said), I decided to write a quick mocking utility for Clojure.

Then later, we realized that we wanted to go one step further. The row-id that was used as the key to the object in HBase was generated using system-time. That meant that even if we wanted to confirm that the object was indeed saved, we had no way of knowing what the row-id was. One solution to such a problem is to inject the row-id in (instead of being tightly coupled to the function that generated the row-id). Instead, I wrote a stubbing utility that makes this arbitrarily easy to do.

So here they are – mocking and stubbing – packaged up as the conjure project on github.

The set up

Imagine we had the following functions -

(defn xx [a b]
  10)

(defn yy [z]
  20)

(defn fn-under-test []
  (xx 1 2)
  (yy  "blah"))

(defn another-fn-under-test []
  (+ (xx nil nil) (yy nil)))

Also imagine that we had to test fn-under-test and another-fn-under-test, and we didn’t want to have to deal with the xx or yy functions. Maybe they’re horrible functions that open connections to computers running Windoze or something, I dunno.

Mocking

Here’s how we might mock them out -

(deftest test-basic-mocking
  (mocking [xx yy]
    (fn-under-test))
  (verify-call-times-for xx 1)
  (verify-call-times-for yy 1)
  (verify-first-call-args-for xx 1 2)
  (verify-first-call-args-for yy "blah"))

Pretty straightforward, eh? You just use the mocking macro, specifying all the functions that need to be mocked out. Then, within the scope of mocking, you call your functions that need to be tested. The calls to the specified functions will get mocked out (they won’t occur), and you can then use things like verify-call-times-for and verify-first-call-args-for to ensure things worked as expected.

Stubbing

As mentioned in the intro to this post, sometimes your tests need to specify values to be returned by the functions being mocked out. That’s where stubbing comes in. Here’s how it works -

(deftest test-basic-stubbing
  (is (= (another-fn-under-test) 30))
  (stubbing [xx 1 yy 2]
    (is (= (another-fn-under-test) 3))))

So that’s it! Pretty simple. Note how within the scope of stubbing, xx returns 1 and yy returns 2. Now, for the implementation.

Implementation

The code is almost embarrassingly straight-forward. Take a look -

(ns org.rathore.amit.conjure.core
  (:use clojure.test))

(def call-times (atom {}))

(defn stub-fn [function-name return-value]
  (swap! call-times assoc function-name [])
  (fn [& args]
    (swap! call-times update-in [function-name] conj args)
    return-value))

(defn mock-fn [function-name]
  (stub-fn function-name nil))

(defn verify-call-times-for [fn-name number]
  (is (= number (count (@call-times fn-name)))))

(defn verify-first-call-args-for [fn-name & args]
  (is (= args (first (@call-times fn-name)))))

(defn verify-nth-call-args-for [n fn-name & args]
  (is (= args (nth (@call-times fn-name) (dec n)))))

(defn clear-calls []
  (reset! call-times {}))

(defmacro mocking [fn-names & body]
  (let [mocks (map #(list 'mock-fn %) fn-names)]
    `(binding [~@(interleave fn-names mocks)]
       ~@body)))

(defmacro stubbing [stub-forms & body]
  (let [stub-pairs (partition 2 stub-forms)
        fn-names (map first stub-pairs)
        stubs (map #(list 'stub-fn (first %) (last %)) stub-pairs)]
    `(binding [~@(interleave fn-names stubs)]
       ~@body)))

It’s just an hour or so of work, so it’s probably rough, and certainly doesn’t support more complex features of other mocking/stubbing libraries. But I thought the simplicity was enjoyable.
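One caveat if you try this on a newer Clojure: since version 1.3, binding only works on vars marked ^:dynamic, so the macros above will throw for ordinary functions. Here is a minimal sketch of the same stubbing idea using with-redefs (available since 1.3) instead. The namespace, the recording-stub helper, and the keyword keys are assumptions of this sketch, not part of conjure.

```clojure
(ns example.stubbing-sketch
  (:require [clojure.test :refer [deftest is run-tests]]))

(defn xx [a b] 10)
(defn yy [z] 20)

(defn another-fn-under-test []
  (+ (xx nil nil) (yy nil)))

;; Maps a key to the vector of argument lists it was called with
(def calls (atom {}))

(defn recording-stub
  "Returns a function that records its arguments under key k
  and always returns return-value."
  [k return-value]
  (fn [& args]
    (swap! calls update-in [k] (fnil conj []) args)
    return-value))

(deftest stubbing-with-redefs
  (reset! calls {})
  (is (= 30 (another-fn-under-test)))
  ;; with-redefs swaps the root values of xx and yy for the body only
  (with-redefs [xx (recording-stub :xx 1)
                yy (recording-stub :yy 2)]
    (is (= 3 (another-fn-under-test))))
  (is (= 1 (count (@calls :xx))))
  (is (= '(nil nil) (first (@calls :xx))))
  ;; the originals are back once the body exits
  (is (= 30 (another-fn-under-test))))
```

Unlike binding, with-redefs changes the var for every thread while the body runs, so it is best kept to tests.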

frumiOS – a simple object-system for Clojure

Posted by Amit Rathore on December 10, 2009

I’ve nearly stopped blogging, because all my spare time goes into writing Clojure in Action. But I was a bit bored this weekend, and wrote this little library that can be used to write traditional looking Object-Oriented (TM) code in Clojure.

Why would you do that, when you can use a rifle-oriented programming style instead? Think of it like using the rifle as a club… On the other hand, the implementation makes plenty of use of closures and macros, so it is probably a rifle-oriented program :-)

The implementation is hosted on github, in a project called frumios. And if you really want to see it now, here it is:

(ns org.rathore.amit.frumios.core)

(declare new-object find-method)

;; A class is a closure that dispatches on the message it receives.
(defn new-class [class-name parent methods]
  (let [klass ((comp resolve symbol name) class-name)]
    (fn [command & args]
      (cond
        (= :parent command) parent
        (= :name command) klass
        (= :method-names command) (keys methods)
        (= :methods command) methods
        (= :new command) (new-object klass)
        (= :method command)
          (let [[method-name] args]
            (find-method method-name methods parent))
        :else (throw (RuntimeException. (str "Unknown message: " command)))))))

(def OBJECT (new-class :org.rathore.amit.frumios.core/OBJECT nil {}))

;; Marked ^:dynamic because Clojure 1.3+ only lets binding rebind dynamic vars.
(def ^:dynamic this)

;; An object is a closure over its own state; unknown messages are
;; treated as method calls, with this bound to the receiving object.
(defn new-object [klass]
  (let [state (ref {})]
    (fn thiz [command & args]
      (cond
        (= :class command) klass
        (= :set! command) (let [[k v] args]
                            (dosync (alter state assoc k v))
                            nil)
        (= :get command) (let [[key] args]
                           (state key))
        :else (let [method (klass :method command)]
                (if method
                  (binding [this thiz]
                    (apply method args))))))))

;; Looks in the class's own methods first, then walks up the parent chain.
(defn find-method [method-name instance-methods parent-class]
  (let [method (instance-methods method-name)]
    (or method
        (if-not (= #'org.rathore.amit.frumios.core/OBJECT parent-class)
          (find-method method-name (parent-class :methods) (parent-class :parent))))))

(defn parent-class-spec [sexprs]
  (let [extends-spec (filter #(= :extends (first %)) sexprs)
        extends (first extends-spec)]
    (if (empty? extends)
      'org.rathore.amit.frumios.core/OBJECT
      (do
        (if-not (= 1 (count extends-spec))
          (throw (RuntimeException. "defclass only accepts a single extends clause")))
        (if-not (= 2 (count extends))
          (throw (RuntimeException. "the extends clause only accepts a single parent class")))
        (last extends)))))

;; Turns (method sound [] "grr") into {:sound (fn sound [] "grr")}.
(defn method-spec [sexpr]
  (let [name (keyword (second sexpr))
        remaining (next sexpr)]
    {name (conj remaining 'fn)}))

(defn method-specs [sexprs]
  (let [method-spec? #(= 'method (first %))
        specs (filter method-spec? sexprs)]
    ;; Only the (method ...) forms are converted, so an (:extends ...)
    ;; clause is not mistaken for a method.
    (apply merge (map method-spec specs))))

(defmacro defclass [class-name & specs]
  (let [parent-class-symbol (parent-class-spec specs)
        this-class-name (keyword class-name)
        fns (method-specs specs)]
    `(def ~class-name
        (new-class ~this-class-name (var ~parent-class-symbol) ~(or fns {})))))

Here are some examples -

(ns frumios-spec)

(use 'org.rathore.amit.frumios.core)

(defclass animal
  (method sound []
    "grr")

  (method say-something []
    (str (this :sound) ", I say!"))

  (method move []
    "going!"))

(defclass cat
  (:extends animal)

  (method sound []
    "meow"))

There, that defines a simple class hierarchy. Let’s examine these classes -

frumios-spec> (cat :parent)
#'frumios-spec/animal

frumios-spec> (animal :parent)
#'org.rathore.amit.frumios.core/OBJECT

frumios-spec> (animal :method-names)
(:move :say-something :sound)

frumios-spec> (cat :method-names)
(:sound)

Now, let’s define a couple of instances -

(def a (animal :new))
(def c (cat :new))

What can we do with these instances? Let’s explore -

frumios-spec> (c :class)
#'frumios-spec/cat

frumios-spec> (c :set! :name "Mr. Muggles")
nil

frumios-spec> (c :get :name)
"Mr. Muggles"

That’s the basic stuff, how about calling methods?

frumios-spec> (a :move)
"going!"

frumios-spec> (a :sound)
"grr"

frumios-spec> (c :sound)
"meow"

Notice how cat overrides the sound method. OK, how about a method that calls another method? It calls for the this keyword. Here it is in action -

frumios-spec> (a :say-something)
"grr, I say!"

frumios-spec> (c :say-something)
"meow, I say!"

Notice how in the second call, (this :sound) resolved itself to the overridden sound method in the cat class. That’s subtype polymorphism, common to languages such as Java and Ruby. We could use it to implement something like the template pattern. We can do fairly arbitrary things with frumiOS -

(defclass person
  (method greet [visitor]
    (println "Hi" visitor ", I'm here!"))

  (method dob []
    (str "I was born on " (this :get :birth-date)))

  (method age []
    2)

  (method experience [years]
    (str years " years"))

  (method bio []
    (let [msg (str (this :dob) ", and have " (this :experience (this :age)) " of experience.")]
      (println msg))))

Let’s play with it -

frumios-spec> (def kyle (person :new))
#'frumios-spec/kyle

frumios-spec> (kyle :greet "rob")
Hi rob , I'm here!
nil

The bio method makes two calls using the this construct, one nested inside the other. It works as expected -

frumios-spec> (kyle :set! :birth-date "1977-01-01")
nil

frumios-spec> (kyle :bio)
I was born on 1977-01-01, and have 2 years of experience.
nil

So there it is. I’m sure it doesn’t do lots of stuff a real object-system does. But at 70 lines of Clojure code, you can’t expect a whole lot more. Silly as this is, I had fun writing it!
