Cross-posted from Zolo Labs
Here’s another useful function I keep around:
(defn pmapcat [f batches]
  (->> batches
       (pmap f)
       (apply concat)
       doall))
Everyone knows what map does, and what concat does. And what mapcat does.
The definition of pmapcat above does what mapcat does, except that by using pmap underneath, it does so in parallel. The semantics are a bit different, though: the second parameter is called batches (and not, say, coll, for collection). This means that instead of passing in a simple collection of items, you have to pass in a collection of collections, where each inner collection is a batch of items.
Correspondingly, the parameter f is the function that will be applied not to each item, but to each batch of items.
Usage of this might look something like this:
(defn handle-batch [batch]
  (blah blah…))

(->> coll
     (partition-all n)
     (pmapcat handle-batch))
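To make that concrete, here is a small sketch you can paste into a REPL. The square-batch handler, the batch size of 3, and the input range are made up for illustration; pmapcat is repeated so the snippet stands on its own.

```clojure
;; pmapcat, repeated from above so this snippet is self-contained.
(defn pmapcat [f batches]
  (->> batches
       (pmap f)
       (apply concat)
       doall))

;; Hypothetical batch handler: receives a whole batch, returns a collection of results.
(defn square-batch [batch]
  (map #(* % %) batch))

(def result
  (->> (range 10)
       (partition-all 3)          ; batches: (0 1 2) (3 4 5) (6 7 8) (9)
       (pmapcat square-batch)))

;; pmap preserves input order, so the result matches the sequential mapcat.
(println result)
(println (= result (mapcat square-batch (partition-all 3 (range 10)))))
```

Because pmap returns results in input order, the concatenated output lines up with what a plain mapcat over the same batches would give; only the work inside each batch handler runs in parallel.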
One thing to remember is that pmap uses the Clojure send-off pool to do its thing, so the usual caveats apply to how f should behave.
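If you want to see this for yourself, the sketch below has each batch report the thread it ran on. The thread-of-batch handler is hypothetical and exists only for this demonstration; the exact thread names printed depend on your Clojure version and environment.

```clojure
;; pmapcat, repeated from above so this snippet is self-contained.
(defn pmapcat [f batches]
  (->> batches
       (pmap f)
       (apply concat)
       doall))

;; Hypothetical handler: ignores the batch contents and returns the name
;; of the thread that processed it, wrapped in a vector so concat works.
(defn thread-of-batch [batch]
  [(.getName (Thread/currentThread))])

(def caller  (.getName (Thread/currentThread)))
(def workers (pmapcat thread-of-batch (partition-all 2 (range 8))))

(println "caller: " caller)
(println "workers:" (distinct workers))
```

The batches are handled on pool threads rather than on the calling thread, which is why f should not rely on running in the caller's thread.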