Facebook and the MMVP

Cross-posted to Zolo Labs.

I didn’t realize just how true my previous post about the minimum MVP might end up being. When we first conceived of Zolodeck, we figured we’d build it the Lean Way. In that process, we came up with an MVP – and this included support for LinkedIn, Twitter, email inboxes (Gmail, etc.), and possibly Facebook.

Now, however, given our time constraints, it looks like our first (pre) alpha (is that even a real term?) users are going to start with Facebook only! Every time we started planning our initial public alpha release, our burn-down projection would show a date horrendously far into the future. So we kept cutting back, and we think we have something now. Something that we’d want to use ourselves.
 
I’d never have believed myself if I’d told me we’d launch with just Facebook. But in some ways, this narrowed scope is actually better (apart from being doable in a reasonable amount of time). It will give us a slice of functionality to test, all the way from one end to the other. Our other sources of data (email, LinkedIn, Twitter, etc.) are extremely valuable, but they can easily be added later. Starting with just Facebook means we can focus on the value-add of our app, rather than on merely collecting data from all over.
 
In any case, we’ll see what our users will say 🙂  

Love and the minimum MVP

Cross-posted to Zolo Labs.

The start of a new project is always exciting… there’s the anticipation of awesomeness that can make one giddy 🙂 It’s the potential of the idea, the opportunity of the clean slate, the possibility of applying all learnings to date, and the hope of doing things right.

Of course, a business is more than just the product. Somewhere, there’s got to be money involved. And for that, there have to be people who get enough value that they actually pay you. And you have to take in more than you spend to provide the service…

Here’s the thing, though: if your product is in a “new” space (hasn’t everything that could be invented already been invented?), then how do you know if you’re building something anyone cares about? This is where the minimum viable product (the much talked-about MVP) comes in: the idea is that you build just enough for your early users to try out the product and give you critical feedback that helps you understand the market and its needs. Obviously, these early adopters are not the majority, so the data gathered needs to be adjusted accordingly.

So far, so good – the MVP approach makes a lot of sense for new products. For Zolodeck, for instance, we came up with what we thought was a fairly thinly sliced product. What we didn’t count on was that since Zolodeck is still a nights-and-weekends project (a labor of love, if you will), we don’t quite have enough time to build the MVP of our dreams. So we’re now building a minimum MVP. We’re calling it, wait for it… an MMVP.

Given that we’re so resource constrained right now, Lean ideas really help. And the MMVP takes this to an extreme – we have to choose just those 1-2 things (maybe just 1 thing) that are the most important to test right now. We really need to think through prioritization of our feature road-map, and decide what to build next. The prioritization needs to take into account everything we can think of – product/market fit, business and technical risks, and so on.

We’ll post updates here as we make progress. For now, we’re hoping to get something into the hands of our first dozen users in the next couple of weeks. Stay tuned!

Designer-driven vs. metrics-driven product design

Cross-posted to Zolo Labs.

An interesting question cropped up over dinner the other night – does metrics-driven design have any place in the world of designer-driven product design? For instance, Apple is famous for not using focus groups and so on, and instead relies on their own processes and practices to develop their amazing products.

I’m a huge believer in Lean methods for developing products and building a business (I’ve written about this before, and spoken about it as well). I’m also a huge Apple fan-boy (yes, I wept when Jobs passed). And obviously, Apple does a lot of things right when it comes to developing products.

So how do these two ways of driving product design mesh with each other?

My point of view is that they’re not mutually exclusive. While there’s no substitute for vision (and hence for designer-driven product creation), metrics can serve as an excellent sanity check. They’re really guard-rails, guide-posts, a safety-net, a compass – whatever analogy you want to use. They can be especially useful when the product is in a “new category” and there may not be established usage patterns or user behaviors to draw from.

If you buy into the “iterations are awesome, fail-fast, learn-improve-repeat” process, then you really can’t do it without metrics. And, of course, you need the right metrics, but that can be a post for another day.

The Power of Habits

I just finished reading The Power of Habit: Why We Do What We Do in Life and Business.

It’s a great book. It has three parts – about us as individuals and our “habit loops”, about the organizations and communities we live in and how habits affect them, and finally about how our habits affect the societies we live in.

While a big part of the book explains that habits are hackable (as in, by understanding the habit loop, we can influence our habits), the most interesting part to me was how product development can benefit from this understanding. All good products become habits – think about this in the context of your favorite products.

I think this is an important takeaway for startups – the trick is to crack the habit loop for the product you’re building. If you can make your product a habit for your users, you’re golden.

Using messaging for scalability

I don’t understand why this whole debate about scalability is so focused on the choice of programming language.

At Runa, we provide a service to our immediate customers, who are all online retailers. Their websites integrate with our services in a couple of ways (via JavaScript inserts and via traditional HTTP calls) – and each time a shopper browses any of these online market-places, our services get called. Lots of times – a dozen times per retailer page-view, on average.

Our load also grows in large step-functions. Each time we get a new customer, our services get called by another set of users (as all our customers’ customers get added on). We need our services to keep up with all this demand.

Finally, let’s take an example of one of the things we provide via these services – dynamic pricing for products. Obviously, the responses to such calls need to come back in real time, since the price has to be shown next to the product being browsed.

So we have both a load concern and a response-time concern – as most typical web-scale services do.

Our approach has been to favor simplicity – very early on we introduced a messaging backbone.

[Diagram: messaging_for_scalability.png – our architecture, with RabbitMQ as the messaging backbone]

Although this picture looks a bit more complex than it would without the RabbitMQ portion, the messaging backbone has allowed us to do a few things –

  • For service calls that don’t need an immediate response (for instance, when client websites send us data that we analyze later, or when we need to send an email notification), we just drop a message onto an appropriate queue. An asynchronous processor picks it up and does the work (a quick sketch of this follows the list).
  • For services that do need immediate responses, the call is handled synchronously by one of the application servers.
  • For services that are computationally heavier, we split the request into pieces and run them on separate machines. A simple federation model coordinates the partial responses, which are combined to return the result to the requester.
  • With the above in place, and by ensuring that each partial service handler is completely stateless, we can scale by simply adding more machines. Transparently. The same is true for all the asynchronous processors.
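
To make that asynchronous path a little more concrete, here’s a minimal sketch (not our production code) of what dropping a message onto a queue looks like from Clojure, using the RabbitMQ Java client via interop. The queue name and JSON payload are invented for illustration, and the exact client API may differ across versions.

(ns example.enqueue
  (:import [com.rabbitmq.client ConnectionFactory Channel]))

;; Open a connection to the broker and grab a channel to publish on.
(defn connect-channel [host]
  (let [factory (doto (ConnectionFactory.) (.setHost host))]
    (.createChannel (.newConnection factory))))

;; Drop a JSON payload onto a named queue; an asynchronous processor
;; picks it up later. The queue is declared durable so messages
;; survive a broker restart.
(defn enqueue [^Channel channel queue-name ^String json-payload]
  (.queueDeclare channel queue-name true false false nil)
  (.basicPublish channel "" queue-name nil (.getBytes json-payload "UTF-8")))

;; Usage (hypothetical queue name and payload):
;; (enqueue (connect-channel "localhost") "email-notifications" "{\"user-id\": 42}")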

As an aside – I’ve written a mini-framework to help with those last couple of bits. It is called Swarmiji – and once it works (well) in production, I will open-source it. It may be useful for other folks who want a sort of message-passing based parallel programming system in Clojure.

So anyway, with this messaging system in place, we can do a lot of things with individual services. Importantly, we can try different approaches to implementing them – including trying different technologies and even languages.

IMHO, you can’t really have a conversation about scalability without the context of how much load you’re talking about. When you’re just getting started with the system, this is moot – you don’t want to optimize early at all – and you can (mostly) ignore the issue altogether.

When you get to the next level – asynchronicity can get you far – and this side-steps the whole discussion of whether the language you’re using is efficient or not. Ruby is as good a choice as any in this scenario – Python or Java or most other languages will leave you in the same order of magnitude of capability. The key here is high-level design, and not doing too many obviously stupid things in the code.

When you do get crazy large (like Google or whatever), then you can start looking at ways to squeeze more out of each box – and here using the right language may become an important issue. I know Google even has a bunch of people working on compilers – purely to squeeze more out of the generated code. When you have tens of thousands of computers in your server farms, a 2% increase is probably worth a lot of money.

Still, this choice-of-language issue should be treated with caution. I’m personally of the opinion that programmer productivity is more important than raw language efficiency. That is one reason why we’re writing most of our system in a Lisp (Clojure) this time around. The other thing is that these days runtimes are not what they used to be – Ruby code (and Python and Clojure and Scala) can all run on the JVM. So you get the benefits of all those years of R&D basically for free.

Finally, a word on our messaging system. We’re using RabbitMQ – and it is *fast*, reliable and scalable. It truly is the backbone of our system – allowing us to pick and choose technology, language, and approach. It’s also a good way to minimize risk – a sub-system can be replaced with no impact to the rest of the pieces.

Anyway – take all of this with many grains of salt – after all, we’re not the size of Google (or even Twitter) – so what do I know?

HBase: On designing schemas for column-oriented data-stores

At Runa, we made an early decision to optimize. 🙂

We decided not to go down the route of scaling a traditional database system (the one in question at the time was MySQL), and instead to use a column-oriented data-store that was built for scaling. After a cursory evaluation (which was done by me, and involved a few hours of browsing the web (pseudo research), checking email (not research), and instant-messaging with a couple of buddies (definitely not research)), we picked HBase.

OK, so the real reason was that we knew a few companies were using it, and that there seemed to be a little more of a community around it. In fact, there are a couple of HBase user-groups out here in the bay area. Someone recently asked me about CouchDB, and I’ve noticed a lot more buzz about it now that I’m watching for it… I have no real reason why we didn’t pick it instead. They’re both Apache projects… maybe we’ll have use for CouchDB someday.

HBase schemas:

So anyway, we picked HBase. Now, having done so, and also having spent all my life using relational databases, I had no idea how to begin using HBase – the very first question sort of stumped me – what should the schema look like?

Here’s what I figured out – and I’m sure people will flame me for this – you don’t really need to design one. You just take your objects, extract their data using something like protocol buffers, YAML/JSON, or even (ugh!) XML – and then stick that into a single column in an HBase table.

Our simplistic HBase mapper:

We use JSON – but we’re doing something slightly different. Think of our persistable object as being represented by a fairly shallow JSON object – like so:

:painting => {
  :trees => [ "cedar", "maple", "oak"],
  :houses => 4,
  :cars => [
    {:make => 'honda', :model => 'fit', :license => 'ah12001'},
    {:make => 'toyota', :model => 'yaris', :license => 'xb34544'}],
  :road => {:name => '101N', :speed => 65}
}

OK, bizarre example. Still – the way you would persist this in HBase with the common approach would be to create an HBase table that had a simple schema – a single column family with one column – and you’d store the entire JSON message as text in each row. Maybe you would compress it. The row-id could be something that makes sense in your domain – I’ve seen user-ids, other object-ids, even time-stamps used. We use time-stamps for one of our core tables.
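
As a rough sketch, the blob approach might look something like this from Clojure via the HBase Java client. This is not our actual code – the table name, column family, and qualifier (“paintings”, “data”, “json”) are invented, and the client API varies a bit across HBase versions.

(ns example.hbase-blob
  (:import [org.apache.hadoop.hbase HBaseConfiguration]
           [org.apache.hadoop.hbase.client HTable Put]
           [org.apache.hadoop.hbase.util Bytes]))

;; Store an entire JSON document as text in a single column of one row.
;; The row-id is whatever makes sense in your domain (a user-id, an
;; object-id, a time-stamp, etc.).
(defn store-json-blob [table-name row-id ^String json-text]
  (let [table (HTable. (HBaseConfiguration/create) ^String table-name)
        put   (doto (Put. (Bytes/toBytes ^String row-id))
                (.add (Bytes/toBytes "data") (Bytes/toBytes "json")
                      (Bytes/toBytes json-text)))]
    (.put table put)))

;; e.g. (store-json-blob "paintings" (str (System/currentTimeMillis)) json-string)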

The variation we’re using is that instead of storing the whole thing as a ‘blob’, we split it into columns – so that the tree-like structure of the JSON message is represented as a flat set of key-value pairs. Each value can then be stored under a column-family:column-name.

The example above would translate to the following flat structure –

{
  "road:name" => "101N",
  "road:speed" => 65,
  "houses:" => 4,
  "trees:cedar" => "cedar",
  "trees:maple" => "maple",
  "trees:oak" => "oak",
  "cars_model:ah12001" => "fit",
  "cars_model:xb34544" => "yaris",
  "cars_make:ah12001" => "honda",
  "cars_make:xb34544" => "toyota"
}

Now it is ready to be inserted into a single row in HBase. The table we’re inserting into has the following column-families –

"road"
"houses"
"trees"
"cars_model"
"cars_make"

This can then be read back and converted into the original object easily enough.
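
For the curious, here’s a simplified sketch of the kind of flattening function involved (not the actual helper library mentioned later in this post). It handles scalars, nested maps, and vectors of strings; vectors of maps (like :cars above) need you to designate which key supplies the qualifier, so that case is left out to keep the sketch short.

;; Flatten a shallow nested map into "column-family:qualifier" keys.
;; e.g. {:road {:name "101N"}} becomes {"road:name" "101N"},
;; {:houses 4} becomes {"houses:" 4}, and
;; {:trees ["cedar"]} becomes {"trees:cedar" "cedar"}.
(defn flatten-for-hbase [m]
  (reduce
    (fn [acc [k v]]
      (let [family (name k)]
        (cond
          (map? v)    (into acc (for [[ck cv] v] [(str family ":" (name ck)) cv]))
          (vector? v) (into acc (for [x v] [(str family ":" x) x]))
          :else       (assoc acc (str family ":") v))))
    {}
    m))

;; (flatten-for-hbase {:houses 4
;;                     :trees ["cedar" "maple" "oak"]
;;                     :road {:name "101N" :speed 65}})
;; => {"houses:" 4, "trees:cedar" "cedar", "trees:maple" "maple",
;;     "trees:oak" "oak", "road:name" "101N", "road:speed" 65}  (keys in no particular order)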

Column-names as data:

Something to note here is that the column-family:column-name pairs (which together constitute the full ‘column name’) now contain data themselves. An example is ‘cars_model:ah12001’, which is the name of the column whose value is ‘fit’.

Why do we do this? Because we want the entire object-graph flattened into one row, and this allows us to do that. (The row-id still serves as the ‘primary key’, as described earlier.)

The thing to remember here is that in HBase (and others like it), each row can have any number of columns (constrained only by the column-families defined) and rows can populate values for different columns, leaving others blank. Nulls are stored for free. Coupled with the fact that HBase is optimized for millions of columns, this pattern of data-modeling becomes feasible. In fact you could store tons of data in any row in this manner.

Final comments:

If you’re always going to operate on the full object graph, then you don’t really need to split things up this way – you could use one of the options described above (XML, JSON, or protocol buffers). But if different clients of this data typically need only a subset of the object graph (say only the car models, or only the speed limits of roads, or some such combination), then with this data-splitting approach they can load up only the required columns.
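
As a sketch of that last point, reading back only a couple of columns from a row might look something like this – again via the HBase Java client, with invented table and column names, and the usual caveat that the API varies by version.

(ns example.hbase-get
  (:import [org.apache.hadoop.hbase HBaseConfiguration]
           [org.apache.hadoop.hbase.client HTable Get]
           [org.apache.hadoop.hbase.util Bytes]))

;; Read back only the road name and speed for one row, ignoring the
;; other column-families stored in that row. Assumes the values were
;; written as strings.
(defn road-info [table-name ^String row-id]
  (let [table  (HTable. (HBaseConfiguration/create) ^String table-name)
        get-op (doto (Get. (Bytes/toBytes row-id))
                 (.addColumn (Bytes/toBytes "road") (Bytes/toBytes "name"))
                 (.addColumn (Bytes/toBytes "road") (Bytes/toBytes "speed")))
        result (.get table get-op)]
    {:name  (Bytes/toString (.getValue result (Bytes/toBytes "road") (Bytes/toBytes "name")))
     :speed (Bytes/toString (.getValue result (Bytes/toBytes "road") (Bytes/toBytes "speed")))}))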

This idea of using columns as data takes a little getting used to. I’m sure there are better ways of using such powerful data-stores – but this is the approach we’re taking right now, and it seems to be working for us so far.

Clojure HBase helper:

I’ve written some Clojure code that helps with this transformation back and forth: hash-tables into and out of HBase. It is open-source – and once I clean it up, I will write about it.

Hope this stuff helps – and if I’ve described something stupid, then please leave a correction. Thanks!

P. S. – I met a Googler today who said BigTable (and by inference HBase) is not a column-oriented database. I think that is incorrect – at least according to Wikipedia. I read it on the Internet, so I must be right 🙂

Startup logbook – v0.1 – Clojure in production

Late last night, around 3 AM to be exact, we made one of our usual releases to production (unfortunately, we’re still in semi-stealth mode, so I’m not talking a lot about our product yet – I will in a few weeks). There was nothing particularly remarkable about the release from a business point of view. It had a couple of enhancements to functionality, and a couple of bug-fixes.

What was rather interesting, at least to the geek in me, was that it was something of a technology milestone for us. It was the first production release of a new architecture that I’d been working on over the past 2-3 weeks. Our service contains several pieces, and until this release we had a rather traditional architecture – we were using Ruby on Rails for all our back-end logic (e.g. real-time analytics and pricing calculations) as well as the user-facing website, and we were using MySQL for our storage requirements.

This release has seen our production system undergo some changes. Going forward, the Rails portion will continue to do what it was originally designed for – supporting a user-facing web UI. The run-time service is now a combination of Rails and a cluster of Clojure processes.

When data needs to be collected (for further analytics down the line), the Rails application simply drops JSON messages on a queue (we’re using the excellent Erlang-based RabbitMQ), and one of a cluster of Clojure processes picks each one up, processes it, and stores it in an HBase store. Since each message can result in several actions that need to be performed (and these are mostly independent), Clojure’s safe concurrency helps a lot. And since it’s a Lisp, the code is just so much shorter than the equivalent Ruby could ever be.
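
To give a flavor of what one of those Clojure processes does, here’s a minimal consumer sketch (not our actual code) using the RabbitMQ Java client via interop. The queue name is made up, and the parsing/HBase writing is left to a handler function you’d supply.

(ns example.consumer
  (:import [com.rabbitmq.client Channel DefaultConsumer]))

;; Subscribe to a queue and hand each JSON message to handle-fn,
;; which would parse it and write the results to HBase.
(defn start-consumer [^Channel channel queue-name handle-fn]
  (.queueDeclare channel queue-name true false false nil)
  (.basicConsume channel queue-name false
    (proxy [DefaultConsumer] [channel]
      (handleDelivery [consumer-tag envelope props body]
        (handle-fn (String. ^bytes body "UTF-8"))
        ;; acknowledge only after the message has been handled
        (.basicAck channel (.getDeliveryTag envelope) false)))))

;; Usage (hypothetical): (start-consumer some-channel "analytics-events" process-and-store!)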

Currently, all business rules, analytics, and pricing calculations are still being handled by the Ruby/Rails code. Over the next few releases we’re looking to move away from this – and instead let the Clojure processes do most of the heavy lifting.

We’re hoping we can continue to do this in a highly incremental fashion, as the risk of trying to get it perfect the first time is too high. We absolutely need the feedback that only production can give us – so that we’re more confident we’re building the thing right.

The last few days have been the most fun I’ve had in any job so far. Besides learning Clojure and Hadoop/HBase pretty much at the same time (and getting paid for doing that!), it has also been a great opportunity to do things as incrementally as possible. I strongly believe in set-based engineering methods, and that is the approach I took here as well – currently, we haven’t turned off the Ruby/Rails/MySQL system – it is doing essentially the same thing that the new Clojure/HBase system is doing. We’re looking to build the rest of the system out (incrementally), ensure it works (and repeat until it does) – before turning off the (nearly legacy) Ruby system.

I’ll keep posting as I progress on this front. Overall, we’re very excited about the possibilities that using Clojure represents – and hey, if it turns out to be a mistake, we’ll throw it out instead.