Analytics

Monday, September 23, 2013

Erlang and Riak

Today I learned that Riak, a distributed database based on Amazon's Dynamo, is written in Erlang. This is really cool because I have some experience with Apache Cassandra, another Dynamo-like distributed database, which is written in Java. When I worked with Cassandra, I encountered multiple concurrency bugs that were difficult to track down and sometimes quite insidious. That's not to fault the authors of Cassandra; it is a complex, multithreaded system with a lot of moving parts, which is exactly the type of code that led me to write about the difficulty of multithreading. Because Erlang processes communicate by message passing rather than through shared memory, Riak should be able to escape some of the reliability issues that arise when dealing with shared, mutable state in Java. I am particularly curious whether Riak is more stable than Cassandra, but this information seems hard to come by.
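To make the contrast with shared, mutable state a bit more concrete, here is a minimal sketch of the message-passing style in Python. It is only an illustration of the general idea, not how Riak or Erlang is implemented: the names `run_counter`, `mailbox`, and `owner` are made up for this example, and Python threads stand in very loosely for Erlang processes. A single owner thread holds the counter and everyone else sends it messages, so the counter itself never needs a lock.

```python
import queue
import threading


def run_counter(num_senders: int, messages_per_sender: int) -> int:
    """Count increments with one owner thread and a mailbox (hypothetical example)."""
    mailbox: queue.Queue = queue.Queue()
    result: queue.Queue = queue.Queue(maxsize=1)

    def owner() -> None:
        # Only this thread ever reads or writes `count`, so the counter
        # is not shared mutable state and needs no lock of its own.
        count = 0
        while True:
            message = mailbox.get()
            if message == "stop":
                break
            count += 1  # every other message is an "increment"
        result.put(count)

    def sender() -> None:
        # Senders never touch the counter; they only post messages.
        for _ in range(messages_per_sender):
            mailbox.put("increment")

    owner_thread = threading.Thread(target=owner)
    owner_thread.start()

    senders = [threading.Thread(target=sender) for _ in range(num_senders)]
    for t in senders:
        t.start()
    for t in senders:
        t.join()

    # "stop" is queued only after every sender has finished, so it is
    # guaranteed to arrive after all of the increments.
    mailbox.put("stop")
    owner_thread.join()
    return result.get()


if __name__ == "__main__":
    print(run_counter(num_senders=4, messages_per_sender=1000))
```

The queue is still shared between threads, but it is a small, well-tested piece of synchronization with a narrow interface, which is a much easier thing to reason about than locks scattered across application state.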

I did stumble across this blog post evaluating the choice of Erlang for Riak five years later. The folks at Basho (the company behind Riak) seem to be happy with it, and their rationale for choosing Erlang lines up pretty well with my expectations. Two of the more interesting benefits they describe are that "a 'many-small-heaps' approach to garbage collection would make it easier to build systems not suffering from unpredictable pauses in production" and "the ability to inspect and modify a live system at run-time with almost no planning or cost." The former hits home again with my Cassandra experience, as Java garbage collection on a 16GB heap is not nearly as predictable as one would hope, and the latter is certainly valuable for fast-paced startups and the (in)famous "move fast, break things" philosophy. In the end, I am just happy to see that there are startups out there that have sought alternatives to concurrent programming hell and built some neat systems as a result.
