Monthly Archives: April 2011

LMAX: Going Slow to Go Fast


Architects of scalable web systems must learn from the transactional world: stock exchanges and other financial creatures of mass data processing. For example, LMAX attaining 100K+ TPS at less than 1 ms latency is a remarkable technical feat. Sure, other exchanges like NYSE and LSE manage to achieve higher TPS at lower latencies, but it’s the history behind LMAX that makes it a fascinating object of study: I’ve estimated from public sources that 20 people worked full-time for three years to develop the initial release of LMAX. First, a simple proof of concept reaching 10K TPS was produced. It was followed by a long series of recode-measure-debug cycles that felt more like squeezing and juicing the JVM for speed-ups than real programming. Writing cache-friendly code for an adaptively optimizing JIT virtual machine that gives no control over how data structures are mapped to memory is really hard, because nonlinearities appear everywhere in the typical code optimization process.
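
To make the memory-layout point concrete, here is a minimal sketch in Java. It is not LMAX code: the class and method names are hypothetical, and it only contrasts summing a primitive `long[]`, whose elements sit contiguously in memory, with summing a `List<Long>`, where every element is a reference to a separately allocated object whose placement is decided by the JVM. The timing in `main` is a naive measurement and will vary with the JVM, the garbage collector and the hardware; a harness such as JMH would be the right tool for real numbers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names are assumptions, not taken from LMAX.
class LayoutSketch {

    // Elements of a long[] are stored contiguously, so a linear scan
    // walks memory sequentially.
    static long sumPrimitive(long[] values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // Each Long is a separate heap object reached through a reference;
    // the programmer does not control where those objects end up.
    static long sumBoxed(List<Long> values) {
        long total = 0;
        for (Long v : values) {
            total += v; // auto-unboxing on every element
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        long[] primitive = new long[n];
        List<Long> boxed = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            primitive[i] = i;
            boxed.add((long) i);
        }

        // Warm-up runs give the JIT a chance to compile both methods
        // before anything is timed.
        for (int i = 0; i < 20; i++) {
            sumPrimitive(primitive);
            sumBoxed(boxed);
        }

        long t0 = System.nanoTime();
        long a = sumPrimitive(primitive);
        long t1 = System.nanoTime();
        long b = sumBoxed(boxed);
        long t2 = System.nanoTime();

        System.out.printf("primitive: sum=%d in %d us%n", a, (t1 - t0) / 1_000);
        System.out.printf("boxed:     sum=%d in %d us%n", b, (t2 - t1) / 1_000);
    }
}
```

Both methods compute the same sum. Any difference in elapsed time comes from how the data is laid out and how the JIT treats each loop, which is exactly the kind of effect that is hard to predict from the source code alone.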

It’s the same kind of technical debt Google experienced when it started with a Java codebase, then migrated to a slower Python one, and finally settled on C/C++; or the current technical debt at Twitter, a pure Ruby on Rails product that moved to Java and Scala with phenomenal results.

When frameworks and virtual machines get in the way, it’s the good old wisdom from people like Ken Thompson that illuminates the path to success: “One of my most productive days was throwing away 1000 lines of code.”