This page started out as a collection of Hadoop-related links, but there's a lot more of interest out there than “just” Hadoop, so I'll eventually be adding general Big Data resources as well.

| Date    | Link   | Notes                                                             |
|---------|--------|-------------------------------------------------------------------|
| 2014-06 | Kafka  | A distributed messaging system with commit logs                   |
| 2014-04 | Druid  | A clustered column store able to ingest and query data on the fly |
|         | PigPen | A Clojure library for orchestrating complex Hadoop jobs           |
| 2013-10 | H2O    | An analytics/machine learning toolkit                             |