Daily Archives: August 24, 2009

The NoSQL Movement

Now there is a new community called NoSQL, and its members (150 enthusiastic developers of the cloud era) met for the first time last June in San Francisco.

“Relational databases give you too much. They force you to twist your object data to fit a RDBMS [relational database management system],” said Jon Travis, principal engineer at Java toolmaker SpringSource. NoSQL-based alternatives “just give you what you need,” Travis said.

So get used to names like Dynomite, Cassandra, Voldemort, CouchDB, BigTable, MongoDB, Hypertable, and SimpleDB. These are all members of a club that started with Google’s BigTable and its clones, such as Hypertable (Zvents). Amazon has SimpleDB as well as Dynamo (which it describes not as a database but as a highly available key-value data store). Facebook developed Cassandra. Apache CouchDB is a free, open source, document-oriented database written in the Erlang programming language. MongoDB is an open source, document-oriented database written in C++ that stores collections of JSON-like documents (no rows or columns).
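To make the "no rows or columns" point concrete, here is a minimal sketch in Python. It does not use the API of CouchDB, MongoDB, or any other product named above: the plain dictionary `document_store`, the sample `order`, and the helpers `to_relational_rows` and `load_document` are all hypothetical names invented for illustration. It only contrasts keeping an object as one JSON document with splitting the same object into flat rows.

```python
import json

# A nested object, as an application would naturally hold it (made-up data).
order = {
    "_id": "order-1001",
    "customer": {"name": "Ada", "city": "London"},
    "items": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-7", "qty": 1},
    ],
}

# Document style: the whole object is serialized and stored under one key.
# A plain dict stands in for the store here (an assumption, not a real client).
document_store = {order["_id"]: json.dumps(order)}


def load_document(store, key):
    """Read the document back as the same nested object."""
    return json.loads(store[key])


def to_relational_rows(doc):
    """Relational style: the same object 'twisted' into flat rows.

    One row for a hypothetical orders table, plus one row per item for a
    hypothetical order_items table linked by the order id.
    """
    order_row = (doc["_id"], doc["customer"]["name"], doc["customer"]["city"])
    item_rows = [(doc["_id"], item["sku"], item["qty"]) for item in doc["items"]]
    return order_row, item_rows
```

Reading the document back yields the original nested object in one step, while the relational form needs two tables and a join on the order id to reassemble it.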

So what are the common factors pushing this movement?

  • They can blow through enormous volumes of data. For example, Google’s BigTable with its sister technology MapReduce processes as much as 20 petabytes of data per day. We have not seen this volume in RDBMSs.
  • They run on clusters of cheap PC servers. Google has said that one of BigTable’s bigger clusters manages as much as 6 petabytes of data across thousands of servers. Oracle’s RAC (Real Application Clusters) can get there, but at a much higher cost.
  • They beat performance bottlenecks. The phrase used here is “eventually consistent”: these systems give up some consistency in order to maximize availability and scalability.
  • While conceding that relational databases offer an unparalleled feature set and a rock-solid reputation for data integrity, NoSQL proponents say this can be too much for their needs. Hence the mantra is “no overkill”.
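The “eventually consistent” trade-off in the list above can be illustrated with a toy model. The sketch below is not how Dynamo, Cassandra, or any other store named here is implemented; the `Replica` class, its last-write-wins rule based on a caller-supplied timestamp, and the `demo` function are all assumptions made for illustration.

```python
class Replica:
    """One copy of the data; each key maps to (timestamp, value)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, timestamp):
        # Last write wins: keep the entry with the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Background reconciliation: pull every entry the other replica holds.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


def demo():
    a, b = Replica("a"), Replica("b")
    # The write is accepted by replica a alone, so a stays available
    # even if it cannot reach b at that moment.
    a.write("user:1", "alice", timestamp=1)
    stale = b.read("user:1")      # b has not heard about the write yet
    b.sync_from(a)                # replicas exchange state later
    a.sync_from(b)
    converged = b.read("user:1")  # now both replicas agree
    return stale, converged
```

A reader hitting replica b before the sync sees old data; after the sync both replicas return the same value. That window of disagreement is the consistency being given up in exchange for accepting writes on any node.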

I think this is an exciting development. When people like me worked on RDBMSs in their early days, we could not imagine the kind of scalability and data volumes being talked about now. It’s only natural that new approaches must be invented to handle the demands of the Internet era. Eric Brewer of UC Berkeley floated the idea behind this trade-off in his research work at least 6-7 years ago.

Although the NoSQL movement is not yet a threat to the mainstream database community, this may change in the next 3-4 years.