Column Stores: Teaching an Old Elephant New Tricks

Prof. Michael Stonebraker is a controversial visionary, known for nothing less than Ingres, Postgres, Vertica, Streambase, Illustra, VoltDB, and SciDB, besides being a renowned MIT professor. My recent blog post about Stonebraker’s talk at EPFL (host university to Prof. Martin Odersky, creator of the Scala language and co-founder of Typesafe) triggered a very interesting discussion on reddit.

While Stonebraker is very sure about his obviously biased claims that “The Traditional RDBMS Wisdom is All Wrong”, the bottom line of the reddit discussion included:

Interesting insight into SQL Server’s enhancements can be found in this blog post by Microsoft’s Nicolas Bruno, who challenges the claim that column stores cannot be implemented by a “traditional RDBMS”. As Nicolas Bruno stated, an “Old Elephant” can be taught new tricks. “Traditional RDBMS” have proven able to adapt to long-term trends in the database industry. Their success isn’t based on being particularly fast, or on being particularly well-designed for niche problem domains. It is mainly based on the fact that they are designed according to Codd’s 12 Rules, and are thus extremely flexible in how they separate data interfacing (SQL) from data storage.
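
To make the point about separating the query interface from the storage layout a bit more concrete, here is a minimal Python sketch. It is a toy illustration only, not how SQL Server or any other product implements column stores. The names `RowStore`, `ColumnStore`, `sum_where` and the sample table are all hypothetical. The idea it tries to show: the same logical query can run unchanged against a row-oriented or a column-oriented layout, as long as both expose the same scan interface.

```python
# Hypothetical toy table with the columns (id, region, amount).
COLUMNS = ["id", "region", "amount"]
ROWS = [
    (1, "EU", 10.0),
    (2, "US", 25.0),
    (3, "EU", 5.5),
]


class RowStore:
    """Keeps each record together, like a row-oriented engine."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [tuple(r) for r in rows]

    def scan(self, column):
        # Reading one column still walks over every full row.
        idx = self.columns.index(column)
        return (row[idx] for row in self.rows)


class ColumnStore:
    """Keeps each column together, like a column-oriented engine."""

    def __init__(self, columns, rows):
        self.data = {
            name: [row[i] for row in rows] for i, name in enumerate(columns)
        }

    def scan(self, column):
        # Reading one column touches only that column's values.
        return iter(self.data[column])


def sum_where(store, value_column, filter_column, filter_value):
    """Roughly: SELECT SUM(value_column) FROM t WHERE filter_column = filter_value.

    The function only relies on store.scan(), not on the physical layout.
    """
    pairs = zip(store.scan(value_column), store.scan(filter_column))
    return sum(value for value, key in pairs if key == filter_value)


row_result = sum_where(RowStore(COLUMNS, ROWS), "amount", "region", "EU")
col_result = sum_where(ColumnStore(COLUMNS, ROWS), "amount", "region", "EU")
```

Both calls return the same sum, because `sum_where` never looks behind `scan()`. A real engine hides far more behind that boundary (indexes, compression, a cost-based optimizer), but the separation is the same in spirit.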

A lot of additional insight and further links can be found in these blog posts by Daniel Lemire, who challenged similar claims by Stonebraker four years ago:

6 thoughts on “Column Stores: Teaching an Old Elephant New Tricks”

  1. Just to clarify something important …

    Stonebraker wasn’t making claims for Object DataBase Management Systems (http://en.wikipedia.org/wiki/Object_database). He was arguing for and building Object-Relational DataBase Management Systems (http://en.wikipedia.org/wiki/Object-relational). Measured in terms of their respective features, ODBMSs are almost extinct, and ORDBMSs are the dominant paradigm today.

    Take a glance inside the book we wrote ( http://www.amazon.com/Object-Relational-Edition-Kaufmann-Management-Systems/dp/1558604529 ) and the table of contents lists the principal features of an ORDBMS: user-defined types and functions that extend the SQL declarative query language. At the time ( 1995 ), these were controversial ideas. The more “popular” approach was to add persistence features to Object-Oriented programming languages. And people complained about the technical approach Postgres took. But today, all SQL DBMSs come with the Object-Relational features Stonebraker was advocating. In fact, they’re enshrined in the SQL standard. Hardly a dodgy prediction!

    If you want to highlight something he got wrong, there are better examples. The Mariposa distributed DBMS was a bust (centralized warehouses won) and he’s famous in the groves of the academy for a 1978 paper that basically claimed B-Trees were rubbish.

  2. I’m sorry, but actual usage of those features is really not very common. In my opinion the “Object relational” tag is just leftover marketing hooha from a passing fad. They keep the terminology because there is still an element of “object” = “good” in the culture, but mostly no one really cares.

    And user defined types (Codd called them “domains”) are nothing new to the relational model.

    So I don’t buy that ORDBMSs are the “dominant paradigm” today. It’s not a paradigm, it’s just a buzzword associated with a handful of borderline features that haven’t really had that much real impact. And it wasn’t foresight so much as salesmanship.

    1. So you don’t like the label, “Object-relational”? Fair enough.

      And you’re quite right that “domains”–which I would point out are mentioned only once in the original “Data Banks” paper and scarcely troubled anyone for two decades thereafter–provide the theoretical framework within which this extensibility works. The point is that implementing “domains” properly required some rather major renovation to the way a DBMS is implemented, and it was the Postgres group who figured those details out.

      And just because you haven’t used the features much, it doesn’t mean no one else has. Just search on “user-defined type” and you’ll find the many technical articles and columns that describe these features and explore their use in various DBMSs. You don’t like the word “paradigm”, either? Well – OK. Let’s just say “DBMS features developed as part of the Postgres project now enjoy widespread industry support” and leave it at that.

      All of which suggests, like I said, that this was “Hardly a dodgy prediction!”

      1. If you evangelize something, you don’t get credit for prediction when (some of it, to some degree) comes to pass.
