Typesafe’s Slick is Not About SQL

We have stumbled upon an interesting thread on the Typesafe SLICK user group where Slick was compared to jOOQ. In that thread, Christopher Vogt has made a couple of interesting statements.

But let us have a look at the broader context, first.

Unifying Stuff

Ever since the proclamation of UDDI or RUP, we may think that the U for Unified is a clear and unmistakable indicator for what Joel Spolsky would call architecture astronautitis. If you have missed his hilarious posts on architecture astronauts, they are well worth looking up.

Today, many software vendors are again trying to unify database query languages. Erik Meijer’s LINQ has been the most successful attempt at doing so, so far. But even LINQ doesn’t compare to Codd’s visions, which were about first replacing the whole data model with a rock-solid mathematical theory, and only then thinking about appropriate languages to query such data models.

Flexible vs. rigid abstractions

We believe that unifying query languages to query RDBMS, XML, Objects, and NoSQL is a bad idea because such a unification is subject to either:

  • being a flexible abstraction
  • being a rigid abstraction

If an abstraction is flexible, then the heterogeneous implementation details of the abstracted data stores will inevitably leak into the query language and into your application. You don’t gain much, at the price of adding more layers and boilerplate.

Geek and Poke’s Footprints – Licensed CC-BY 2.0

If an abstraction is rigid, then the unified query language (LINQ, JPQL, etc.) may be concise, but it will inevitably abstract away 80% of all useful features of the underlying data store. LINQ cannot meet the expressivity of SQL. Neither can it match the power of XPath/XQuery/XSLT/XProc, which is the most appropriate tool chain for XML. It may not even match what Java 8 calls the Streams API, which is very likely to become the most appropriate tool chain for objects and collections in Java.

Typesafe’s SLICK is Not About SQL

We’ve already compared Slick with jOOQ in our manual’s preface. Now, Christopher Vogt has made a clear statement about what SLICK is supposed to be and what SQL is:

There are understandable mistakes when your mind is (still) set on SQL. […]

Good luck with jooq and check back if you are ever annoyed by SQL semantics and want Scala back :).

That is only an extract of what Christopher said, of course, and there’s certainly quite a bit of goodness in SLICK. SLICK’s mission is to provide Scala collection semantics when querying databases. That might be a desirable thing to have in the Scala platform, specifically when comparing SLICK with LINQ.

But we’ve mentioned it before on our blog: SQL is not an undesirable language or technology. Like any long-lived technology, SQL has its quirks, and we’ve blogged about those, too, lots of times. SQL is a standard that is constantly evolving and that is here to stay. In our opinion, any technology that operates on RDBMS while aiming to hide SQL or abstract it away completely works against the inevitable trend imposed by the big elephants, who will not let go of their best-selling technologies.

SQL is about 10 years ahead of alternative RDBMS querying methods – most specifically Java, Scala, and C# collection-based ones. T-SQL has now entered the TIOBE Top 10 and is considered by TIOBE to be the language of the year 2013; PL/SQL isn’t too far behind. Don’t fight SQL any longer, embrace it. Or in Christopher Vogt’s words:

Check back with SQL/jOOQ, if you are ever annoyed by the increasing amount of leaky or rigid abstraction created by modern language architects!

Further reading: “Don’t Jump the SQL Ship Just Yet”.

jOOQ Newsletter: December 30, 2013. Happy New Year!

subscribe to the newsletter here

Tweet of the Day

We would like to dedicate this new section of the newsletter to our followers, users, and customers. Here’s Andy Van Den Heuvel, who appreciates jOOQ’s and MyBatis’ (both being “post-JPA” frameworks) return to SQL.

https://twitter.com/andyvdh7/status/416685921288093696

2013 from jOOQ’s perspective

2013 was a very exciting year for jOOQ and jOOQ customers. Not only did we create a company to offer commercial licensing and support, but we have also released the awesome jOOQ 3.0 major release, which brought row-level typesafety to SQL in Java. No competitor product on the market currently lifts SQL to be a first-class citizen in Java as we do. In other words, 2013 was a great year for Java and SQL in general.

After 3.0, great minor releases followed suit:

  • jOOQ 3.1 with support for MariaDB, SQL Server 2012, Oracle 12c, and a new sophisticated SPI to interact with POJO mapping
  • jOOQ 3.2 with new SPIs to interact with record and query rendering lifecycles for advanced SQL transformation. We also introduced a new matcher strategy to allow for fine-grained control over jOOQ’s code generation.

And 2014 won’t be any less exciting, as we’re adding support for another popular commercial database, MS Access, besides being the first and only SQL abstraction framework to implement keyset paging with such an intuitive syntax.
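
For readers unfamiliar with the technique: keyset paging (also known as the “seek method”) replaces OFFSET with a predicate on the last row of the previous page. A minimal sketch in plain SQL, assuming a hypothetical posts(post_id, created_on) table and named placeholders for the last row seen:

-- Fetch the next 20 rows "after" the last row of the previous page.
-- The row value comparison syntax works e.g. in PostgreSQL and MySQL.
SELECT post_id, created_on
FROM posts
WHERE (created_on, post_id) < (:last_created_on, :last_post_id)
ORDER BY created_on DESC, post_id DESC
LIMIT 20

Unlike OFFSET, such a predicate can make use of an index on (created_on, post_id), so fetching page 100 costs about the same as fetching page 1.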

In that sense, let us wish you a Happy New Year and a great start to 2014!

Convince Your Manager

It is easy for technical folks like software developers and architects to understand why jOOQ adds value to your projects. You’re the workers, and as such, you need the right tools. While there are free alternatives (like Gimp), you’re a database professional who prefers to use the market leader (like Photoshop).

But the market leader costs money, and your manager needs to be convinced that the money is spent wisely. No problem, we told ourselves, and created this simple presentation that can help you convince your managers:

The above presentation compares jOOQ with the industry standard means of accessing RDBMS:

  • JDBC (low-level)
  • JPA (high-level)

This comparison will then be used to explain the return on investment (ROI) that your team will get by using jOOQ.

Upcoming Events

After another welcoming and interesting talk at the JUGS in Berne, we are happy to announce that our talk will now also be hosted in Hamburg, in January 2014.

Stay informed about 2014 events on www.jooq.org/news.

SQL Zone – LATERAL joins

Few people may be aware of the SQL:1999 standard LATERAL keyword, which can be applied to table expressions in the FROM clause, in order to allow them to cross-reference other tables within the same FROM clause. This is extremely useful when joining non-scalar table-valued functions to each record of physical tables.

T-SQL (SQL Server and Sybase) users have known this feature for a long time under the vendor-specific CROSS APPLY and OUTER APPLY join keywords. With the recent releases of PostgreSQL 9.3 and Oracle 12c, other databases can now finally also profit from this SQL goodie.
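
To illustrate, here is a sketch of what such a query might look like in PostgreSQL 9.3 syntax, assuming hypothetical author and book tables:

-- For each author, fetch their three most recently published books.
-- The LATERAL derived table may reference a.author_id from the same
-- FROM clause, which a regular derived table cannot do.
SELECT a.name, b.title
FROM author a
CROSS JOIN LATERAL (
  SELECT title
  FROM book
  WHERE book.author_id = a.author_id
  ORDER BY book.published_on DESC
  LIMIT 3
) b

In SQL Server, the same query would be written with CROSS APPLY instead of CROSS JOIN LATERAL.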

Read more about LATERAL joins and how this will be supported in jOOQ 3.3 in our blog post.

MySQL Bad Idea #666

MySQL… We’ve blogged about MySQL before. Many times. We’ve shown plenty of bad ideas implemented in MySQL on this blog.

But this beats everything. Check out this Stack Overflow question. It reads: “Why Oracle does not support ‘group by 1,2,3’?”. At first, I thought this user might have been confused because SQL allows for referencing columns by (1-based!) index in ORDER BY clauses:

SELECT first_name, last_name
FROM customers
ORDER BY 1, 2

The above is equivalent to ORDER BY first_name, last_name. The indexes 1, 2 refer to columns from the projection. This might be useful every now and then to avoid repeating complex column expressions, although it is probably a bit risky as you can change ordering semantics when adding a column to the SELECT clause.
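
To see the risk, consider this hypothetical change, where a column is added at the beginning of the projection:

-- After adding "company" to the projection, ORDER BY 1, 2 silently
-- changes meaning from (first_name, last_name) to (company, first_name)
SELECT company, first_name, last_name
FROM customers
ORDER BY 1, 2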

But this user wanted to use the same syntax for the GROUP BY clause. And this actually works in MySQL! Check out the following query:

SELECT a, b
FROM (
  SELECT 'a' a, 'b' b, 'c' c UNION ALL
  SELECT 'a'  , 'c'  , 'c'   UNION ALL
  SELECT 'a'  , 'b'  , 'd'
) t
GROUP BY 1, 2
ORDER BY 2

See an example on SQLFiddle

The above yields…

| A | B |
|---|---|
| a | b |
| a | c |

But what would this even mean? According to our in-depth explanation of SQL clauses, the projection (SELECT clause) is logically evaluated after the GROUP BY clause. In other words, the columns defined in the SELECT clause are not yet in scope of the GROUP BY clause. Hence, the only reasonable semantics of column indexes would be the index from the table source t. But that’s not the case. Check out this alternative query:

SELECT b, c
FROM (
  SELECT 'a' a, 'b' b, 'c' c UNION ALL
  SELECT 'a'  , 'c'  , 'c'   UNION ALL
  SELECT 'a'  , 'b'  , 'd'
) t
GROUP BY 1, 2
ORDER BY 2

See an example on SQLFiddle

This now yields:

| B | C |
|---|---|
| b | c |
| c | c |
| b | d |

And it’s (probably?) the expected behaviour in MySQL as the documentation states:

Columns selected for output can be referred to in ORDER BY and GROUP BY clauses using column names, column aliases, or column positions. Column positions are integers and begin with 1

The documentation actually treats GROUP BY in a very similar fashion to ORDER BY. For instance, it is possible to specify the ordering direction using GROUP BY only:

MySQL extends the GROUP BY clause so that you can also specify ASC and DESC after columns named in the clause:

SELECT a, COUNT(b) 
FROM test_table GROUP BY a DESC;

While we cannot figure out a reasonable edge-case that breaks this feature, we still think that there is something fishy about it. Since the SELECT clause is logically evaluated after the rest of the table expression (FROM, WHERE, GROUP BY, HAVING), allowing the GROUP BY clause to reference the projection leads to a weird understanding of SQL.
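
For comparison, many other databases reject such references to the projection altogether. Here is a sketch, assuming a hypothetical t(a, b) table, that fails in Oracle with ORA-00904, because the alias x is simply not yet in scope in the GROUP BY clause (some dialects, such as MySQL and PostgreSQL, happen to accept aliases here):

SELECT a || b AS x, COUNT(*)
FROM t
GROUP BY x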

On the other hand, SQL is a very verbose language with little support for declaring reusable objects, apart from common table expressions and the WINDOW clause. It is actually a bit surprising that the SQL standards folks would support this WINDOW clause to declare reusable window frames before introducing much more usable “common column expressions”, e.g.:

-- Common column/table expressions:
WITH x AS CASE t1.a 
          WHEN 1 THEN 'a'
          WHEN 2 THEN 'b'
                 ELSE 'c'
          END, 
     y AS SOME_FUNCTION(t2.a, t2.b)
SELECT x, NVL(y, x)
FROM t1 JOIN t2 ON t1.id = t2.id
GROUP BY x, y
ORDER BY x DESC, y ASC

With common column expressions, reusing column expressions is independent of the SELECT clause itself. In other words, you can reuse column expressions in JOIN clauses, WHERE clauses, GROUP BY clauses, HAVING clauses, etc. without having to actually SELECT them.
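
Until such a syntax exists, the closest standard SQL workaround is to push the expressions down into a derived table. A sketch, reusing the hypothetical t1, t2, and SOME_FUNCTION from above:

-- The expressions are computed once in the derived table, and the
-- outer query can project, group, and order by their aliases.
SELECT x, y
FROM (
  SELECT CASE t1.a WHEN 1 THEN 'a'
                   WHEN 2 THEN 'b'
                          ELSE 'c' END AS x,
         SOME_FUNCTION(t2.a, t2.b) AS y
  FROM t1 JOIN t2 ON t1.id = t2.id
) t
GROUP BY x, y
ORDER BY x DESC, y ASC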

So, to be fair with MySQL, while this feature is a non-feature in its current form, it provides a workaround for SQL’s verbosity.
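
Incidentally, the standard WINDOW clause mentioned above follows the same reuse idea, just for window specifications rather than column expressions. A minimal sketch, assuming a hypothetical employees table (PostgreSQL, for instance, supports this syntax):

-- The window specification w is declared once and reused by several
-- window functions in the projection.
SELECT first_name,
       SUM(salary) OVER w,
       AVG(salary) OVER w
FROM employees
WINDOW w AS (PARTITION BY department ORDER BY hire_date)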

The Great SQL Implementation Comparison Page

Fortunately, we have SQL standards. Or do we? It’s a well-known secret (or cynical joke) that the SQL standard is yet another SQL dialect among peers.

Standards by xkcd

On this blog, we have pointed out so many differences between SQL dialects that it is hard to believe anyone would even consider writing SQL strings rather than using jOOQ or Hibernate if they ever risk migrating their application to another database.

From the perspective of the jOOQ implementation, we know how tough it can be to abstract vendor-specific SQL away. Here’s an example piece of code that you don’t want to have in your application, dealing with the NVL() function:

switch (configuration.dialect().family()) {
  case ACCESS:
    return field("{iif}({0} is null, {1}, {0})");

  case DB2:
  case INGRES:
  case ORACLE:
  case H2:
  case HSQLDB:
    return field("{nvl}({0}, {1})", ..);

  case DERBY:
  case POSTGRES:
    return field("{coalesce}({0}, {1})", ..);

  case MARIADB:
  case MYSQL:
  case SQLITE:
    return field("{ifnull}({0}, {1})", ..);

  // By default, resort to the CASE expression
  default:
    return DSL.decode()
              .when(arg1.isNotNull(), arg1)
              .otherwise(arg2);
}

Now, trust us. The above is one of the easier cases. Consider the standardisation of the LIMIT .. OFFSET clause, for instance.
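
To give you a taste of the problem, here is roughly how “the first 10 rows” is expressed across a few popular dialects (a sketch, far from exhaustive):

-- MySQL, PostgreSQL, SQLite
SELECT * FROM t ORDER BY id LIMIT 10

-- SQL Server, Sybase ASE
SELECT TOP 10 * FROM t ORDER BY id

-- SQL:2008 standard, DB2, Oracle 12c
SELECT * FROM t ORDER BY id
FETCH FIRST 10 ROWS ONLY

-- Oracle 11g and earlier
SELECT *
FROM (
  SELECT * FROM t ORDER BY id
)
WHERE ROWNUM <= 10

And that is before OFFSET even enters the picture.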

For an even more frightening overview, we recommend you visit Troels Arvin’s Comparison of Different SQL Implementations. Be warned, though!

Ecto, a Slick Query DSL for the Elixir Language

Elixir? What on earth is Elixir? It is a programming language that somewhat reminds me of Ruby. Here is some example Elixir code:

defmodule Hello do
  IO.puts "Defining the function world"

  def world do
    IO.puts "Hello World"
  end

  IO.puts "Function world defined"
end

And it has a Stack Overflow community with around 50 tagged questions as of December 2013. And it has a query DSL library called Ecto, which describes itself as follows:

Ecto is a domain specific language for writing queries and interacting with databases in Elixir.

Example queries can be seen on the project’s GitHub readme page:

A simple SELECT

use Ecto.Query

query = from w in Weather,
      where: w.prcp > 0 or w.prcp == nil,
     select: w

Repo.all(query)

Using JOINs

query = from p in Post,
      where: p.id == 42,
  left_join: c in p.comments,
     select: assoc(p, c)

[post] = Repo.all(query)

post.comments.to_list

Looks like a slick (but not typesafe?) querying DSL mix between LINQ and Clojure’s sqlkorma.

For those brave ones among you using the Elixir language, this might be your one (and only) choice to access a SQL database! Good luck!

MyBatis’ Alternative Transaction Management

On the jOOQ user group, we’re often asked how to perform transaction management with jOOQ. And we have an easy answer ready: You don’t do that with jOOQ. You choose your favourite transaction management API, be it JDBC itself, Spring’s transaction management, or JTA, for instance.

And that list is far from exhaustive. Transaction management is something very delicate, and it certainly should not be imposed by a library whose main purpose is not transaction management, because any such library / framework will provide you with at most a very leaky abstraction of its transaction model. In other words, if you want to deviate only slightly from “the standard” model (e.g. as imposed by Hibernate), you will suffer greatly as soon as you want to run 2-3 queries outside of Hibernate – e.g. batch or reporting statements through jOOQ.

MyBatis’ Alternative Transaction Management

MyBatis is a SQL templating engine that provides a couple of features on top of what alternative templating engines, such as Velocity or StringTemplate, offer. One of these features built on top of templating is precisely transaction management, as can be seen in the docs.

From what we can read in the docs, it looks as though MyBatis’ transaction managers can be overridden by Spring, for instance. However, it is not easy to see how this is done. What’s more, MyBatis also solves connection pooling (for which there are very viable alternatives, such as c3p0 and DBCP) and mapping (which could be solved more easily with custom transformers, such as those offered by Spring’s JdbcTemplate, or jOOQ’s RecordMapper).

Like many frameworks, MyBatis tries to solve problems outside its core scope, which is SQL templating. While this may be a good thing, as you only rely on a single dependency, it is also quite a lock-in in case you have a more complex model. In the case of transaction management, we believe that this was not a good idea on MyBatis’ part.

Thoughts from MyBatis users?

A Bit of SQL History

SQL has been around for a while. Just recently, I’ve read this refreshing article about Codd’s Relational Vision – Has NoSQL Come Full Circle? SQL is one of those awesome technologies that was invented almost two generations ago, and we’re still using it in production and for new projects. It is both legacy and state-of-the-art.

The RDBMS timeline from Wikipedia

In the early 70s, Ingres (which still exists today) was built at the University of California, Berkeley. Parts of Ingres still go strong in today’s RDBMS market under the name of PostgreSQL, and in many other databases. Michael Stonebraker has been involved with these early dinosaurs, and he’s currently promoting NewSQL storage, which isn’t listed in the above chart, unfortunately.

What’s also not listed, unfortunately, are all those various NoSQL storage systems, which will eventually merge into a greater timeline of DBMS, should they survive the upcoming second renaissance of Codd’s Vision. Love it or hate it: SQL is a software category that we will be dealing with for another generation of software engineers!

Simplernate. A Query DSL for Hibernate, Inspired by jOOQ

“Simplernate”! The name is very compelling. And so is the idea – we think. Specifically because we seem to get most of the credit for this new project in the following short readme:
https://gist.github.com/thermz/eb1b12b2146168a08e68

It reads:

Simplernate is (will be) an Hibernate wrapper that help developer to query the database using Hibernate ORM. To do this, Simplernate offers a syntax inspired by jOOQ philosofy [sic].

This obviously reminds us of Torpedoquery, another typesafe query DSL that wraps Hibernate and Hibernate’s HQL. Unfortunately, there isn’t that much to see yet, but we will certainly follow up on future progress of a project with such a promising name!

SQL and NoSQL are Really Just Two Sides of the Same Coin

In a recent debate about NoSQL vs. SQL on Hackernews, I was made aware of a quite amusing paper by Erik Meijer and Gavin Bierman. Remember, Erik Meijer has brought LINQ to the .NET universe, a formidable unified query DSL whose main purpose was to unify typesafe querying against XML, SQL, and object-oriented data structures.

It is important to note that LINQ surfaced shortly before the NoSQL hype really caught on, so NoSQL data structures (e.g. key / value stores, document stores, graph databases) were not yet in scope for LINQ, and LINQ providers might have had to tweak the odd implementation detail to fully match the LINQ API.

What Erik Meijer and Gavin Bierman claim in their article (which was also discussed intensively on Hackernews) is that SQL and NoSQL are duals of each other, i.e. two sides of the same coin. In their quest to unify all query languages, the LINQ people would obviously love to simplify things to such a level. To us, who focus on SQL only via jOOQ, this seems more like a plea for LINQ than anything else. It would be all too nice if things were as easy as a simple duality, specifically given that Erik Meijer has now also created a company called Applied Duality Inc… What we found very interesting, though, is Erik’s and Gavin’s hint about the relational model and other models inverting the arrows between relationships: where child records point to parent records in the relational model, parent objects point to child objects in object-oriented design, or in XML.

But history will teach us where these things go. I currently don’t see a second E.F. Codd to solve the complexity introduced by the new abundance of NoSQL data stores – yet. But maybe we’ll eventually remember Erik Meijer as the new E.F. Codd, bringing NoSQL full circle.

MongoDB “Lightning Fast Aggregation” Challenged with Oracle

What does “Scale” even mean in the context of databases? When talking about scaling, people have jumped to the vendor-induced conclusion that:

  • SQL doesn’t scale
  • NoSQL scales

It is very obvious that NoSQL vendors make such claims. It has also been interesting to see many NoSQL consumers make the same claims, even though they probably confused SQL in general with MySQL in particular. They then go on comparing MongoDB with MySQL scalability, which totally makes sense, as MySQL is to SQL what MongoDB is to NoSQL.

Let’s get back down to earth…

… because in the last decades, there was no database to beat Oracle on the Transaction Processing Performance Council’s benchmarks. We trust that the CERN people have made an informed decision when they opted for Oracle Exadata and other Oracle products to manage their immensely huge data even before huge data was called Big Data.

So let’s make a quick comparison. Recently, Vlad Mihalcea has blogged about “lightning speed aggregation” (with MongoDB). He took a data set of 50 million records of the form:

created_on           | value
-------------------------------------------
2012-05-02T06:08:47Z | 0.9270193106494844
2012-09-06T22:40:25Z | 0.005334891844540834
2012-06-15T05:58:22Z | 0.05611344985663891
...                  | ...

These are random timestamps and random floats. He then aggregated this data with the following query:

var dataSet = db.randomData.aggregate([
    {
        $group: {
                "_id": {
                    "year" : {
                        $year : "$created_on"
                    },
                    "dayOfYear" : {
                        $dayOfYear : "$created_on"
                    }
                },
                "count": {
                    $sum: 1
                },
                "avg": {
                    $avg: "$value"
                },
                "min": {
                    $min: "$value"
                },
                "max": {
                    $max: "$value"
                }
            }
    },
    {
        $sort: {
            "_id.year" : 1,
            "_id.dayOfYear" : 1
        }
    }
]);

And got a “mind-numbing”, lightning speed aggregation result:

Aggregation took:129.052s

129s for a medium-sized table with 50M records is lightning speed for MongoDB? Alright, we thought. Let’s try this with Oracle. Vlad had the courtesy to provide us with the sample data, which we imported into the following trivial Oracle table:

CREATE TABLESPACE aggregation_test
  DATAFILE 'aggregation_test.dbf'
  SIZE 2000M ONLINE;

CREATE TABLE aggregation_test (
  created_on TIMESTAMP NOT NULL,
  value NUMBER(22, 20) NOT NULL
)
TABLESPACE aggregation_test;

Now, let’s load that data with sqlldr into my single-core licenced Oracle XE 11gR2 instance:

OPTIONS(skip=1)
LOAD DATA
  INFILE randomData.csv
  APPEND
INTO TABLE aggregation_test
  FIELDS TERMINATED BY ','
(
  created_on DATE
    "YYYY-MM-DD\"T\"HH24:MI:SS\"Z\"",
  value TERMINATED BY WHITESPACE
    "to_number(ltrim(rtrim(replace(:value,'.',','))))"
)

And then:

C:\oraclexe\app\oracle\product\11.2.0\server\bin\sqlldr.exe 
  userid=TEST/TEST 
  control=randomData.txt 
  log=randomData.log 
  parallel=true 
  silent=feedback
  bindsize=512000 
  direct=true

Loading the records took a while on my computer with an old disk:

Elapsed time was:     00:03:07.70
CPU time was:         00:01:09.82

I might be able to tune this one way or another, as SQL*Loader isn’t the fastest tool for the job. But let’s challenge the 129s for the aggregation. So far, we haven’t specified any index, but that isn’t necessary, as we’re aggregating the whole table with all 50M records. Let’s do this with the following query, which is equivalent to Vlad’s:

SELECT
  EXTRACT(YEAR FROM created_on),
  TO_CHAR(created_on, 'DDD'),
  COUNT(*),
  AVG(value),
  MIN(value),
  MAX(value)
FROM aggregation_test
GROUP BY
  EXTRACT(YEAR FROM created_on),
  TO_CHAR(created_on, 'DDD')
ORDER BY
  EXTRACT(YEAR FROM created_on),
  TO_CHAR(created_on, 'DDD')

This took 32s on my machine with an obvious full table scan. Not impressive. Let’s try Vlad’s other query, filtering on a single hour, which executed in 209ms in his benchmark (which we believe is really not fast at all):

var dataSet = db.randomData.aggregate([
    {
        $match: {
            "created_on" : {
                $gte: fromDate,
                $lt : toDate
            }
        }
    },
    {
        $group: {
                "_id": {
                    "year" : {
                        $year : "$created_on"
                    },
                    "dayOfYear" : {
                        $dayOfYear : "$created_on"
                    },
                    "hour" : {
                        $hour : "$created_on"
                    }
                },
                "count": {
                    $sum: 1
                },
                "avg": {
                    $avg: "$value"
                },
                "min": {
                    $min: "$value"
                },
                "max": {
                    $max: "$value"
                }
            }
    },
    {
        $sort: {
            "_id.year" : 1,
            "_id.dayOfYear" : 1,
            "_id.hour" : 1
        }
    }
]);

Vlad generated a random date, which he logged as

Aggregating 
from Mon Jul 16 2012 00:00:00 GMT+0300
  to Mon Jul 16 2012 01:00:00 GMT+0300

So let’s use exactly the same dates. But first, we should create an index on AGGREGATION_TEST(CREATED_ON):

CREATE TABLESPACE aggregation_test_index
  DATAFILE 'aggregation_test_index1.dbf'
  SIZE 2000M ONLINE;

CREATE INDEX idx_created_on
ON aggregation_test(created_on)
TABLESPACE aggregation_test_index;

OK. Now let’s run the Oracle equivalent of Vlad’s query (note, there’s a new column in the GROUP BY, SELECT, ORDER BY clauses):

SELECT
  EXTRACT(YEAR FROM created_on),
  TO_CHAR(created_on, 'DDD'),
  EXTRACT(HOUR FROM created_on),
  COUNT(*),
  AVG(value),
  MIN(value),
  MAX(value)
FROM aggregation_test
WHERE created_on
  BETWEEN TIMESTAMP '2012-07-16 00:00:00.0'
  AND     TIMESTAMP '2012-07-16 01:00:00.0'
GROUP BY
  EXTRACT(YEAR FROM created_on),
  TO_CHAR(created_on, 'DDD'),
  EXTRACT(HOUR FROM created_on)
ORDER BY
  EXTRACT(YEAR FROM created_on),
  TO_CHAR(created_on, 'DDD'),
  EXTRACT(HOUR FROM created_on);

This took 20s in a first run. To be fair, before running this statement, I cleared some caches. I believe that Vlad’s MongoDB was already “warmed up” for the second query:

alter system flush shared_pool;
alter system flush buffer_cache;

Because when I ran the same query again, it only took 0.02 seconds, as Oracle’s buffer cache kicked in, preventing actual disk access. I might have tuned my Oracle instance in a way to keep the whole table in memory from the first query, in which case Oracle would have beaten MongoDB by an order of magnitude.

Another option is to tune indexing. Let’s remove our existing index and replace it with a “covering” index, as follows:

DROP INDEX idx_created_on;

CREATE INDEX idx_created_on_value
ON aggregation_test(created_on, value)
TABLESPACE aggregation_test_index;

Let’s again flush caches:

alter system flush shared_pool;
alter system flush buffer_cache;

And run the query…

  • First execution: 0.5s
  • Second execution: 0.005s

From the execution plan, I can now tell that the query didn’t need to access the table any longer. All the relevant data was contained in the index.


Now that starts getting impressive, right?

So, what conclusion can we draw from this?

Here are a couple of findings:

Don’t jump to conclusions

First off, we should be careful jumping to any conclusions at all. We are comparing apples with oranges – we’d even be comparing apples with oranges if we compared Oracle with SQL Server. Both benchmarks were quickly written, hacked-up benchmarks that do not represent any production situation. What we can say already now is that there is no clear winner when aggregating data in a table of 50M records, but that Oracle catches up very quickly, even though MongoDB ran on an SSD while Oracle ran on an HDD!

To be fair, my computer has a better CPU than Vlad’s.


Both SQL and NoSQL can scale

Much of the NoSQL scaling debate is FUD. People always want to believe vendors who claim that the other side’s products “don’t do this and that”. Don’t buy it immediately. Run a smallish benchmark like the above and see for yourself: scaling up is not a problem for either SQL or NoSQL databases. Don’t be blinded by Oracle’s initial “slow” query execution. Oracle is a very sophisticated database that can tune itself according to actual need, thanks to its cost-based optimiser and its statistics.

And nothing keeps you (and your DBA) from using all of your tool’s features. In this example, we have only scratched the surface. For instance, we might apply index-organised tables (IOTs), or table partitioning if you’re using Oracle Enterprise Edition.

50M is not Big Data

CERN has Big Data. Google does. Facebook does. You don’t. 50M is not “Big Data”. It’s just your average database table.

200ms is not fast

Obviously, you might not care too much how long it takes to prepare your offline report on a dedicated batch reporting server. But if you have a real user on a real UI waiting for it, 200ms is not fast, because they might be running 20 of these queries, and there might be 10’000 of these users. In a real OLTP system, having such reporting queries around means that even Oracle’s 0.005s isn’t fast!

So, in other words…

… don’t jump the SQL ship just yet!