jOOQ Tuesdays: Glenn Paulley Gives Insight into SQL’s History

Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.

glenn-paulley

I’m very excited to feature today Glenn Paulley who has been working with and on SQL for several decades.

Glenn, you have been a part of the database ecosystem since the very early days, having been Director of Engineering at Sybase and representing SAP with the SQL Standard committee. What is it that fascinates you so much about databases?

Data management technology has been my favourite subject within Computer Science since I was an undergraduate student at the University of Manitoba, and so it has remained throughout much of my career. I was privileged to be part of the team that implemented the first online IBM DB2 application – using IMS/TM as the transaction monitor – in Canada at Great-West Life Assurance in the late 1980’s, and we hosted some of the DB2 development team from IBM’s San Jose lab – including Don Haderle – to celebrate that achievement. So yes, I guess you could say that I’ve been around for a while.

Much of my personal expertise lies in the realm of query processing and optimization, having had Paul Larson and Frank Wm. Tompa as my Ph.D. supervisors at the University of Waterloo. But I am interested in many related database subjects: query languages certainly, multidatabase systems, and information retrieval. Two closely-related topics within database systems are of particular interest to me.  One is scale. Companies and organizations have a lot of data; billion-row tables in a relational database are fairly routine today in many companies. Advances in other technologies, such as the Internet of Things, are going to dramatically increase the data management requirements for firms wanting to take advantage of these technologies. Implementing solutions to achieve the scale required is difficult, and is of great interest to me. The related issue, quite naturally, is processing queries over such vast collections and getting the execution time down to something reasonable. So query optimization remains a favourite topic.

We’ve had a very interesting E-Mail conversation about SQL’s experiments related to Object Orientation in the late 1990’s. Today, we have ORDBMS like Oracle, Informix, PostgreSQL. Why didn’t ORDBMS succeed?

ORDBMS implementations were not as successful as their developers expected, to be sure, though I would note that many SQL implementations now contain “object” features even though objects are specifically omitted from the ISO SQL standard. Oracle 12c is a good example – Oracle’s PL/SQL object implementation supports a good selection of object-oriented programming features, such as polymorphism and single inheritance, that when coupled with collection types provide a very rich data model that can handle very complex applications. There are others, too, of course: InterSystems’ Caché product, for example, is still available.

So, while object support in relational systems is present, in many instances, to me the significant issues are (1) the performance of object constructions on larger database instances and (2) the question of where do you want objects to exist in the application stack: in the server proper, or in another tier, implemented in a true object-oriented language? The latter issue is the premise behind object-relational mapping tools, though I think that their usage often causes as many problems as they solve.

How do you see the future of the SQL language – e.g. with respect to Big Data or alternative models like document stores (which have N1QL) or graph stores (which have Open Cypher)?

My personal view is that SQL will continue to evolve; having an independent query language that permits one to query or manipulate a database but avoid writing a “program” in the traditional sense is a good idea. Over time that language might evolve to something different from today’s SQL, but it is likely that we will still call it S-Q-L. I do not expect a revolutionary approach to be successful; there is simply far too much invested in current applications and infrastructure. Don’t get me wrong – I’m not saying SQL doesn’t need improvements. It does.

Unlike the work of e.g. the JCP or w3c, which are public and open, SQL still seems to be a more closed environment. How do you see the interaction between SQL and the end user? Can “ordinary” folks participate in the future of SQL?

The SQL standard is published by the International Standards Organization (ISO), whose member countries contribute to changes to the standard and vote on them on a regular basis. Member countries that contribute to the ISO SQL standard include (at least) the US, Canada, Germany, the United Kingdom, Japan, and Korea. Participation in the standards process requires individuals or companies to belong to these “national bodies” in order to view drafts of the standard and vote on proposed changes. Usually those meetings are held in-person – at least in Canada and the United States – so there is a real cost to participation in the process.

While having SQL as an international ISO standard is, I think, worthwhile, the ISO’s business model is based on activities such as collecting revenue from the sale of standards documents. Unless governments or other benefactors sponsor the development of those standards, then it is difficult to see how the standard can be made more freely available. That issue is a political one, rather than a technical one.

After a brief stop at Conestoga College, you’re heading back to SAP. What will you be working on in the next few years?

I will be continuing to focus on database technology now that I’ve returned to SAP. So far I’ve had a fantastic experience being back at the SAP Waterloo lab, and I have every expectation that will continue into the future.

You’ve spent a lot of time at SAP (and before SAP) with Sybase. How do the different Sybase distributions relate to SAP’s new flagship database HANA?

The various SAP database systems (SQL Anywhere, IQ, ASE) all contain database technology pertinent to the HANA platform.

Last question: What do SQL and Curling have in common? :-)

As a Level 3 curling coach, one of my tasks as a High Performance Coach for the Ontario Curling Council is to collect performance data on the athletes in our various programs. Naturally I use a database system to store that data, and perform various analyses on it, so the two are not as unrelated as you might think!

 

The DBMS of the Year 2013

We have recently blogged about the DB-Engines Ranking and how MongoDB was the only NoSQL store to make it into that ranking’s top 10. Today, this marketing platform offered by solid IT has announced MongoDB to be the DBMS of the year 2013, with PostgreSQL being a close runner-up, followed by Cassandra.

solid IT as a company is slightly biased towards NoSQL, so it’s not surprising that two NoSQL databases are in their top ranking, and the only successful ORDBMS in the market is number two. As we ourselves are “slightly” biased towards SQL, we would like to announce our own DBMS of the year 2013:

SQL Server is the DBMS of the year 2013

… because its SQL dialect Transact-SQL (which Microsoft “shares” with Sybase), is the first SQL-based programming language to make it into TIOBE’s top 10 programming languages.

Congratulations to SQL Server from the jOOQ team!

The Crystal Ball. Or, Oops, Michael Stonebraker did it Again

Michael Stonebraker’s opinions and claims are always refreshing to read. He’s done a lot for our industry and for how we do data processing. Some of his claims are certainly right as well. Here’s an interview with him, telling us about his 5 predictions on the future of databases.

Of course, him being a software vendor, many of his claims should be read with caution. Today, the most popular DBMS (relational or not) are still Oracle, MySQL, and SQL Server. Even his “popular” PostgreSQL is still a niche player, let alone the almost forgotten Ingres and the never really popular Vertica columnar “NewSQL” database. Obviously, we’re not saying they’re bad databases, but they’re certainly not very popular. The same goes with SAP. Their Sybase databases have been surpassed by SQL Server both in quality and in popularity 10 years ago, when Microsoft forked the Sybase code into T-SQL. We hardly believe that Oracle and Sybase will have the “final fight” for RDBMS supremacy.

But again. That’s Mike Stonebraker, the salesman as in “The Traditional RDBMS Wisdom is All Wrong”

SAP’s Hilarious SQL Whitepaper(s)

While looking for some authoritative information about Sybase SQL Anywhere 12’s TOP .. START AT clause, I stumbled upon this hilarious white paper here, which I do not want to keep from you:
http://www.sybase.com/files/White_Papers/Sybase_Top_10_Features_In_SQL_Anywhere_12.pdf

I will take advantage of “fair use policy” and cite parts from section 7:

Feature number 7: improved support for DaffySQL syntax

If I told you that RowGenerator.row_num contains the values 1 through 255, what would you say this query returned?

[Query example]

Give up? OK, how about this one?

[Query example]

Still stumped? If I told you they both returned exactly the same result set as the following query, what would you say?

[Query example]

Yes, the LIMIT clause is new to SQL Anywhere 12, exactly the same as TOP START AT except it uses zero as the starting point for numbering rows instead of 1.

An “offset”, get it?

As in “Here’s ten dollars, let me count it for you: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.”

Why implement LIMIT? And why include it in a list of cool features?

Because there are a lot of MySQL users out there who don’t have TOP START AT, and they’ve written zillions of queries using LIMIT, and they’d like to migrate their apps to SQL Anywhere without rewriting everything. And PostgreSQL users too… welcome aboard!

Migrating to SQL Anywhere is definitely cool.

So be cool and migrate to SQL Anywhere already! :-) I’m now going through the rest of this fun document.

Cloud Fever now also at Sybase

After SQL Server (SQL Azure) and MySQL (Google Cloud SQL), there is now also a SQL Anywhere database available in the cloud:

http://www.sybase.ch/fujibeta

It’s called Sybase SQL Anywhere OnDemand or code name Fuji. I guess the connotation is that your data might as well be relocated to Fuji. Or your DBA might as well work from Fuji. Who knows ;-)

I don’t know where to start adding integration tests for jOOQ with all those cloud-based SQL solutions. Anyway, exciting times are coming!

There’s still potential for new SQL dialects

A recent feature request reminded me, that there is still a lot of potential for new SQL dialects supported by jOOQ. User Philippe is considering jOOQ for various projects in his organisation, and one dialect he’s missing is that of Sybase‘s Adaptive Server Enterprise. Sybase is one of the older databases, that is still widely used, especially in the health industry sector. Its history started in the mid-80’s where it was at some point forked into Microsoft’s SQL Server. This is well reflected in jOOQ, where you can see that many dialect-specific behaviour patterns are the same for Sybase SQL Anywhere and SQL Server. Some common heritage from the “Ingres dinosaur” can also be seen.

jOOQ’s Sybase support is currently limited to SQL Anywhere, but there’s no reason not to support Sybase ASE as well! I’m looking forward to contributions from Philippe. Note, there are also other dialects that are worth considering for support by jOOQ:

See a nice chart about RDBMS history here:
http://en.wikipedia.org/wiki/User:Intgr/RDBMS_timeline

See the discussion with Philippe here:
http://groups.google.com/group/jooq-user/browse_thread/thread/130b44cb59b14a82