Archive by Author | lukaseder

PL/SQL backtraces for debugging


For many PL/SQL developers, this might be common sense, but for one of our customers, this was an unknown PL/SQL feature: Backtraces.

When your application raises an error somewhere deep down in the call stack, you don’t get immediate information about the exact source of the error. For large PL/SQL applications, this can be a pain. One workaround is to keep track of statement numbers that were last executed before any error occurred:

DECLARE
  v_statement_no := 0;
BEGIN
  v_statement_no := 1;
  SELECT ...

  v_statement_no := 2;
  INSERT ...

  v_statement_no := 3;
  ...
EXCEPTION
  WHEN OTHERS THEN
    -- Log error message somewhere
    logger.error(module, v_statement_no, sqlerrm);
END;

The above looks an awful lot like println-debugging, a thing that isn’t really known to Java developers ;-)

But println-debugging isn’t necessary in PL/SQL either. Use the DBMS_UTILITY.FORMAT_ERROR_BACKTRACE function, instead! An example:

DECLARE
  PROCEDURE p4 IS BEGIN
    raise_application_error(-20000, 'Some Error');
  END p4;
  PROCEDURE p3 IS BEGIN
    p4;
  END p3;
  PROCEDURE p2 IS BEGIN
    p3;
  END p2;
  PROCEDURE p1 IS BEGIN
    p2;
  END p1;

BEGIN
  p1;
EXCEPTION
  WHEN OTHERS THEN
    dbms_output.put_line(sqlerrm);
    dbms_output.put_line(
      dbms_utility.format_error_backtrace
    );
END;
/

The above PL/SQL block generates the following output:

ORA-20000: Some Error
ORA-06512: at line 3
ORA-06512: at line 6
ORA-06512: at line 9
ORA-06512: at line 12
ORA-06512: at line 16

You can see exactly what line number generated the error. If you’re not using local procedures in anonymous blocks (which you quite likely aren’t), this gets even more useful:

CREATE PROCEDURE p4 IS BEGIN
  raise_application_error(-20000, 'Some Error');
END p4;
/
CREATE PROCEDURE p3 IS BEGIN
  p4;
END p3;
/
CREATE PROCEDURE p2 IS BEGIN
  p3;
END p2;
/
CREATE PROCEDURE p1 IS BEGIN
  p2;
END p1;
/

BEGIN
  p1;
EXCEPTION
  WHEN OTHERS THEN
    dbms_output.put_line(sqlerrm);
    dbms_output.put_line(
      dbms_utility.format_error_backtrace
    );
END;
/

The above now outputs:

ORA-20000: Some Error
ORA-06512: at "PLAYGROUND.P4", line 2
ORA-06512: at "PLAYGROUND.P3", line 2
ORA-06512: at "PLAYGROUND.P2", line 2
ORA-06512: at "PLAYGROUND.P1", line 2
ORA-06512: at line 2

To learn more about the DBMS_UTILITY package, please consider the manual. True to the nature of all things called “UTILITY”, it really contains pretty much random things that you wouldn’t expect there :-)

Integrating jOOQ with PostgreSQL: Partitioning


Introduction

jOOQ is a great framework when you want to work with SQL in Java without having too much ORM in your way. At the same time, it can be integrated into many environments as it is offering you support for many database-specific features. One such database-specific feature is partitioning in PostgreSQL. Partitioning in PostgreSQL is mainly used for performance reasons because it can improve query performance in certain situations. jOOQ has no explicit support for this feature but it can be integrated quite easily as we will show you.

This article is brought to you by the Germany based jOOQ integration partner UWS Software Service (UWS). UWS is specialised in custom software development, application modernisation and outsourcing with a distinct focus on the Java Enterprise ecosystem.

Partitioning in PostgreSQL

With the partitioning feature of PostgreSQL you have the possibility of splitting data that would form a huge table into multiple separate tables. Each of the partitions is a normal table which inherits its columns and constraints from a parent table. This so-called table inheritance can be used for “range partitioning” where, for example, the data from one range does not overlap the data from another range in terms of identifiers, dates or other criteria.

Like in the following example, you can have partitioning for a table “author” that shares the same foreign-key of a table “authorgroup” in all its rows.

CREATE TABLE author (
  authorgroup_id int,
  LastName varchar(255)
);

CREATE TABLE author_1 (
  CONSTRAINT authorgroup_id_check_1
    CHECK ((authorgroup_id = 1))
) INHERITS (author);

CREATE TABLE author_2 (
  CONSTRAINT authorgroup_id_check_2
    CHECK ((authorgroup_id = 2))
) INHERITS (author);

...

As you can see, we set up inheritance and – in order to have a simple example – we just put one constraint checking that the partitions have the same “authorgroup_id”. Basically, this results in the “author” table only containing table and column definitions, but no data. However, when querying the “author” table, PostgreSQL will really query all the inheriting “author_n” tables returning a combined result.

A trivial approach to using jOOQ with partitioning

In order to work with the partitioning described above, jOOQ offers several options. You can use the default way which is to let jOOQ generate one class per table. In order to insert data into multiple tables, you would have to use different classes. This approach is used in the following snippet:

// add
InsertQuery query1 = dsl.insertQuery(AUTHOR_1);
query1.addValue(AUTHOR_1.ID, 1);
query1.addValue(AUTHOR_1.LAST_NAME, "Nowak");
query1.execute();

InsertQuery query2 = dsl.insertQuery(AUTHOR_2);
query2.addValue(AUTHOR_2.ID, 1);
query2.addValue(AUTHOR_2.LAST_NAME, "Nowak");
query2.execute();

// select
Assert.assertTrue(dsl
    .selectFrom(AUTHOR_1)
    .where(AUTHOR_1.LAST_NAME.eq("Nowak"))
    .fetch().size() == 1);

Assert.assertTrue(dsl
    .selectFrom(AUTHOR_2)
    .where(AUTHOR_2.LAST_NAME.eq("Nowak"))
    .fetch().size() == 1);

You can see that multiple classes generated by jOOQ need to be used, so depending on how many partitions you have, generated classes can pollute your codebase. Also, imagine that you eventually need to iterate over partitions, which would be cumbersome to do with this approach. Another approach could be that you use jOOQ to build fields and tables using string manipulation but that is error prone again and prevents support for generic type safety. Also, consider the case where you want true data separation in terms of multi-tenancy.

You see that there are some considerations to do when working with partitioning. Fortunately jOOQ offers various ways of working with partitioned tables, and in the following we’ll compare approaches, so that you can choose the one most suitable for you.

Using jOOQ with partitioning and multi-tenancy

JOOQ’s runtime-schema mapping is often used to realize database environments, such that for example during development, one database is queried but when deployed to production, the queries are going to another database. Multi-tenancy is another recommended use case for runtime-schema mapping as it allows for strict partitioning and for configuring your application to only use databases or tables being configured in the runtime-schema mapping. So running the same code would result in working with different databases or tables depending on the configuration, which allows for true separation of data in terms of multi-tenancy.

The following configuration taken from the jOOQ documentation is executed when creating the DSLContext so it can be considered a system-wide setting:

Settings settings = new Settings()
  .withRenderMapping(new RenderMapping()
  .withSchemata(
      new MappedSchema().withInput("DEV")
                        .withOutput("MY_BOOK_WORLD")
                        .withTables(
      new MappedTable().withInput("AUTHOR")
                       .withOutput("AUTHOR_1"))));

// Add the settings to the Configuration
DSLContext create = DSL.using(
  connection, SQLDialect.ORACLE, settings);

// Run queries with the "mapped" configuration
create.selectFrom(AUTHOR).fetch();

// results in SQL:
// “SELECT * FROM MY_BOOK_WORLD.AUTHOR_1”

Using this approach you can map one table to one partition permanently eg. “AUTHOR” to “AUTHOR_1” for environment “DEV”. In another environment you could choose to map “AUTHOR” table to “AUTHOR_2”.

Runtime-schema mapping only allows you to map to exactly one table on a per-query basis, so you could not handle the use case where you would want to manipulate more than one table partition. If you would like to have more flexibility you might want to consider the next approach.

Using jOOQ with partitioning and without multi-tenancy

If you need to handle multiple table partitions without having multi-tenancy, you need a more flexible way of accessing partitions. The following example shows how you can do it in a dynamic and type safe way, avoiding errors and being usable in the same elegant way you are used to by jOOQ:

// add
for(int i=1; i<=2; i++) {
  Builder part = forPartition(i);
  InsertQuery query = dsl.insertQuery(part.table(AUTHOR));
  query.addValue(part.field(AUTHOR.ID), 1);
  query.addValue(part.field(AUTHOR.LAST_NAME), "Nowak");
  query.execute();
}

// select

for(int i=1; i<=2; i++) {
  Builder part = forPartition(i);
  Assert.assertTrue(dsl
      .selectFrom(part.table(AUTHOR))
      .where(part.field(AUTHOR.LAST_NAME).eq("Nowak"))
      .fetch()
      .size() == 1);
}

What you can see above is that the partition numbers are abstracted away so that you can use “AUTHOR” table instead of “AUTHOR_1”. Thus, your code won’t be polluted with many generated classes. Another thing is that the partitioner object is initialized dynamically so you can use it for example in a loop like above. Also it follows the Builder pattern so that you can operate on it like you are used to by jOOQ.

The code above is doing exactly the same as the first trivial snippet, but there are multiple benefits like type safe and reusable access to partitioned tables.

Integration of jOOQ partitioning without multi-tenancy into a Maven build process (optional)

If you are using Continuous-Integration you can integrate the solution above so that jOOQ is not generating tables for the partitioned tables. This can be achieved using a regular expression that excludes certain table names when generating Java classes. When using Maven, your integration might look something like this:

<generator>
  <name>org.jooq.util.DefaultGenerator</name>
  <database>
    <name>org.jooq.util.postgres.PostgresDatabase</name>
    <includes>.*</includes>
    <excludes>.*_[0-9]+</excludes>
    <inputSchema>${db.schema}</inputSchema>
  </database>
  <target>
    <packageName>com.your.company.jooq</packageName>
    <directory>target/generated-sources/jooq</directory>
  </target>
</generator>

Then it’s just calling mvn install and jOOQ maven plugin will be generating the database schema in compilation time.

Integrating jOOQ with PostgreSQL: Partitioning

This article described how jOOQ in combination with the partitioning feature of PostgreSQL can be used to implement multi-tenancy and improve database performance. PostgreSQL’s documentation states that for partitioning “the benefits will normally be worthwhile only when a table would otherwise be very large. The exact point at which a table will benefit from partitioning depends on the application, although a rule of thumb is that the size of the table should exceed the physical memory of the database server.”

Achieving support for partitioning with jOOQ is as easy as adding configuration or a small utility class, jOOQ is then able to support partitioning with or without multi-tenancy and without sacrificing type safety. Apart from Java-level integration, the described solution also smoothly integrates into your build and test process.

You may want to look at the sources of the partitioner utility class which also includes a test-class so that you can see the behavior and integration in more detail.

Please let us know if you need support for this or other jOOQ integrations within your environment. UWS Software Service (UWS) is an official jOOQ integration partner.

All You Ever Need to Know About Recursive SQL


Oracle SYNONYMs are a great feature. You can implement all sorts of backwards-compatibility tweaks simply by creating SYNONYMs in your database. Consider the following schema:

CREATE TABLE my_table (col NUMBER(7));

CREATE SYNONYM my_table_old FOR my_table;
CREATE SYNONYM my_table_bak FOR my_table_old;

Now you can query your same old table through three different names, it’ll all result in the same output:

SELECT * FROM my_table;

-- Same thing:
SELECT * FROM my_table_old;
SELECT * FROM my_table_bak;

The trouble is, when you see my_table_bak in code (or some even more obfuscated name), do you immediately know what it really is?

Use this query to find out

We can use the ALL_SYNONYMS table to figure this one out. This query will already give a simple overview:

SELECT *
FROM   ALL_SYNONYMS
WHERE  TABLE_OWNER = 'PLAYGROUND'

The output is:

OWNER       SYNONYM_NAME  TABLE_OWNER  TABLE_NAME
---------------------------------------------------
PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE_OLD
PLAYGROUND  MY_TABLE_OLD  PLAYGROUND   MY_TABLE

But as you can see, this is boring, because we have transitive synonyms in there and I don’t want to go through the complete table to figure out that MY_TABLE_BAK -> MY_TABLE_OLD -> MY_TABLE.

So let’s use CONNECT BY!

Oracle (as well as Informix and CUBRID) have this awesome CONNECT BY clause for hierarchical SQL. There is also the possibility to express hierarchical SQL using the more powerful common table expressions, if you dare.

But let’s see how we can transitively resolve our tables. Here’s how:

SELECT 
  s.OWNER,
  s.SYNONYM_NAME,

  -- Get to the root of the hierarchy
  CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER,
  CONNECT_BY_ROOT s.TABLE_NAME  TABLE_NAME
FROM       ALL_SYNONYMS s
WHERE      s.TABLE_OWNER = 'PLAYGROUND'

-- The magic CONNECT BY clause!
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME

First off, there is CONNECT BY, which allows to “connect” hierarchies by their hierarchical predecessors. On each level of the hierarchy, we’ll connect the TABLE_NAME with its previous (“PRIOR”) SYNONYM_NAME. This will recurse as long as the chain doesn’t end (or if it runs into a cycle).

What’s also interesting is the CONNECT_BY_ROOT keyword, which, for each path through the hierarchy, displays the root of the path. In our case, that’s the target TABLE_NAME.

The output can be seen here:

OWNER       SYNONYM_NAME  TABLE_OWNER  TABLE_NAME
---------------------------------------------------
PLAYGROUND  MY_TABLE_OLD  PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE_OLD <-- Useless

If you’re confused by the records that are displayed, just add the LEVEL pseudo-column to display the recursion level:

SELECT

  -- Add level here
  LEVEL,
  s.OWNER,
  s.SYNONYM_NAME,
  CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER,
  CONNECT_BY_ROOT s.TABLE_NAME  TABLE_NAME
FROM       ALL_SYNONYMS s
WHERE      s.TABLE_OWNER = 'PLAYGROUND'
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME
LEVEL  OWNER       SYNONYM_NAME  TABLE_OWNER  TABLE_NAME
----------------------------------------------------------
1      PLAYGROUND  MY_TABLE_OLD  PLAYGROUND   MY_TABLE
2      PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE
1      PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE_OLD
^^^^^^
  Awesome!

Getting rid of “bad records” using START WITH

As you can see, some of the results are now synonyms pointing directly to the target table, whereas the last record still points to an intermediate element from the synonym path. This is because we’re recursing into the path hierarchies from every record in the table, also from the “intermediate” synonym references, whose TABLE_NAME is yet another synonym.

Let’s get rid of those as well, using the optional START WITH clause, which allows to limit tree traversals to those trees whose roots fulfil a given predicate:

SELECT 
  s.OWNER,
  s.SYNONYM_NAME,
  CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER,
  CONNECT_BY_ROOT s.TABLE_NAME  TABLE_NAME
FROM       ALL_SYNONYMS s
WHERE      s.TABLE_OWNER = 'PLAYGROUND'
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME

-- Start recursing only from non-synonym objects
START WITH EXISTS (
  SELECT 1
  FROM   ALL_OBJECTS
  WHERE  s.TABLE_OWNER           = ALL_OBJECTS.OWNER
  AND    s.TABLE_NAME            = ALL_OBJECTS.OBJECT_NAME
  AND    ALL_OBJECTS.OWNER       = 'PLAYGROUND'
  AND    ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM'
)

So, essentially, we’re requiring the TABLE_NAME to be any object from ALL_OBJECTS that is in our schema, but not a SYNONYM. (yes, synonyms work for all objects, including procedures, packages, types, etc.)

Running the above query gets us the desired result:

OWNER       SYNONYM_NAME  TABLE_OWNER  TABLE_NAME
---------------------------------------------------
PLAYGROUND  MY_TABLE_OLD  PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE

What about PUBLIC synonyms?

Most often, you will not use local synonyms, though, but PUBLIC ones. Oracle has this quirky PUBLIC pseudo-schema, in which you cannot create objects, but in which you can create synonyms. So, let’s create some more synonyms for backwards-compatibility purposes:

CREATE PUBLIC SYNONYM my_table_bak2 FOR my_table_bak;
CREATE SYNONYM bak_backup_old FOR my_table_bak2;

Unfortunately, this will break our chain, because for some reason only Oracle and the Oracle of Delphi knows, PUBLIC is well reported as a OWNER of the synonym, but not as the TABLE_OWNER. Let’s see some raw data with:

SELECT *
FROM   ALL_SYNONYMS
WHERE  TABLE_OWNER = 'PLAYGROUND'

… and thus:

OWNER       SYNONYM_NAME    TABLE_OWNER  TABLE_NAME
------------------------------------------------------
PLAYGROUND  MY_TABLE_OLD    PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK    PLAYGROUND   MY_TABLE_OLD
PUBLIC      MY_TABLE_BAK2   PLAYGROUND   MY_TABLE_BAK
PLAYGROUND  BAK_BACKUP_OLD  PLAYGROUND   MY_TABLE_BAK2 <-- Not PUBLIC

As you can see, the PUBLIC SYNONYM MY_TABLE_BAK2 is reported to be in the PLAYGROUND schema! This breaks recursion, of course. We’re missing a record:

OWNER       SYNONYM_NAME    TABLE_OWNER  TABLE_NAME
------------------------------------------------------
PLAYGROUND  MY_TABLE_OLD    PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK    PLAYGROUND   MY_TABLE
PUBLIC      MY_TABLE_BAK2   PLAYGROUND   MY_TABLE <-- Hmm?

In order to work around this issue, we’ll have to tweak our original data set. Any object reported as (TABLE_OWNER, TABLE_NAME) might in fact be a synonym called ('PUBLIC', TABLE_NAME). The trick is thus to simply duplicate all input data as such:

SELECT 
  s.OWNER,
  s.SYNONYM_NAME,
  CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER,
  CONNECT_BY_ROOT s.TABLE_NAME  TABLE_NAME

-- Tweaked data set
FROM (
  SELECT OWNER, SYNONYM_NAME, TABLE_OWNER, TABLE_NAME
  FROM ALL_SYNONYMS
  UNION ALL
  SELECT OWNER, SYNONYM_NAME, 'PUBLIC', TABLE_NAME
  FROM ALL_SYNONYMS
) s

-- Add the synthetic PUBLIC TABLE_OWNER as well
WHERE      s.TABLE_OWNER IN (
  'PLAYGROUND', 'PUBLIC'
)
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME
START WITH EXISTS (
  SELECT 1
  FROM   ALL_OBJECTS
  WHERE  s.TABLE_OWNER           = ALL_OBJECTS.OWNER
  AND    s.TABLE_NAME            = ALL_OBJECTS.OBJECT_NAME
  AND    ALL_OBJECTS.OWNER       = 'PLAYGROUND'
  AND    ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM'
)

There it is, our missing record!

OWNER       SYNONYM_NAME    TABLE_OWNER  TABLE_NAME
---------------------------------------------------
PLAYGROUND  MY_TABLE_OLD    PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK    PLAYGROUND   MY_TABLE
PUBLIC      MY_TABLE_BAK2   PLAYGROUND   MY_TABLE
PLAYGROUND  BAK_BACKUP_OLD  PLAYGROUND   MY_TABLE <-- Yep!

Displaying the hierarchy

There is also a quirky function called SYS_CONNECT_BY_PATH, which can be used to actually display the whole hierarchy in a string form (VARCHAR2, with max 4000 characters!). Here’s how:

SELECT 

-- Magic function
  SUBSTR(
    sys_connect_by_path(
         s.TABLE_OWNER
      || '.'
      || s.TABLE_NAME, ' <- '
    ) || ' <- '
      || s.OWNER
      || '.'
      || s.SYNONYM_NAME, 5
  )
FROM (
  SELECT OWNER, SYNONYM_NAME, TABLE_OWNER, TABLE_NAME
  FROM ALL_SYNONYMS
  UNION ALL
  SELECT OWNER, SYNONYM_NAME, 'PUBLIC', TABLE_NAME
  FROM ALL_SYNONYMS
) s
WHERE      s.TABLE_OWNER IN (
  'PLAYGROUND', 'PUBLIC'
)
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME
START WITH EXISTS (
  SELECT 1
  FROM   ALL_OBJECTS
  WHERE  s.TABLE_OWNER           = ALL_OBJECTS.OWNER
  AND    s.TABLE_NAME            = ALL_OBJECTS.OBJECT_NAME
  AND    ALL_OBJECTS.OWNER       = 'PLAYGROUND'
  AND    ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM'
)

The above query will now output the following records:

PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD
PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD <- PLAYGROUND.MY_TABLE_BAK
PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD <- PLAYGROUND.MY_TABLE_BAK <- PUBLIC.MY_TABLE_BAK2
PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD <- PLAYGROUND.MY_TABLE_BAK <- PUBLIC.MY_TABLE_BAK2 <- PLAYGROUND.BAK_BACKUP_OLD

Impressive, eh?

Remark: In case you have stale synonyms

If you have “stale” synonyms, i.e. synonyms that point to nowhere, Oracle may report them to be pointing to themselves. That’s unfortunate and creates a CYCLE in CONNECT BY. To prevent this from happening, simply add another predicate like so:

SELECT 
  SUBSTR(
    sys_connect_by_path(
         s.TABLE_OWNER
      || '.'
      || s.TABLE_NAME, ' <- '
    ) || ' <- '
      || s.OWNER
      || '.'
      || s.SYNONYM_NAME, 5
  )
FROM (
  SELECT * FROM (
    SELECT OWNER, SYNONYM_NAME, TABLE_OWNER, TABLE_NAME
    FROM ALL_SYNONYMS
    UNION ALL
    SELECT OWNER, SYNONYM_NAME, 'PUBLIC', TABLE_NAME
    FROM ALL_SYNONYMS
  ) s

  -- Add this predicate to prevent cycles
  WHERE (s.OWNER       , s.SYNONYM_NAME)
    != ((s.TABLE_OWNER , s.TABLE_NAME))
) s
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME
START WITH EXISTS (
  SELECT 1
  FROM   ALL_OBJECTS
  WHERE  s.TABLE_OWNER           = ALL_OBJECTS.OWNER
  AND    s.TABLE_NAME            = ALL_OBJECTS.OBJECT_NAME
  AND    ALL_OBJECTS.OWNER       = 'PLAYGROUND'
  AND    ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM'
)

Can the above query be written in jOOQ?

Yes of course. In jOOQ, pretty much everything is possible, if you can write it in SQL. Here’s how we use a query similar to the above to resolve Oracle Synonmys in the jOOQ code generator:

// Some reusable variables
AllObjects o   = ALL_OBJECTS;
AllSynonyms s1 = ALL_SYNONYMS;
AllSynonyms s2 = ALL_SYNONYMS.as("s2");
AllSynonyms s3 = ALL_SYNONYMS.as("s3");

Field<String> dot = inline(".");
String arr = " <- ";

// The actual qeury
DSL
.using(configuration)
.select(
  s3.OWNER,
  s3.SYNONYM_NAME,
  connectByRoot(s3.TABLE_OWNER).as("TABLE_OWNER"),
  connectByRoot(s3.TABLE_NAME).as("TABLE_NAME"),
  substring(
    sysConnectByPath(
      s3.TABLE_OWNER.concat(dot)
                    .concat(s3.TABLE_NAME), 
      arr
    )
    .concat(arr)
    .concat(s3.OWNER)
    .concat(dot)
    .concat(s3.SYNONYM_NAME), 
    5
  ))
.from(
  select()
  .from(
    select(
      s1.OWNER, s1.SYNONYM_NAME, 
      s1.TABLE_OWNER, s1.TABLE_NAME)
    .from(s1)
    .union(
    select(
      s1.OWNER, s1.SYNONYM_NAME, 
      inline("PUBLIC"), s1.TABLE_NAME)
    .from(s1))
    .asTable("s2"))
  .where(row(s2.OWNER, s2.SYNONYM_NAME)
         .ne(s2.TABLE_OWNER, s2.TABLE_NAME))
  .asTable("s3"))
.connectBy(s3.TABLE_OWNER.eq(prior(s3.OWNER)))
.and(s3.TABLE_NAME.eq(prior(s3.SYNONYM_NAME)))
.startWith(exists(
  selectOne()
  .from(o)
  .where(s3.TABLE_OWNER.eq(o.OWNER))
  .and(s3.TABLE_NAME.eq(o.OBJECT_NAME))
  .and(o.OBJECT_TYPE.ne("SYNONYM"))
  .and(o.OWNER.in(getInputSchemata()))
))
.fetch();

Download jOOQ today and try it yourself!

Conclusion

If you have an intrinsically hierarchical data set, then you will be very unhappy with these simplistic hierarchical SQL features (also with commont table expressions). They don’t perform very well, and they’re very hard to express if hierarchies get more complex. So you may as well consider using an actual graph database like Neo4j

But every now and then, a little hierarchy may sneak into your otherwise “standard” relational data model. When it does, be sure to have this useful CONNECT BY clause ready for action.

CONNECT BY is supported by (at least):

  • CUBRID
  • Informix
  • Oracle

Recursive common table expressions (the SQL standard’s counterpart for CONNECT BY are supported by (at least):

  • DB2
  • Firebird
  • HSQLDB
  • Oracle
  • PostgreSQL
  • SQL Server
  • Sybase SQL Anywhere

and…

  • H2 has some experimental support

In a future post, we’re going to be looking into how to do the same thing with recursive CTE.

jOOQ Newsletter: August 15, 2014 – jOOQ 3.5 Outlook


Subscribe to this newsletter here

jOOQ 3.5 Outlook

We’re working hard on the next release. Already 90 issues for jOOQ 3.5 are closed and counting! Today, we’re going to look at the highlights of what will be implemented in the next, exciting minor release, due for Q4 2014:

  • Support for new databases

    Our customers have been asking us for support of the Informix and Oracle TimesTen databases. While Informix is a very popular (and also old!) database, still widely used in the Industry, Oracle TimesTen is a promising new in-memory database with a very similar syntax to that of Oracle.

    With these two new additions, jOOQ will now support 18 RDBMS!

  • File-based code generation support

    This has been on our roadmap for a very long time, and finally we’re tackling it! If your development workflow prevents you from accessing a database during code generation, you can now also supply database meta information in XML format. We chose XML over any other format as it will be very easy to transform arbitrary pre-existing formats using XSLT (e.g. Hibernate hbm.xml, or ERD tools like Vertabelo‘s export format).

    We’re really looking forward to going live with this awesome feature, and in seeing a variety of community-contributed XSLT pop up, to help you integrate jOOQ with your favourite database schema definition format.

  • TypeProviders

    Sophisticated databases like PostgreSQL ship with a large variety of vendor-specific data types. It’s hard for jOOQ to support them all natively, but why add native support, when we can add another awesome SPI?

    TypeProviders will allow for abstracting over the “<T>” type, jOOQ’s column type. This will go far beyond data type conversion, it will allow you to specify how jOOQ will bind your user type to JDBC completely transparently.

These are just a few major features that we’ll be including in jOOQ 3.5, along with a lot of minor ones – so stay tuned for more jOOQ goodness.

Data Geekery Partner Network

Isaac Newton coined it like no one else:

If I have seen further it is by standing on ye sholders of Giants.

At Data Geekery, we’re looking into seeing further with jOOQ as well as we are now starting to offer and recommend services to the jOOQ ecosystem through our trusted integration partners. Today, we’re happy to recommend to you:


Germany based UWS Software Service (UWS) specialises in custom software development, application modernisation and outsourcing with a distinct focus on the Java Enterprise ecosystem.

UWS has successfully integrated the jOOQ Open-Source Edition with a variety of enterprise software projects. Their offering include custom jOOQ integrations into your system landscape and migration solutions from JDBC and/or JPA to jOOQ. UWS further offers development of custom enterprise applications using jOOQ.


“Almost every performance problem is caused by excessive use of ORM tools or improper indexing.”

Markus Winand specialises in these topics and provides SQL training and tuning services for developers. “It is difficult to tell Java developers to use SQL when Hibernate is not the right tool for a particular query” Winand said, and continued “JDBC is just too cumbersome and dangerous. jOOQ makes SQL in Java simple and safe—now I can show people how to get best of both worlds.”


 

Tweet of the Day

Our customers, users, and followers are sharing their love for jOOQ with the world and we can hardly catch up with them! Here are:

Majid Azimi, who is writing SQL like a boss with jOOQ

Christoph Henkelmann, who Has found the most awesome of all stacks to build great web applications. And that consists of Ninjaframework, jOOQ, BoneCP – Slim, Fast, Reliable. We couldn’t have said it any better, ourselves.

Nat Pryce, who simply loves doing SQL queries with jOOQ in Java 8.

Thanks for the shouts, guys! You make the jOOQ experience rock!

SQL Zone – Keyset Pagination

Markus Winand, author of Use The Index, Luke! has recently started a promotion against OFFSET pagination, in favour of keyset pagination, which he called #NoOffset.

We’ve blogged about this ourselves, before. Most people make use of OFFSET pagination because it is the default that is supported by almost all RDBMS.

In many cases, however, you do not need to paginate using OFFSETs, which can turn out to be very slow for large results and large offsets. Keyset pagination is very useful when you want to implement infinite scrolling, like Twitter, Facebook, etc.

jOOQ is one of the few APIs, and the only Java API that natively support keyset pagination.

SQL Zone – PIVOT your data

Every now and then, you have one of those fancy reporting problems where SQL just fits in perfectly. We’ve blogged about it: Are You Using PIVOT Yet?

With the Oracle and SQL Server PIVOT clause, it is very easy to flip rows and columns in a table. Imagine you have a table like this:

+------+----------------+-------------------+
| dnId |  propertyName  |   propertyValue   |
+------+----------------+-------------------+
|    1 | objectsid      | S-1-5-32-548      |
|    1 | _objectclass   | group             |
|    1 | cn             | Account Operators |
|    1 | samaccountname | Account Operators |
|    1 | name           | Account Operators |
|    2 | objectsid      | S-1-5-32-544      |
|    2 | _objectclass   | group             |
|    2 | cn             | Administrators    |
|    2 | samaccountname | Administrators    |
|    2 | name           | Administrators    |
|    3 | objectsid      | S-1-5-32-551      |
|    3 | _objectclass   | group             |
|    3 | cn             | Backup Operators  |
|    3 | samaccountname | Backup Operators  |
|    3 | name           | Backup Operators  |
+------+----------------+-------------------+

And now, you’d like to transform this table to the below:

+------+--------------+--------------+-------------------+-----
| dnId |  objectsid   | _objectclass |        cn         | ... 
+------+--------------+--------------+-------------------+-----
|    1 | S-1-5-32-548 | group        | Account Operators | ... 
|    2 | S-1-5-32-544 | group        | Administrators    | ... 
|    3 | S-1-5-32-551 | group        | Backup Operators  | ... 
+------+--------------+--------------+-------------------+-----

This is a piece of cake using PIVOT:

SELECT p.*
FROM (
  SELECT dnId, propertyName, propertyValue
  FROM myTable
) AS t
PIVOT(
  MAX(propertyValue)
  FOR propertyName IN (
    objectsid, 
    _objectclass, 
    cn, 
    samaccountname, 
    name
  )
) AS p;

jOOQ natively supports the PIVOT clause, which is definitely one of those tools to have on every reporting SQL developer’s tool chain. Read more about it here and here (original source on Stack Overflow).

A Wonderful SQL Feature: Quantified Comparison Predicates (ANY, ALL)


Have you ever wondered about the use-case behind SQL’s ANY (also: SOME) and ALL keywords?

You have probably not yet encountered these keywords in the wild. Yet they can be extremely useful. But first, let’s see how they’re defined in the SQL standard. The easy part:

8.7  <quantified comparison predicate>

Function

    Specify a quantified comparison.

Format

    <quantified comparison predicate> ::=
        <row value constructor> <comp op> 
            <quantifier> <table subquery>

    <quantifier> ::= <all> | <some>
    <all> ::= ALL
    <some> ::= SOME | ANY

Intuitively, such a quantified comparison predicate can be used as such:

-- Is any person of age 42?
42 = ANY (SELECT age FROM person)

-- Are all persons younger than 42?
42 > ALL (SELECT age FROM person)

Let’s keep with the useful ones. Observe that you have probably written the above queries with a different syntax, as such:

-- Is any person of age 42?
42 IN (SELECT age FROM person)

-- Are all persons younger than 42?
42 > (SELECT MAX(age) FROM person)

In fact, you’ve used the <in predicate>, or a greater than predicate with a <scalar subquery> and an aggregate function.

The IN predicate

It’s not a coincidence that you might have used the <in predicate> just like the above <quantified comparison predicate> using ANY. In fact, the <in predicate> is specified just like that:

8.4 <in predicate>

Syntax Rules

2) Let RVC be the <row value constructor> and let IPV 
   be the <in predicate value>.

3) The expression

     RVC NOT IN IPV

   is equivalent to

     NOT ( RVC IN IPV )

4) The expression

     RVC IN IPV

   is equivalent to

     RVC = ANY IPV

Precisely! Isn’t SQL beautiful? Note, the implicit consequences of 3) lead to a very peculiar behaviour of the NOT IN predicate with respect to NULL, which few developers are aware of.

Now, it’s getting awesome

So far, there is nothing out of the ordinary with these <quantified comparison predicate>. All of the previous examples can be emulated with “more idiomatic”, or let’s say, “more everyday” SQL.

But the true awesomeness of <quantified comparison predicate> appears only when used in combination with <row value expression> where rows have a degree / arity of more than one:

-- Is any person called "John" of age 42?
(42, 'John') = ANY (SELECT age, first_name FROM person)

-- Are all persons younger than 55?
-- Or if they're 55, do they all earn less than 150'000.00?
(55, 150000.00) > ALL (SELECT age, wage FROM person)

See the above queries in action on PostgreSQL in this SQLFiddle.

At this point, it is worth mentioning that few databases actually support…

  • row value expressions, or…
  • quantified comparison predicates with row value expressions

Even if specified in SQL-92, it looks as most databases still take their time to implement this feature 22 years later.

Emulating these predicates with jOOQ

But luckily, there is jOOQ to emulate these features for you. Even if you’re not using jOOQ in your project, the following SQL transformation steps can be useful if you want to express the above predicates. Let’s have a look at how this could be done in MySQL:

-- This predicate
(42, 'John') = ANY (SELECT age, first_name FROM person)

-- ... is the same as this:
EXISTS (
  SELECT 1 FROM person 
  WHERE age = 42 AND first_name = 'John'
)

What about the other predicate?

-- This predicate
(55, 150000.00) > ALL (SELECT age, wage FROM person)

-- ... is the same as these:
----------------------------
-- No quantified comparison predicate with
-- Row value expressions available
(55, 150000.00) > (
  SELECT age, wage FROM person
  ORDER BY 1 DESC, 2 DESC
  LIMIT 1
)

-- No row value expressions available at all
NOT EXISTS (
  SELECT 1 FROM person
  WHERE (55 < age)
  OR    (55 = age AND 150000.00 <= wage)
)

Clearly, the EXISTS predicate can be used in pretty much every database to emulate what we’ve seen before. If you just need this for a one-shot emulation, the above examples will be sufficient. If, however, you want to more formally use <row value expression> and <quantified comparison predicate>, you better get SQL transformation right.

Read on about SQL transformation in this article here.

The Difference Between ROW_NUMBER(), RANK(), and DENSE_RANK()


One of the best features in SQL are window functions. Dimitri Fontaine put it bluntly:

There was SQL before window functions and SQL after window functions

If you’re lucky enough to be using any of these databases, then you can use window functions yourself:

  • CUBRID
  • DB2
  • Firebird
  • Informix
  • Oracle
  • PostgreSQL
  • SQL Server
  • Sybase SQL Anywhere
  • Teradata

(source here)

One of the most obvious and useful set of window functions are ranking functions where rows from your result set are ranked according to a certain scheme. There are three ranking functions:

  • ROW_NUMBER()
  • RANK()
  • DENSE_RANK()

The difference is easy to remember. For the examples, let’s assume we have this table (using PostgreSQL syntax):

CREATE TABLE t(v) AS
SELECT * FROM (
  VALUES('a'),('a'),('a'),('b'),
        ('c'),('c'),('d'),('e')
) t(v)

ROW_NUMBER()

… assigns unique numbers to each row within the PARTITION given the ORDER BY clause. So you’d get:

SELECT v, ROW_NUMBER() OVER()
FROM t

Note that some SQL dialects (e.g. SQL Server) require an explicit ORDER BY clause in the OVER() clause:

SELECT v, ROW_NUMBER() OVER(ORDER BY v)
FROM t

The above query returns:

| V | ROW_NUMBER |
|---|------------|
| a |          1 |
| a |          2 |
| a |          3 |
| b |          4 |
| c |          5 |
| c |          6 |
| d |          7 |
| e |          8 |

(see also this SQLFiddle)

RANK()

… behaves like ROW_NUMBER(), except that “equal” rows are ranked the same. If we substitute RANK() into our previous query:

SELECT v, RANK() OVER(ORDER BY v)
FROM t

… then the result we’re getting is this:

| V | RANK |
|---|------|
| a |    1 |
| a |    1 |
| a |    1 |
| b |    4 |
| c |    5 |
| c |    5 |
| d |    7 |
| e |    8 |

(see also this SQLFiddle)

As you can see, much like in a sports ranking, we have gaps between the different ranks. We can avoid those gaps by using

DENSE_RANK()

Trivially, DENSE_RANK() is a rank with no gaps, i.e. it is “dense”. We can write:

SELECT v, DENSE_RANK() OVER(ORDER BY v)
FROM t

… to obtain

| V | DENSE_RANK |
|---|------------|
| a |          1 |
| a |          1 |
| a |          1 |
| b |          2 |
| c |          3 |
| c |          3 |
| d |          4 |
| e |          5 |

(see also this SQLFiddle)

One interesting aspect of DENSE_RANK() is the fact that it “behaves like” ROW_NUMBER() when we add the DISTINCT keyword.

SELECT DISTINCT v, DENSE_RANK() OVER(ORDER BY v)
FROM t

… to obtain

| V | DENSE_RANK |
|---|------------|
| a |          1 |
| b |          2 |
| e |          5 |
| d |          4 |
| c |          3 |

(see also this SQLFiddle)

In fact, ROW_NUMBER() prevents you from using DISTINCT, because ROW_NUMBER() generates unique values across the partition before DISTINCT is applied:

SELECT DISTINCT v, ROW_NUMBER() OVER(ORDER BY v)
FROM t
ORDER BY 1, 2

DISTINCT has no effect:

| V | ROW_NUMBER |
|---|------------|
| a |          1 |
| a |          2 |
| a |          3 |
| b |          4 |
| c |          5 |
| c |          6 |
| d |          7 |
| e |          8 |

(see also this SQLFiddle)

Putting it all together

A good way to understand the three ranking functions is to see them all in action side-by-side. Run this query

SELECT 
  v, 
  ROW_NUMBER() OVER(ORDER BY v),
  RANK()       OVER(ORDER BY v),
  DENSE_RANK() OVER(ORDER BY v)
FROM t
ORDER BY 1, 2

… or this one (using the SQL standard WINDOW clause, to reuse window specifications):

SELECT 
  v, 
  ROW_NUMBER() OVER(w),
  RANK()       OVER(w),
  DENSE_RANK() OVER(w)
FROM t
WINDOW w AS (ORDER BY v)

… to obtain:

| V | ROW_NUMBER | RANK | DENSE_RANK |
|---|------------|------|------------|
| a |          1 |    1 |          1 |
| a |          2 |    1 |          1 |
| a |          3 |    1 |          1 |
| b |          4 |    4 |          2 |
| c |          5 |    5 |          3 |
| c |          6 |    5 |          3 |
| d |          7 |    7 |          4 |
| e |          8 |    8 |          5 |

(see also this SQLFiddle)

Note that unfortunately, the WINDOW clause is not supported in all databases.

SQL is awesome

These things can be written very easily using SQL window functions. Once you get a hang of the syntax, you won’t want to miss this killer feature in your every day SQL statements any more. Excited?

jOOQ: The best way to use Oracle AQ in Java

For further reading, consider:

The “Free”, “Standard”, “Open” Software Heresy


There are those people that have a strong, dogmatic belief in what they call “Free” or “Standard” or “Open” software. One of those individuals is Jimmie (let’s call him Jimmie in this article) who has responded to an article about Java persistence by Marco Behler on TheServerSide.

Let me cite Jimmie’s response here:

JPA is difficult but complete. It has a learning curve, and you’ll have surprises if you try to shortcut its complexities. But they mostly are there for a reason. Difficult stuff is difficult using JPA, that’s true.

JOOQ is quick to learn. And is proprietary stuff. Not free. Only one implementation. No public review, only one body involved in its evolution. SQL-oriented, not OO (ok, they say it’s a feature).
As a serious professional, learn JPA. Fully. There is no excuse for not knowing which sql queries are generated in your production app. Replacing it with a more basic framework is no solution.

Let’s not go deeply into the concrete difference between JPA and jOOQ / SQL. That topic has been discussed already in lengths on Reddit. Let’s consider the essence of the comparison as perceived by Jimmie. Because, Jimmie would probably say exactly the same thing when comparing

  • JSF with Ext.JS or ZK
  • PostgreSQL with Oracle
  • MS Office or Google Docs (probably OK cause “gratis”) with LibreOffice
  • Linux with Windows or MacOSX (although he might perform some doublethink as a Mac user)

Software not being free

Jimmie, Is YOUR software free and “not proprietary”? If so, how do you finance it? How do you earn a living? And why are you doing it? What really motivates you? What really motivates your customers and why?

Only one implementation

How many people actually do use alternatives to Hibernate and why? Are they using EclipseLink mainly because they used to use TopLink for the last 20 years and the learning curve (or benefit) to switch to Hibernate is too high? How often do you actually switch implementations? What keeps you from implementing the jOOQ API, and open-source its implementation?

And most importantly: Do you always adhere to the JPA API, even if Hibernate has lots of awesome, proprietary extensions that just happen to work so much better / easier?

No public review

Who exactly is “public”, and what are their main interests? Did you know that one of the major driving force for the JDK is Credit Suisse, being a large customer for Oracle in the Java environment, for instance? What is your stake and relation with Credit Suisse as your “public” representative?

Only one body involved in its evolution

Do you say that to YOUR customers also, about your own software as well?

SQL-oriented vs “a serious professional”

What’s not serious about SQL? In fact, SQL is reviewed by more entities than the JLS, let alone the JPA specs. Have you ever thought about that?

More basic

Fair enough. But don’t forget: You probably replaced your sophisticated EJB 2.0 framework (still a standard!) from the early 2000’s by a more basic one, which was (at the time) proprietary, had only one implementation, had no public review, nor multiple bodies involved in its evolution. It was, at the time, called Hibernate. And let me take the opportunity to cite Gavin King (creator of Hibernate) about when to use Hibernate:

gavin-king-on-hibernate

My reply to you, Jimmie

According to you, JPA has to be learned fully. So I challenge you to also FULLY learn SQL, including all the SQL:2011 clauses, including

  • window functions
  • grouping sets
  • common table expressions
  • distinct/match/type/submultiset/unique predicates
  • time periods
  • partitioned outer joins
  • lateral joins
  • standard OFFSET pagination
  • contextually typed value specifications
  • quantified comparison predicates

… and of course all the details of interoperation between SQL and XQuery, one of the most popular aspects of the SQL:2011 standard!

And please, learn this FULLY, regardless of whether these things are part of your specific implementation. Because as a serious professional, you shall fully learn SQL. And while you’re at that, learn also everything about execution plans, and join, fetch, buffer caching, cursor caching and all other sorts of algorithms. Because there is no excuse for not knowing which SQL transformations are generated by your database’s CBO.

I know you like standards, Jimmie. But beware of the fact that there are some people out there who cannot wait for a standard to evolve to solve their problems. They may have more immediate problems. More specific problems. Simpler problems. Problems that might be solved only by proprietary software, so far. Or problems that are solved by proprietary software, that can be put into production with much less effort than your standards, Jimmie.

Lower time-to-market is what your customer might consider “professional”. Not whether this or that tech is used.

Someone always invents something proprietary at some time. It might just evolve into a standard. It might have been a bad idea and not evolve into anything. Or it might evolve into a standard and then be the worst standard ever. See again: EJB 2.0. I think we all agree on that, today.

No, Jimmie, the world isn’t black and white. It isn’t just about standards vs. proprietary. About free (libre) vs. commercial. About free (gratis) vs. “closed”. It’s about creating value for your customer.

Oh, and Jimmie. I sincerely hope you’re neither a Windows, nor a Mac user, because that wouldn’t be free, and there is only one implementation of each OS, and no public review, and only one body involved in their evolutions. And yet, the whole world runs on one of them.

Thanks for your attention, Jimmie.

SQL Tip of the Day: Be Wary of SELECT COUNT(*)


Recently, I’ve encountered this sort of query all over the place at a customer site:

DECLARE
  v_var NUMBER(10);
BEGIN
  SELECT COUNT(*)
  INTO   v_var
  FROM   table1
  JOIN   table2 ON table1.t1_id = table2.t1_id
  JOIN   table3 ON table2.t2_id = table3.t2_id
  ...
  WHERE  some_predicate;

  IF (v_var = 0) THEN
    do_something
  ELSE
    do_something_else
  END IF;
END;

Unfortunately, COUNT(*) is often the first solution that comes to mind when we want to check our relations for some predicate. But COUNT() is expensive, especially if all we’re doing is checking our relations for existence. Does the word ring a bell? Yes, we should use the EXISTS predicate, because if we don’t care about the exact number of records that return true for a given predicate, we shouldn’t go through the complete data set to actually count the exact number. The above PL/SQL block can be rewritten trivially to this one:

DECLARE
  v_var NUMBER(10);
BEGIN
  SELECT CASE WHEN EXISTS (
    SELECT 1
    FROM   table1
    JOIN   table2 ON table1.t1_id = table2.t1_id
    JOIN   table3 ON table2.t2_id = table3.t2_id
    ...
    WHERE  some_predicate
  ) THEN 1 ELSE 0 END
  INTO   v_var
  FROM   dual;

  IF (v_var = 0) THEN
    do_something
  ELSE
    do_something_else
  END IF;
END;

Let’s measure!

Query 1 yields this execution plan:

-----------------------------------------------
| Id  | Operation           | E-Rows | A-Rows |
-----------------------------------------------
|   0 | SELECT STATEMENT    |        |      1 |
|   1 |  SORT AGGREGATE     |      1 |      1 |
|*  2 |   HASH JOIN         |      4 |      4 |
|*  3 |    TABLE ACCESS FULL|      2 |      2 |
|*  4 |    TABLE ACCESS FULL|      6 |      6 |
-----------------------------------------------

Query 2 yields this execution plan:

----------------------------------------------
| Id  | Operation          | E-Rows | A-Rows |
----------------------------------------------
|   0 | SELECT STATEMENT   |        |      1 |
|   1 |  NESTED LOOPS      |      4 |      1 |
|*  2 |   TABLE ACCESS FULL|      2 |      1 |
|*  3 |   TABLE ACCESS FULL|      2 |      1 |
|   4 |  FAST DUAL         |      1 |      1 |
----------------------------------------------

You can ignore the TABLE ACCESS FULL operations, the actual query was executed on a trivial database with no indexes.

What’s essential, however, are the much improved E-Rows values (E = Estimated) and even more importantly the optimal A-Rows values (A = Actual). As you can see, the EXISTS predicate could be aborted early, as soon as the first record that matches the predicate is encountered – in this case immediately.

See this post about more details of how to collect Oracle Execution plans

Conclusion

Whenever you encounter a COUNT(*) operation, you should ask yourself if it is really needed. Do you really need to know the exact number of records that match a predicate? Or are you already happy knowing that any record matches the predicate?

Answer: It’s probably the latter.

Join the No OFFSET Movement!


Markus Winand from Use The Index, Luke! did it again. He started an exciting battle against one the biggest flaws in the SQL language:

No More OFFSET

We’ve blogged about this before. OFFSET pagination is terribly slow, once you reach higher page numbers. Besides, chances are, that your database doesn’t even implement it correctly, yet (and your emulation is probably wrong, too).

Join Markus’s movement for KEYSET pagination, which isn’t only much faster, but also more intuitive. Popular websites like Reddit, Twitter, Facebook and many more already implement keyset pagination. Why don’t you?

jOOQ is the only Java SQL API that already implements KEYSET pagination natively using the synthetic SEEK clause. Here’s how to do it:

DSL.using(configuration)
   .select(PLAYERS.PLAYER_ID,
           PLAYERS.FIRST_NAME,
           PLAYERS.LAST_NAME,
           PLAYERS.SCORE)
   .from(PLAYERS)
   .where(PLAYERS.GAME_ID.eq(42))
   .orderBy(PLAYERS.SCORE.desc(),
            PLAYERS.PLAYER_ID.asc())
   .seek(949, 15) // This jumps to the tuple (949, 15)
   .limit(10)
   .fetch();

Read more about this new movement here: http://use-the-index-luke.com/no-offset

Are You Using SQL PIVOT Yet? You Should!


Every once in a while, we run into these rare SQL issues where we’d like to do something that seems out of the ordinary. One of these things is pivoting rows to columns.

A recent question on Stack Overflow by Valiante asked for precisely this. Going from this table:

+------+------------+----------------+-------------------+
| dnId | propNameId |  propertyName  |   propertyValue   |
+------+------------+----------------+-------------------+
|    1 |         10 | objectsid      | S-1-5-32-548      |
|    1 |         19 | _objectclass   | group             |
|    1 |         80 | cn             | Account Operators |
|    1 |         82 | samaccountname | Account Operators |
|    1 |         85 | name           | Account Operators |
|    2 |         10 | objectsid      | S-1-5-32-544      |
|    2 |         19 | _objectclass   | group             |
|    2 |         80 | cn             | Administrators    |
|    2 |         82 | samaccountname | Administrators    |
|    2 |         85 | name           | Administrators    |
|    3 |         10 | objectsid      | S-1-5-32-551      |
|    3 |         19 | _objectclass   | group             |
|    3 |         80 | cn             | Backup Operators  |
|    3 |         82 | samaccountname | Backup Operators  |
|    3 |         85 | name           | Backup Operators  |
+------+------------+----------------+-------------------+

… we’d like to transform rows into colums as such:

+------+--------------+--------------+-------------------+-------------------+-------------------+
| dnId |  objectsid   | _objectclass |        cn         |  samaccountname   |       name        |
+------+--------------+--------------+-------------------+-------------------+-------------------+
|    1 | S-1-5-32-548 | group        | Account Operators | Account Operators | Account Operators |
|    2 | S-1-5-32-544 | group        | Administrators    | Administrators    | Administrators    |
|    3 | S-1-5-32-551 | group        | Backup Operators  | Backup Operators  | Backup Operators  |
+------+--------------+--------------+-------------------+-------------------+-------------------+

The idea is that we only want one row per distinct dnId, and then we’d like to transform the property-name-value pairs into columns, one column per property name.

Using Oracle or SQL Server PIVOT

The above transformation is actually quite easy with Oracle and SQL Server, which both support the PIVOT keyword on table expressions.

Here is how the desired result can be produced with SQL Server:

SELECT p.*
FROM (
  SELECT dnId, propertyName, propertyValue
  FROM myTable
) AS t
PIVOT(
  MAX(propertyValue)
  FOR propertyName IN (
    objectsid, 
    _objectclass, 
    cn, 
    samaccountname, 
    name
  )
) AS p;

(SQLFiddle here)

And the same query with a slightly different syntax in Oracle:

SELECT p.*
FROM (
  SELECT dnId, propertyName, propertyValue
  FROM myTable
) t
PIVOT(
  MAX(propertyValue)
  FOR propertyName IN (
    'objectsid'      as "objectsid", 
    '_objectclass'   as "_objectclass", 
    'cn'             as "cn", 
    'samaccountname' as "samaccountname", 
    'name'           as "name"
  )
) p;

(SQLFiddle here)

How does it work?

It is important to understand that PIVOT (much like JOIN) is a keyword that is applied to a table reference in order to transform it. In the above example, we’re essentially transforming the derived table t to form the pivot table p. We could take this further and join p to another derived table as so:

SELECT *
FROM (
  SELECT dnId, propertyName, propertyValue
  FROM myTable
) t
PIVOT(
  MAX(propertyValue)
  FOR propertyName IN (
    'objectsid'      as "objectsid", 
    '_objectclass'   as "_objectclass", 
    'cn'             as "cn", 
    'samaccountname' as "samaccountname", 
    'name'           as "name"
  )
) p
JOIN (
  SELECT dnId, COUNT(*) availableAttributes
  FROM myTable
  GROUP BY dnId
) q USING (dnId);

The above query will now allow for finding those rows for which there isn’t a name / value pair in every column. Let’s assume we remove one of the entries from the original table, the above query might now return:

| DNID |    OBJECTSID | _OBJECTCLASS |                CN |    SAMACCOUNTNAME |              NAME | AVAILABLEATTRIBUTES |
|------|--------------|--------------|-------------------|-------------------|-------------------|---------------------|
|    1 | S-1-5-32-548 |        group | Account Operators | Account Operators | Account Operators |                   5 |
|    2 | S-1-5-32-544 |        group |    Administrators |            (null) |    Administrators |                   4 |
|    3 | S-1-5-32-551 |        group |  Backup Operators |  Backup Operators |  Backup Operators |                   5 |

jOOQ also supports the SQL PIVOT clause through its API.

What if I don’t have PIVOT?

In simple PIVOT scenarios, users of other databases than Oracle or SQL Server can write an equivalent query that uses GROUP BY and MAX(CASE ...) expressions as documented in this answer here.

Follow

Get every new post delivered to your Inbox.

Join 1,858 other followers

%d bloggers like this: