Archive by Author | lukaseder

Why You Should NOT Implement Layered Architectures


Abstraction layers in software are what architecture astronauts tell you to do. Instead, however, half of all applications out there would be so easy, fun, and most importantly: productive to implement if you just got rid of all those layers.

Frankly, what do you really need? You need these two:

  • Some data access
  • Some UI

Because that’s the two things that you inevitably have in most systems. Users, and data. Here’s Kyle Boon’s opinion on possible choices that you may have

Very nice choice, Kyle. Ratpack and jOOQ. You could choose any other APIs, of course. You could even choose to write JDBC directly in JSP. Why not. As long as you don’t go pile up 13 layers of abstraction:

That’s all bollocks, you’re saying? We need layers to abstract away the underlying implementation so we can change it? OK, let’s give this some serious thought. How often do you really change the implementation? Some examples:

  • SQL. You hardly change the implementation from Oracle to DB2
  • DBMS. You hardly change the model from relational to flat or XML or JSON
  • JPA. You hardly switch from Hibernate to EclipseLink
  • UI. You simply don’t replace HTML with Swing
  • Transport. You just don’t switch from HTTP to SOAP
  • Transaction layer. You just don’t substitute JavaEE with Spring, or JDBC transactions

Nope. Your architecture is probably set in stone. And if – by the incredible influence of entropy and fate – you happen to have made the wrong decision in one aspect, about 3 years ago, well you’re in for a major refactoring anyway. If SQL was the wrong choice, well good luck to you migrating everything to MongoDB (which is per se the wrong choice again, so prepare for migrating back). If HTML was the wrong choice, well even more tough luck to you. Likelihood of your layers not really helping you when a concrete incident happens: 95% (because you missed an important detail)

Layers = Insurance

If you’re still thinking about implementing an extremely nice layered architecture, ready to deal with pretty much every situation where you simply switch a complete stack with another, then what you’re really doing is filing a dozen insurance policies. Think about it this way. You can get:

  • Legal insurance
  • Third party insurance
  • Reinsurance
  • Business interruption insurance
  • Business overhead expense disability insurance
  • Key person insurance
  • Shipping insurance
  • War risk insurance
  • Payment protection insurance
  • pick a random category

You can pay and pay and pay in advance for things that probably won’t ever happen to you. Will they? Yeah, they might. But if you buy all that insurance, you pay heavily up front. And let me tell you a secret. IF any incident ever happens, chances are that you:

  • Didn’t buy that particular insurance
  • Aren’t covered appropriately
  • Didn’t read the policy
  • Got screwed

And you’re doing exactly that in every application that would otherwise already be finished and would already be adding value to your customer, while you’re still debating if on layer 37 between the business rules and transformation layers, you actually need another abstraction because the rule engine could be switched any time.

Stop doing that

You get the point. If you have infinite amounts of time and money, implement an awesome, huge architecture up front.

Your competitor’s time to market (and fun, on the way) is better than yours. But for a short period of time, you were that close to the perfect, layered architecture!

When the Java 8 Streams API is not Enough


Java 8 was – as always – a release of compromises and backwards-compatibility. A release where the JSR-335 expert group might not have agreed upon scope or feasibility of certain features with some of the audience. See some concrete explanations by Brian Goetz about why …

But today we’re going to focus on the Streams API’s “short-comings”, or as Brian Goetz would probably put it: things out of scope given the design goals.

Parallel Streams?

Parallel computing is hard, and it used to be a pain. People didn’t exactly love the new (now old) Fork / Join API, when it was first shipped with Java 7. Conversely, and clearly, the conciseness of calling Stream.parallel() is unbeatable.

But many people don’t actually need parallel computing (not to be confused with multi-threading!). In 95% of all cases, people would have probably preferred a more powerful Streams API, or perhaps a generally more powerful Collections API with lots of awesome methods on various Iterable subtypes.

Changing Iterable is dangerous, though. Even a no-brainer as transforming an Iterable into a Stream via a potential Iterable.stream() method seems to risk opening pandora’s box!.

Sequential Streams!

So if the JDK doesn’t ship it, we create it ourselves!

Streams are quite awesome per se. They’re potentially infinite, and that’s a cool feature. Mostly – and especially with functional programming – the size of a collection doesn’t really matter that much, as we transform element by element using functions.

If we admit Streams to be purely sequential, then we could have any of these pretty cool methods as well (some of which would also be possible with parallel Streams):

  • cycle() – a guaranteed way to make every stream infinite
  • duplicate() – duplicate a stream into two equivalent streams
  • foldLeft() – a sequential and non-associative alternative to reduce()
  • foldRight() – a sequential and non-associative alternative to reduce()
  • limitUntil() – limit the stream to those records before the first one to satisfy a predicate
  • limitWhile() – limit the stream to those records before the first one not to satisfy a predicate
  • maxBy() – reduce the stream to the maximum mapped value
  • minBy() – reduce the stream to the minimum mapped value
  • partition() – partition a stream into two streams, one satisfying a predicate and the other not satisfying the same predicate
  • reverse() – produce a new stream in inverse order
  • skipUntil() – skip records until a predicate is satisified
  • skipWhile() – skip records as long as a predicate is satisfied
  • slice() – take a slice of the stream, i.e. combine skip() and limit()
  • splitAt() – split a stream into two streams at a given position
  • unzip() – split a stream of pairs into two streams
  • zip() – merge two streams into a single stream of pairs
  • zipWithIndex() – merge a stream with its corresponding stream of indexes into a single stream of pairs

jOOλ’s new Seq type does all that

All of the above is part of jOOλ. jOOλ (pronounced “jewel”, or “dju-lambda”, also written jOOL in URLs and such) is an ASL 2.0 licensed library that emerged from our own development needs when implementing jOOQ integration tests with Java 8. Java 8 is exceptionally well-suited for writing tests that reason about sets, tuples, records, and all things SQL.

But the Streams API just slightly feels insufficient, so we have wrapped JDK’s Streams into our own Seq type (Seq for sequence / sequential Stream):

// Wrap a stream in a sequence
Seq<Integer> seq1 = seq(Stream.of(1, 2, 3));

// Or create a sequence directly from values
Seq<Integer> seq2 = Seq.of(1, 2, 3);

We’ve made Seq a new interface that extends the JDK Stream interface, so you can use Seq fully interoperably with other Java APIs – leaving the existing methods unchanged:

public interface Seq<T> extends Stream<T> {

    /**
     * The underlying {@link Stream} implementation.
     */
    Stream<T> stream();
	
	// [...]
}

Now, functional programming is only half the fun if you don’t have tuples. Unfortunately, Java doesn’t have built-in tuples and while it is easy to create a tuple library using generics, tuples are still second-class syntactic citizens when comparing Java to Scala, for instance, or C# and even VB.NET.

Nonetheless…

jOOλ also has tuples

We’ve run a code-generator to produce tuples of degree 1-8 (we might add more in the future, e.g. to match Scala’s and jOOQ’s “magical” degree 22).

And if a library has such tuples, the library also needs corresponding functions. The essence of these TupleN and FunctionN types is summarised as follows:

public class Tuple3<T1, T2, T3>
implements 
    Tuple, 
	Comparable<Tuple3<T1, T2, T3>>, 
	Serializable, Cloneable {
    
    public final T1 v1;
    public final T2 v2;
    public final T3 v3;
	
	// [...]
}

and

@FunctionalInterface
public interface Function3<T1, T2, T3, R> {

    default R apply(Tuple3<T1, T2, T3> args) {
        return apply(args.v1, args.v2, args.v3);
    }

    R apply(T1 v1, T2 v2, T3 v3);
}

There are many more features in Tuple types, but let’s leave them out for today.

On a side note, I’ve recently had an interesting discussion with Gavin King (the creator of Hibernate) on reddit. From an ORM perspective, Java classes seem like a suitable implementation for SQL / relational tuples, and they are indeed. From an ORM perspective.

But classes and tuples are fundamentally different, which is a very subtle issue with most ORMs – e.g. as explained here by Vlad Mihalcea.

Besides, SQL’s notion of row value expressions (i.e. tuples) is quite different from what can be modelled with Java classes. This topic will be covered in a subsequent blog post.

Some jOOλ examples

With the aforementioned goals in mind, let’s see how the above API can be put to work by example:

zipping

// (tuple(1, "a"), tuple(2, "b"), tuple(3, "c"))
Seq.of(1, 2, 3).zip(Seq.of("a", "b", "c"));

// ("1:a", "2:b", "3:c")
Seq.of(1, 2, 3).zip(
    Seq.of("a", "b", "c"), 
    (x, y) -> x + ":" + y
);

// (tuple("a", 0), tuple("b", 1), tuple("c", 2))
Seq.of("a", "b", "c").zipWithIndex();

// tuple((1, 2, 3), (a, b, c))
Seq.unzip(Seq.of(
    tuple(1, "a"),
    tuple(2, "b"),
    tuple(3, "c")
));

This is already a case where tuples have become very handy. When we “zip” two streams into one, we want a wrapper value type that combines both values. Classically, people might’ve used Object[] for quick-and-dirty solutions, but an array doesn’t indicate attribute types or degree.

Unfortunately, the Java compiler cannot reason about the effective bound of the <T> type in Seq<T>. This is why we can only have a static unzip() method (instead of an instance one), whose signature looks like this:

// This works
static <T1, T2> Tuple2<Seq<T1>, Seq<T2>> 
    unzip(Stream<Tuple2<T1, T2>> stream) { ... }
	
// This doesn't work:
interface Seq<T> extends Stream<T> {
    Tuple2<Seq<???>, Seq<???>> unzip();
}

Skipping and limiting

// (3, 4, 5)
Seq.of(1, 2, 3, 4, 5).skipWhile(i -> i < 3);

// (3, 4, 5)
Seq.of(1, 2, 3, 4, 5).skipUntil(i -> i == 3);

// (1, 2)
Seq.of(1, 2, 3, 4, 5).limitWhile(i -> i < 3);

// (1, 2)
Seq.of(1, 2, 3, 4, 5).limitUntil(i -> i == 3);

Other functional libraries probably use different terms than skip (e.g. drop) and limit (e.g. take). It doesn’t really matter in the end. We opted for the terms that are already present in the existing Stream API: Stream.skip() and Stream.limit()

Folding

// "abc"
Seq.of("a", "b", "c").foldLeft("", (u, t) -> t + u);

// "cba"
Seq.of("a", "b", "c").foldRight("", (t, u) -> t + u);

The Stream.reduce() operations are designed for parallelisation. This means that the functions passed to it must have these important attributes:

But sometimes, you really want to “reduce” a stream with functions that do not have the above attributes, and consequently, you probably don’t care about the reduction being parallelisable. This is where “folding” comes in.

A nice explanation about the various differences between reducing and folding (in Scala) can be seen here.

Splitting

// tuple((1, 2, 3), (1, 2, 3))
Seq.of(1, 2, 3).duplicate();

// tuple((1, 3, 5), (2, 4, 6))
Seq.of(1, 2, 3, 4, 5, 6).partition(i -> i % 2 != 0)

// tuple((1, 2), (3, 4, 5))
Seq.of(1, 2, 3, 4, 5).splitAt(2);

The above functions all have one thing in common: They operate on a single stream in order to produce two new streams, that can be consumed independently.

Obviously, this means that internally, some memory must be consumed to keep buffers of partially consumed streams. E.g.

  • duplication needs to keep track of all values that have been consumed in one stream, but not in the other
  • partitioning needs to fast forward to the next value that satisfies (or doesn’t satisfy) the predicate, without losing all the dropped values
  • splitting might need to fast forward to the split index

For some real functional fun, let’s have a look at a possible splitAt() implementation:

static <T> Tuple2<Seq<T>, Seq<T>> 
splitAt(Stream<T> stream, long position) {
    return seq(stream)
          .zipWithIndex()
          .partition(t -> t.v2 < position)
          .map((v1, v2) -> tuple(
              v1.map(t -> t.v1),
              v2.map(t -> t.v1)
          ));
}

… or with comments:

static <T> Tuple2<Seq<T>, Seq<T>> 
splitAt(Stream<T> stream, long position) {
    // Add jOOλ functionality to the stream
    // -> local Type: Seq<T>
    return seq(stream)
	
    // Keep track of stream positions
    // with each element in the stream
    // -> local Type: Seq<Tuple2<T, Long>>
          .zipWithIndex()
	  
    // Split the streams at position
    // -> local Type: Tuple2<Seq<Tuple2<T, Long>>,
    //                       Seq<Tuple2<T, Long>>>
          .partition(t -> t.v2 < position)
		  
    // Remove the indexes from zipWithIndex again
    // -> local Type: Tuple2<Seq<T>, Seq<T>>
          .map((v1, v2) -> tuple(
              v1.map(t -> t.v1),
              v2.map(t -> t.v1)
          ));
}

Nice, isn’t it? A possible implementation for partition(), on the other hand, is a bit more complex. Here trivially with Iterator instead of the new Spliterator:

static <T> Tuple2<Seq<T>, Seq<T>> partition(
        Stream<T> stream, 
        Predicate<? super T> predicate
) {
    final Iterator<T> it = stream.iterator();
    final LinkedList<T> buffer1 = new LinkedList<>();
    final LinkedList<T> buffer2 = new LinkedList<>();

    class Partition implements Iterator<T> {

        final boolean b;

        Partition(boolean b) {
            this.b = b;
        }

        void fetch() {
            while (buffer(b).isEmpty() && it.hasNext()) {
                T next = it.next();
                buffer(predicate.test(next)).offer(next);
            }
        }

        LinkedList<T> buffer(boolean test) {
            return test ? buffer1 : buffer2;
        }

        @Override
        public boolean hasNext() {
            fetch();
            return !buffer(b).isEmpty();
        }

        @Override
        public T next() {
            return buffer(b).poll();
        }
    }

    return tuple(
        seq(new Partition(true)), 
        seq(new Partition(false))
    );
}

I’ll let you do the exercise and verify the above code.

Get and contribute to jOOλ, now!

All of the above is part of jOOλ, available for free from GitHub. There is already a partially Java-8-ready, full-blown library called functionaljava, which goes much further than jOOλ.

Yet, we believe that all what’s missing from Java 8’s Streams API is really just a couple of methods that are very useful for sequential streams.

In a previous post, we’ve shown how we can bring lambdas to String-based SQL using a simple wrapper for JDBC (of course, we still believe that you should use jOOQ instead).

Today, we’ve shown how we can write awesome functional and sequential Stream processing very easily, with jOOλ.

Stay tuned for even more jOOλ goodness in the near future (and pull requests are very welcome, of course!)

Look no Further! The Final Answer to “Where to Put Generated Code?”


This recent question on Stack Overflow made me think.

Why does jOOQ suggest to put generated code under “/target” and not under “/src”?

… and I’m about to give you the final answer to “Where to Put Generated Code?”

This isn’t only about jOOQ

Even if you’re not using jOOQ, or if you’re using jOOQ but without the code generator, there might be some generated source code in your project. There are many tools that generate source code from other data, such as:

  • The Java compiler (ok, byte code, not strictly source code. But still code generation)
  • XJC, from XSD files
  • Hibernate from .hbm.xml files, or from your schema
  • Xtend translates Xtend code to Java code
  • You could even consider data transformations, like XSLT
  • many more…

In this article, we’re going to look at how to deal with jOOQ-generated code, but the same thoughts apply also to any other type of code generated from other code or data.

Now, the very very interesting strategic question that we need to ask ourselves is: Where to put that code? Under version control, like the original data? Or should we consider generated code to be derived code that must be re-generated all the time?

The answer is nigh…

It depends!

Nope, unfortunately, as with many other flame-wary discussions, this one doesn’t have a completely correct or wrong answer, either. There are essentially two approaches:

Considering generated code as part of your code base

When you consider generated code as part of your code base, you will want to:

  • Check in generated sources in your version control system
  • Use manual source code generation
  • Possibly use even partial source code generation

This approach is particularly useful when your Java developers are not in full control of or do not have full access to your database schema (or your XSD or your Java code, etc.), or if you have many developers that work simultaneously on the same database schema, which changes all the time. It is also useful to be able to track side-effects of database changes, as your checked-in database schema can be considered when you want to analyse the history of your schema.

With this approach, you can also keep track of the change of behaviour in the jOOQ code generator, e.g. when upgrading jOOQ, or when modifying the code generation configuration.

When you use this approach, you will treat your generated code as an external library with its own lifecycle.

The drawback of this approach is that it is more error-prone and possibly a bit more work as the actual schema may go out of sync with the generated schema.

Considering generated code as derived artefacts

When you consider generated code to be derived artefacts, you will want to:

  • Check in only the actual DDL, i.e. the “original source of truth” (e.g. controlled via Flyway)
  • Regenerate jOOQ code every time the schema changes
  • Regenerate jOOQ code on every machine – including continuous integration machines, and possibly, if you’re crazy enough, on production

This approach is particularly useful when you have a smaller database schema that is under full control by your Java developers, who want to profit from the increased quality of being able to regenerate all derived artefacts in every step of your build.

This approach is fully supported by Maven, for instance, which foresees special directories (e.g. target/generated-sources), and phases (e.g. <phase>generate-sources</phase>) specifically for source code generation.

The drawback of this approach is that the build may break in perfectly “acceptable” situations, when parts of your database are temporarily unavailable.

Pragmatic approach

Some of you might not like that answer, but there is also a pragmatic approach, a combination of both. You can consider some code as part of your code base, and some code as derived. For instance, jOOQ-meta’s generated sources (used to query the dictionary views / INFORMATION_SCHEMA when generating jOOQ code) are put under version control as few jOOQ contributors will be able to run the jOOQ-meta code generator against all supported databases. But in many integration tests, we re-generate the sources every time to be sure the code generator works correctly.

Huh!

Conclusion

I’m sorry to disappoint you. There is no final answer to whether one approach or the other is better. Pick the one that offers you more value in your specific situation.

In case you’re choosing your generated code to be part of the code base, read this interesting experience report on the jOOQ User Group by Witold Szczerba about how to best achieve this.

Stop Manually Importing Your ERD Export into jOOQ


ERD (Entity Relationship Diagrams) are a great way of designing and visualising your database model. There is a variety of vendors offering free and commercial ERD tools. Vertabelo by E-Point is a SaaS product where you can design and manage your database schema online. For instance, the jOOQ example database can be modelled as such:

jOOQ Sample Database

jOOQ Sample Database

.

The most important aspect of such an ERD tool, however, is its import / export functionality. Not only can an existing schema be reverse-engineered, but you can also export it in SQL or XML format. This is great news for early adopters of the upcoming jOOQ 3.5, which will finally support importing file-based schema definitions, e.g. using an XML representation of your INFORMATION_SCHEMA.

XSLT to the rescue

In “modern” times where people put JSON everywhere and XML has become evil, people might have forgotten the power of XML and its tool chain. In this case, it will be very trivial to transform the Vertabelo export format into the jOOQ import format:

Export

<Tables>
    <Table Id="t1">
        <Name>LANGUAGE</Name>
        <Description></Description>
        <Columns>
            <Column Id="c1">
                <Name>ID</Name>
                <Type>integer</Type>
                <Nullable>false</Nullable>
                <PK>true</PK>
            </Column>
            <!-- ... -->

A full export file can be seen here.

Import

<information_schema>
    <schemata>
        <schema>
            <schema_name>PUBLIC</schema_name>
        </schema>
    </schemata>
    <tables>
        <table>
            <table_schema>PUBLIC</table_schema>
            <table_name>LANGUAGE</table_name>
        </table>
        <!-- ... -->
    </tables>
    <columns>
        <column>
            <table_schema>PUBLIC</table_schema>
            <table_name>LANGUAGE</table_name>
            <column_name>ID</column_name>
            <data_type>integer</data_type>
            <ordinal_position>1</ordinal_position>
            <is_nullable>false</is_nullable>
        </column>
        <!-- ... -->
    </columns>
    <table_constraints>
        <table_constraint>
            <constraint_schema>PUBLIC</constraint_schema>
            <constraint_name>PK_LANGUAGE</constraint_name>
            <constraint_type>PRIMARY KEY</constraint_type>
            <table_schema>PUBLIC</table_schema>
            <table_name>LANGUAGE</table_name>
        </table_constraint>
        <!-- ... -->

While the Vertabelo export format organised hierarchically, the jOOQ import format is a flat XML representation of the relevant SQL standard INFORMATION SCHEMA tables, as implemented by a variety of databases such as H2, HSQLDB, MySQL, PostgreSQL, SQL Server. The transformation can be done easily with XSLT, looking something like this:

<xsl:template match="/">
    <xsl:key name="schema" 
        match="/DatabaseModel/Tables/Table/Properties/Property[Name = 'Schema']" use="." />

    <information_schema xmlns="http://www.jooq.org/xsd/jooq-meta-3.5.0.xsd">
        <schemata>
            <xsl:apply-templates 
                select="/DatabaseModel/Tables/Table/Properties/Property[Name = 'Schema'][generate-id() = generate-id(key('schema', .)[1])]" 
                mode="schema"/>
        </schemata>
        
        <tables>
            <xsl:apply-templates 
                select="/DatabaseModel/Tables/Table" 
                mode="table"/>
        </tables>
        
        <columns>
            <xsl:apply-templates 
                select="/DatabaseModel/Tables/Table/Columns/Column" 
                mode="column"/>
        </columns>

<!-- ... -->
<xsl:template match="Table" mode="table">
    <table>
        <table_schema>
            <xsl:value-of select="Properties/Property[Name = 'Schema']/Value"/>
        </table_schema>
        <table_name>
            <xsl:value-of select="Name"/>
        </table_name>
    </table>
</xsl:template>

<xsl:template match="Column" mode="column">
    <xsl:variable name="Id" select="@Id"/>
    
    <column>
        <table_schema>
            <xsl:value-of select="ancestor::Table/Properties/Property[Name = 'Schema']/Value"/>
        </table_schema>
        <table_name>
            <xsl:value-of select="ancestor::Table/Name"/>
        </table_name>
        <column_name>
            <xsl:value-of select="Name"/>
        </column_name>
        <data_type>
            <xsl:value-of select="Type"/>
        </data_type>
        <ordinal_position>
            <xsl:value-of select="1 + count(preceding-sibling::Column)"/>
        </ordinal_position>
        <is_nullable>
            <xsl:value-of select="Nullable"/>
        </is_nullable>
    </column>
</xsl:template>

The full XSL file can be seen here.

Configuring Maven

All we need now is to put the Vertabelo export file somewhere in src/main/resources, and transform it with the Codehaus xml-maven-plugin like so:

<!-- The XSLT plugin transforming XML files prior to code generation -->
<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>xml-maven-plugin</artifactId>
    <executions>
        <execution>
            <phase>generate-sources</phase>
            <goals>
                <goal>transform</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <transformationSets>
            <transformationSet>
                <dir>src/main/resources</dir>
                <includes>
                    <include>vertabelo-export.xml</include>
                </includes>
                <stylesheet>src/main/resources/vertabelo-2-jooq.xsl</stylesheet>
            </transformationSet>
        </transformationSets>
    </configuration>
</plugin>

The output is then available to the jOOQ code generator in the target directory:

<plugin>
    <groupId>org.jooq</groupId>
    <artifactId>jooq-codegen-maven</artifactId>
    <version>${org.jooq.version}</version>

    <executions>
        <execution>
            <id>generate-h2</id>
            <phase>generate-sources</phase>
            <goals>
                <goal>generate</goal>
            </goals>
            <configuration>
                <generator>
                    <name>org.jooq.util.DefaultGenerator</name>
                    <database>
                        <!-- We're using the new jOOQ 3.5 XMLDatabase here -->
                        <name>org.jooq.util.xml.XMLDatabase</name>
                        <properties>
                            <property>
                                <key>dialect</key>
                                <value>H2</value>
                            </property>
                            <property>
                                <key>xml-file</key>
                                <value>target/generated-resources/xml/xslt/vertabelo-export.xml</value>
                            </property>
                        </properties>
                        <inputSchema>PUBLIC</inputSchema>
                    </database>
                    <generate>
                        <deprecated>false</deprecated>
                        <instanceFields>true</instanceFields>
                    </generate>
                    <target>
                        <packageName>org.jooq.example.db.h2</packageName>
                        <directory>target/generated-sources/jooq-h2</directory>
                    </target>
                </generator>
            </configuration>
        </execution>
    </executions>
</plugin>

That’s it!

This obviously works with other ERD tools than Vertabelo as well. It should be just as easy to write your own XSL file for your own tool. All you need to do is generate valid XML according to the new http://www.jooq.org/xsd/jooq-meta-3.5.0.xsd schema.

See the pom.xml file here, for details, or download the full example project from GitHub.

Awesome SQL Trick: Constraints on Views


CHECK constraints are already pretty great when you want to sanitize your data. But there are some limitations to CHECK constraints, including the fact that they are applied to the table itself, when sometimes, you want to specify constraints that only apply in certain situations.

This can be done with the SQL standard WITH CHECK OPTION clause, which is implemented by at least Oracle and SQL Server. Here’s how to do that:

CREATE TABLE books (
  id    NUMBER(10)         NOT NULL,
  title VARCHAR2(100 CHAR) NOT NULL,
  price NUMBER(10, 2)      NOT NULL,
  
  CONSTRAINT pk_book PRIMARY KEY (id)
);
/

CREATE VIEW expensive_books
AS
SELECT id, title, price
FROM books
WHERE price > 100
WITH CHECK OPTION;
/

INSERT INTO books 
VALUES (1, '1984', 35.90);

INSERT INTO books 
VALUES (
  2, 
  'The Answer to Life, the Universe, and Everything',
  999.90
);

As you can see, expensive_books are all those books whose price is more than 100.00. This view will only report the second book:

SELECT * FROM expensive_books;

The above query yields:

ID TITLE                                       PRICE
-- ----------------------------------------- -------
 2 The Answer to Life, the Universe, and ...   999.9 

But now, that we have that CHECK OPTION, we can also prevent users from inserting “expensive books” that aren’t really expensive. For instance, let’s run this query:

INSERT INTO expensive_books 
VALUES (3, '10 Reasons why jOOQ is Awesome', 9.99);

This query won’t work now. We’re getting:

ORA-01402: view WITH CHECK OPTION where-clause violation

We also cannot update any of the “expensive books” to be non-expensive:

UPDATE expensive_books
SET price = 9.99;

This query results in the same ORA-01402 error message.

Inline WITH CHECK OPTION

In case you need to locally prevent bogus data from being inserted into a table, you can also use inline WITH CHECK OPTION clauses like so:

INSERT INTO (
  SELECT *
  FROM expensive_books
  WHERE price > 1000
  WITH CHECK OPTION
) really_expensive_books
VALUES (3, 'Modern Enterprise Software', 999.99);

And the above query again resutls in an ORA-01402 error.

Using SQL transformation to generate ad-hoc constraints

While CHECK OPTION is very useful for stored views, which can have proper grants for those users that may not access the underlying table directly, the inline CHECK OPTION is mainly useful when you transform dynamic SQL in an intermediate SQL transformation layer in your applciation.

This can be done with jOOQ’s SQL transformation capabilities, for instance, where you can watch out for a certain table in your SQL statements, and then centrally prevent bogus DML from being executed. This is a great way to implement multi-tenancy, if your database doesn’t natively support row-level security.

Stay tuned for a future blog post explaining how to transform your SQL with jOOQ to implement row-level security for any database.

jOOQ Newsletter: September 02, 2014 – Do You Really Need Support?


Do you really need support?

Our apologies. We hadn’t realised that we didn’t advertise the support-free jOOQ licenses, which we had been offering for quite a while now well enough on our website. So we have fixed that now.

We think that jOOQ is such a high quality, intuitive piece of software with a vibrant community that our customers might not even need us at Data Geekery to support them! That is why we have been offering support-less subscriptions where customers get to use the jOOQ Professional Edition or the jOOQ Enterprise Edition for 20% less than if they had our guaranteed reaction times.

All you need to do is enter the “NO SUPPORT” discount code with your next purchase, and start coding. More details here. Note that this will only remove our support guarantees, not the warranty. All upgrades and bugfixes are still included.

And while we’re at it, if you’re planning on purchasing 10 licenses or more, please contact us to learn about our high-volume tiered pricing model to further increase the value you’re getting out of jOOQ.

Data Geekery 1 Year Anniversary

Hooraay!

One year ago, on August 15 2013, Data Geekery GmbH was founded to provide commercial licensing and support for jOOQ. We’ve had exciting times behind us, and even more exciting times ahead of us. Here’s a quick wrap-up of what happend in the last year:

  • 2013-08-15: Data Geekery enters the Zurich trade register
  • 2013-10-09: jOOQ 3.2 is released under the new dual licensing strategy
  • 2013-10-29: jOOQ gets roughly 10% votes on this InfoQ poll
  • 2013-12-18: We’re having the 8th conference or JUG talk about jOOQ
  • 2014-12-31: Data Geekery is profitable. A Happy New Year, indeed!
  • 2014-01-01: Our monthly downloads have recovered from dual licensing
  • 2014-01-17: Our articles reach 1M reads on DZone
  • 2014-02-14: jOOQ 3.3 is released with Keyset pagination support
  • 2014-02-19: The 200th Stack Overflow question about jOOQ was asked
  • 2014-05-21: jOOQ is referenced from the RebelLabs reports
  • 2014-06-12: We’re having the 21st conference or JUG talk about jOOQ
  • 2014-06-20: jOOQ 3.4 is released with CTE, transactions, and DDL support
  • 2014-06-23: The 500th GitHub Star was added
  • 2014-07-01: Our monthly downloads have doubled compared to last year
  • 2014-08-08: The 400th blog post was published bringing the 650’000th hit

So, what’s next?

jOOQ is a big success story. Many minor frameworks by other “data geeks” copy jOOQ’s approach to writing internal domain-specific languages for a subset of SQL or of another query language. Examples are:

Being the industry’s leading type safe embedded SQL API, we’re going to continue pushing embedded SQL in Java, and SQL in general. Stay tuned for a very exciting second year of Data Geekery!

Tweet of the Day

Our customers, users, and followers are sharing their love for jOOQ with the world and we can hardly catch up with them! Here are:

Thanks for the shouts, guys! You make the jOOQ experience rock!

SQL Zone – The Dreaded COUNT(*) Function

COUNT(*) seems to be a practical way for many SQL developers to ensure that there is exactly one result record. No more, no less. But often, if you want exactly one record, you can achieve the same thing using a CASE expression along with anEXISTS predicate, which is likely to be much faster than the COUNT(*) alternative, because you probably don’t care about the exact number of records, only about the existence of such records.

Does that sound too abstract? Read this article here, and decide for yourself, if you find potential for optimisation in your code.

SQL Zone – Constraints on Views

If you’re using Oracle or SQL Server (or another standards-compliant database), you can put constraints (“CHECK OPTIONS”) on your database views. This can be extremely useful when you want to prevent users from inserting data into views that don’t match the view itself. Take this view for instance:

CREATE VIEW expensive_books
AS
SELECT id, title, price
FROM books
WHERE price > 100
WITH CHECK OPTION;

This view will not allow you to insert any books with a price lower than 100, because of the CHECK OPTION. An incredibly useful feature that will also be supported by the upcoming jOOQ 3.5.

Read this blog post for more information.

Upcoming Events

After a summer break, we’re back on the road!

Have you missed any of our previous jOOQ talks? Soon you’ll get another chance to hear us talk about jOOQ or SQL in general in any of these upcoming events:

Stay informed about 2014 events on www.jooq.org/news.

PL/SQL backtraces for debugging


For many PL/SQL developers, this might be common sense, but for one of our customers, this was an unknown PL/SQL feature: Backtraces.

When your application raises an error somewhere deep down in the call stack, you don’t get immediate information about the exact source of the error. For large PL/SQL applications, this can be a pain. One workaround is to keep track of statement numbers that were last executed before any error occurred:

DECLARE
  v_statement_no := 0;
BEGIN
  v_statement_no := 1;
  SELECT ...

  v_statement_no := 2;
  INSERT ...

  v_statement_no := 3;
  ...
EXCEPTION
  WHEN OTHERS THEN
    -- Log error message somewhere
    logger.error(module, v_statement_no, sqlerrm);
END;

The above looks an awful lot like println-debugging, a thing that isn’t really known to Java developers ;-)

But println-debugging isn’t necessary in PL/SQL either. Use the DBMS_UTILITY.FORMAT_ERROR_BACKTRACE function, instead! An example:

DECLARE
  PROCEDURE p4 IS BEGIN
    raise_application_error(-20000, 'Some Error');
  END p4;
  PROCEDURE p3 IS BEGIN
    p4;
  END p3;
  PROCEDURE p2 IS BEGIN
    p3;
  END p2;
  PROCEDURE p1 IS BEGIN
    p2;
  END p1;

BEGIN
  p1;
EXCEPTION
  WHEN OTHERS THEN
    dbms_output.put_line(sqlerrm);
    dbms_output.put_line(
      dbms_utility.format_error_backtrace
    );
END;
/

The above PL/SQL block generates the following output:

ORA-20000: Some Error
ORA-06512: at line 3
ORA-06512: at line 6
ORA-06512: at line 9
ORA-06512: at line 12
ORA-06512: at line 16

You can see exactly what line number generated the error. If you’re not using local procedures in anonymous blocks (which you quite likely aren’t), this gets even more useful:

CREATE PROCEDURE p4 IS BEGIN
  raise_application_error(-20000, 'Some Error');
END p4;
/
CREATE PROCEDURE p3 IS BEGIN
  p4;
END p3;
/
CREATE PROCEDURE p2 IS BEGIN
  p3;
END p2;
/
CREATE PROCEDURE p1 IS BEGIN
  p2;
END p1;
/

BEGIN
  p1;
EXCEPTION
  WHEN OTHERS THEN
    dbms_output.put_line(sqlerrm);
    dbms_output.put_line(
      dbms_utility.format_error_backtrace
    );
END;
/

The above now outputs:

ORA-20000: Some Error
ORA-06512: at "PLAYGROUND.P4", line 2
ORA-06512: at "PLAYGROUND.P3", line 2
ORA-06512: at "PLAYGROUND.P2", line 2
ORA-06512: at "PLAYGROUND.P1", line 2
ORA-06512: at line 2

To learn more about the DBMS_UTILITY package, please consider the manual. True to the nature of all things called “UTILITY”, it really contains pretty much random things that you wouldn’t expect there :-)

Integrating jOOQ with PostgreSQL: Partitioning


Introduction

jOOQ is a great framework when you want to work with SQL in Java without having too much ORM in your way. At the same time, it can be integrated into many environments as it is offering you support for many database-specific features. One such database-specific feature is partitioning in PostgreSQL. Partitioning in PostgreSQL is mainly used for performance reasons because it can improve query performance in certain situations. jOOQ has no explicit support for this feature but it can be integrated quite easily as we will show you.

This article is brought to you by the Germany based jOOQ integration partner UWS Software Service (UWS). UWS is specialised in custom software development, application modernisation and outsourcing with a distinct focus on the Java Enterprise ecosystem.

Partitioning in PostgreSQL

With the partitioning feature of PostgreSQL you have the possibility of splitting data that would form a huge table into multiple separate tables. Each of the partitions is a normal table which inherits its columns and constraints from a parent table. This so-called table inheritance can be used for “range partitioning” where, for example, the data from one range does not overlap the data from another range in terms of identifiers, dates or other criteria.

Like in the following example, you can have partitioning for a table “author” that shares the same foreign-key of a table “authorgroup” in all its rows.

CREATE TABLE author (
  authorgroup_id int,
  LastName varchar(255)
);

CREATE TABLE author_1 (
  CONSTRAINT authorgroup_id_check_1
    CHECK ((authorgroup_id = 1))
) INHERITS (author);

CREATE TABLE author_2 (
  CONSTRAINT authorgroup_id_check_2
    CHECK ((authorgroup_id = 2))
) INHERITS (author);

...

As you can see, we set up inheritance and – in order to have a simple example – we just put one constraint checking that the partitions have the same “authorgroup_id”. Basically, this results in the “author” table only containing table and column definitions, but no data. However, when querying the “author” table, PostgreSQL will really query all the inheriting “author_n” tables returning a combined result.

A trivial approach to using jOOQ with partitioning

In order to work with the partitioning described above, jOOQ offers several options. You can use the default way which is to let jOOQ generate one class per table. In order to insert data into multiple tables, you would have to use different classes. This approach is used in the following snippet:

// add
InsertQuery query1 = dsl.insertQuery(AUTHOR_1);
query1.addValue(AUTHOR_1.ID, 1);
query1.addValue(AUTHOR_1.LAST_NAME, "Nowak");
query1.execute();

InsertQuery query2 = dsl.insertQuery(AUTHOR_2);
query2.addValue(AUTHOR_2.ID, 1);
query2.addValue(AUTHOR_2.LAST_NAME, "Nowak");
query2.execute();

// select
Assert.assertTrue(dsl
    .selectFrom(AUTHOR_1)
    .where(AUTHOR_1.LAST_NAME.eq("Nowak"))
    .fetch().size() == 1);

Assert.assertTrue(dsl
    .selectFrom(AUTHOR_2)
    .where(AUTHOR_2.LAST_NAME.eq("Nowak"))
    .fetch().size() == 1);

You can see that multiple classes generated by jOOQ need to be used, so depending on how many partitions you have, generated classes can pollute your codebase. Also, imagine that you eventually need to iterate over partitions, which would be cumbersome to do with this approach. Another approach could be that you use jOOQ to build fields and tables using string manipulation but that is error prone again and prevents support for generic type safety. Also, consider the case where you want true data separation in terms of multi-tenancy.

You see that there are some considerations to do when working with partitioning. Fortunately jOOQ offers various ways of working with partitioned tables, and in the following we’ll compare approaches, so that you can choose the one most suitable for you.

Using jOOQ with partitioning and multi-tenancy

JOOQ’s runtime-schema mapping is often used to realize database environments, such that for example during development, one database is queried but when deployed to production, the queries are going to another database. Multi-tenancy is another recommended use case for runtime-schema mapping as it allows for strict partitioning and for configuring your application to only use databases or tables being configured in the runtime-schema mapping. So running the same code would result in working with different databases or tables depending on the configuration, which allows for true separation of data in terms of multi-tenancy.

The following configuration taken from the jOOQ documentation is executed when creating the DSLContext so it can be considered a system-wide setting:

Settings settings = new Settings()
  .withRenderMapping(new RenderMapping()
  .withSchemata(
      new MappedSchema().withInput("DEV")
                        .withOutput("MY_BOOK_WORLD")
                        .withTables(
      new MappedTable().withInput("AUTHOR")
                       .withOutput("AUTHOR_1"))));

// Add the settings to the Configuration
DSLContext create = DSL.using(
  connection, SQLDialect.ORACLE, settings);

// Run queries with the "mapped" configuration
create.selectFrom(AUTHOR).fetch();

// results in SQL:
// “SELECT * FROM MY_BOOK_WORLD.AUTHOR_1”

Using this approach you can map one table to one partition permanently eg. “AUTHOR” to “AUTHOR_1” for environment “DEV”. In another environment you could choose to map “AUTHOR” table to “AUTHOR_2”.

Runtime-schema mapping only allows you to map to exactly one table on a per-query basis, so you could not handle the use case where you would want to manipulate more than one table partition. If you would like to have more flexibility you might want to consider the next approach.

Using jOOQ with partitioning and without multi-tenancy

If you need to handle multiple table partitions without having multi-tenancy, you need a more flexible way of accessing partitions. The following example shows how you can do it in a dynamic and type safe way, avoiding errors and being usable in the same elegant way you are used to by jOOQ:

// add
for(int i=1; i<=2; i++) {
  Builder part = forPartition(i);
  InsertQuery query = dsl.insertQuery(part.table(AUTHOR));
  query.addValue(part.field(AUTHOR.ID), 1);
  query.addValue(part.field(AUTHOR.LAST_NAME), "Nowak");
  query.execute();
}

// select

for(int i=1; i<=2; i++) {
  Builder part = forPartition(i);
  Assert.assertTrue(dsl
      .selectFrom(part.table(AUTHOR))
      .where(part.field(AUTHOR.LAST_NAME).eq("Nowak"))
      .fetch()
      .size() == 1);
}

What you can see above is that the partition numbers are abstracted away so that you can use “AUTHOR” table instead of “AUTHOR_1”. Thus, your code won’t be polluted with many generated classes. Another thing is that the partitioner object is initialized dynamically so you can use it for example in a loop like above. Also it follows the Builder pattern so that you can operate on it like you are used to by jOOQ.

The code above is doing exactly the same as the first trivial snippet, but there are multiple benefits like type safe and reusable access to partitioned tables.

Integration of jOOQ partitioning without multi-tenancy into a Maven build process (optional)

If you are using Continuous-Integration you can integrate the solution above so that jOOQ is not generating tables for the partitioned tables. This can be achieved using a regular expression that excludes certain table names when generating Java classes. When using Maven, your integration might look something like this:

<generator>
  <name>org.jooq.util.DefaultGenerator</name>
  <database>
    <name>org.jooq.util.postgres.PostgresDatabase</name>
    <includes>.*</includes>
    <excludes>.*_[0-9]+</excludes>
    <inputSchema>${db.schema}</inputSchema>
  </database>
  <target>
    <packageName>com.your.company.jooq</packageName>
    <directory>target/generated-sources/jooq</directory>
  </target>
</generator>

Then it’s just calling mvn install and jOOQ maven plugin will be generating the database schema in compilation time.

Integrating jOOQ with PostgreSQL: Partitioning

This article described how jOOQ in combination with the partitioning feature of PostgreSQL can be used to implement multi-tenancy and improve database performance. PostgreSQL’s documentation states that for partitioning “the benefits will normally be worthwhile only when a table would otherwise be very large. The exact point at which a table will benefit from partitioning depends on the application, although a rule of thumb is that the size of the table should exceed the physical memory of the database server.”

Achieving support for partitioning with jOOQ is as easy as adding configuration or a small utility class, jOOQ is then able to support partitioning with or without multi-tenancy and without sacrificing type safety. Apart from Java-level integration, the described solution also smoothly integrates into your build and test process.

You may want to look at the sources of the partitioner utility class which also includes a test-class so that you can see the behavior and integration in more detail.

Please let us know if you need support for this or other jOOQ integrations within your environment. UWS Software Service (UWS) is an official jOOQ integration partner.

All You Ever Need to Know About Recursive SQL


Oracle SYNONYMs are a great feature. You can implement all sorts of backwards-compatibility tweaks simply by creating SYNONYMs in your database. Consider the following schema:

CREATE TABLE my_table (col NUMBER(7));

CREATE SYNONYM my_table_old FOR my_table;
CREATE SYNONYM my_table_bak FOR my_table_old;

Now you can query your same old table through three different names, it’ll all result in the same output:

SELECT * FROM my_table;

-- Same thing:
SELECT * FROM my_table_old;
SELECT * FROM my_table_bak;

The trouble is, when you see my_table_bak in code (or some even more obfuscated name), do you immediately know what it really is?

Use this query to find out

We can use the ALL_SYNONYMS table to figure this one out. This query will already give a simple overview:

SELECT *
FROM   ALL_SYNONYMS
WHERE  TABLE_OWNER = 'PLAYGROUND'

The output is:

OWNER       SYNONYM_NAME  TABLE_OWNER  TABLE_NAME
---------------------------------------------------
PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE_OLD
PLAYGROUND  MY_TABLE_OLD  PLAYGROUND   MY_TABLE

But as you can see, this is boring, because we have transitive synonyms in there and I don’t want to go through the complete table to figure out that MY_TABLE_BAK -> MY_TABLE_OLD -> MY_TABLE.

So let’s use CONNECT BY!

Oracle (as well as Informix and CUBRID) have this awesome CONNECT BY clause for hierarchical SQL. There is also the possibility to express hierarchical SQL using the more powerful common table expressions, if you dare.

But let’s see how we can transitively resolve our tables. Here’s how:

SELECT 
  s.OWNER,
  s.SYNONYM_NAME,

  -- Get to the root of the hierarchy
  CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER,
  CONNECT_BY_ROOT s.TABLE_NAME  TABLE_NAME
FROM       ALL_SYNONYMS s
WHERE      s.TABLE_OWNER = 'PLAYGROUND'

-- The magic CONNECT BY clause!
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME

First off, there is CONNECT BY, which allows to “connect” hierarchies by their hierarchical predecessors. On each level of the hierarchy, we’ll connect the TABLE_NAME with its previous (“PRIOR”) SYNONYM_NAME. This will recurse as long as the chain doesn’t end (or if it runs into a cycle).

What’s also interesting is the CONNECT_BY_ROOT keyword, which, for each path through the hierarchy, displays the root of the path. In our case, that’s the target TABLE_NAME.

The output can be seen here:

OWNER       SYNONYM_NAME  TABLE_OWNER  TABLE_NAME
---------------------------------------------------
PLAYGROUND  MY_TABLE_OLD  PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE_OLD <-- Useless

If you’re confused by the records that are displayed, just add the LEVEL pseudo-column to display the recursion level:

SELECT

  -- Add level here
  LEVEL,
  s.OWNER,
  s.SYNONYM_NAME,
  CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER,
  CONNECT_BY_ROOT s.TABLE_NAME  TABLE_NAME
FROM       ALL_SYNONYMS s
WHERE      s.TABLE_OWNER = 'PLAYGROUND'
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME
LEVEL  OWNER       SYNONYM_NAME  TABLE_OWNER  TABLE_NAME
----------------------------------------------------------
1      PLAYGROUND  MY_TABLE_OLD  PLAYGROUND   MY_TABLE
2      PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE
1      PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE_OLD
^^^^^^
  Awesome!

Getting rid of “bad records” using START WITH

As you can see, some of the results are now synonyms pointing directly to the target table, whereas the last record still points to an intermediate element from the synonym path. This is because we’re recursing into the path hierarchies from every record in the table, also from the “intermediate” synonym references, whose TABLE_NAME is yet another synonym.

Let’s get rid of those as well, using the optional START WITH clause, which allows to limit tree traversals to those trees whose roots fulfil a given predicate:

SELECT 
  s.OWNER,
  s.SYNONYM_NAME,
  CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER,
  CONNECT_BY_ROOT s.TABLE_NAME  TABLE_NAME
FROM       ALL_SYNONYMS s
WHERE      s.TABLE_OWNER = 'PLAYGROUND'
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME

-- Start recursing only from non-synonym objects
START WITH EXISTS (
  SELECT 1
  FROM   ALL_OBJECTS
  WHERE  s.TABLE_OWNER           = ALL_OBJECTS.OWNER
  AND    s.TABLE_NAME            = ALL_OBJECTS.OBJECT_NAME
  AND    ALL_OBJECTS.OWNER       = 'PLAYGROUND'
  AND    ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM'
)

So, essentially, we’re requiring the TABLE_NAME to be any object from ALL_OBJECTS that is in our schema, but not a SYNONYM. (yes, synonyms work for all objects, including procedures, packages, types, etc.)

Running the above query gets us the desired result:

OWNER       SYNONYM_NAME  TABLE_OWNER  TABLE_NAME
---------------------------------------------------
PLAYGROUND  MY_TABLE_OLD  PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK  PLAYGROUND   MY_TABLE

What about PUBLIC synonyms?

Most often, you will not use local synonyms, though, but PUBLIC ones. Oracle has this quirky PUBLIC pseudo-schema, in which you cannot create objects, but in which you can create synonyms. So, let’s create some more synonyms for backwards-compatibility purposes:

CREATE PUBLIC SYNONYM my_table_bak2 FOR my_table_bak;
CREATE SYNONYM bak_backup_old FOR my_table_bak2;

Unfortunately, this will break our chain, because for some reason only Oracle and the Oracle of Delphi knows, PUBLIC is well reported as a OWNER of the synonym, but not as the TABLE_OWNER. Let’s see some raw data with:

SELECT *
FROM   ALL_SYNONYMS
WHERE  TABLE_OWNER = 'PLAYGROUND'

… and thus:

OWNER       SYNONYM_NAME    TABLE_OWNER  TABLE_NAME
------------------------------------------------------
PLAYGROUND  MY_TABLE_OLD    PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK    PLAYGROUND   MY_TABLE_OLD
PUBLIC      MY_TABLE_BAK2   PLAYGROUND   MY_TABLE_BAK
PLAYGROUND  BAK_BACKUP_OLD  PLAYGROUND   MY_TABLE_BAK2 <-- Not PUBLIC

As you can see, the PUBLIC SYNONYM MY_TABLE_BAK2 is reported to be in the PLAYGROUND schema! This breaks recursion, of course. We’re missing a record:

OWNER       SYNONYM_NAME    TABLE_OWNER  TABLE_NAME
------------------------------------------------------
PLAYGROUND  MY_TABLE_OLD    PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK    PLAYGROUND   MY_TABLE
PUBLIC      MY_TABLE_BAK2   PLAYGROUND   MY_TABLE <-- Hmm?

In order to work around this issue, we’ll have to tweak our original data set. Any object reported as (TABLE_OWNER, TABLE_NAME) might in fact be a synonym called ('PUBLIC', TABLE_NAME). The trick is thus to simply duplicate all input data as such:

SELECT 
  s.OWNER,
  s.SYNONYM_NAME,
  CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER,
  CONNECT_BY_ROOT s.TABLE_NAME  TABLE_NAME

-- Tweaked data set
FROM (
  SELECT OWNER, SYNONYM_NAME, TABLE_OWNER, TABLE_NAME
  FROM ALL_SYNONYMS
  UNION ALL
  SELECT OWNER, SYNONYM_NAME, 'PUBLIC', TABLE_NAME
  FROM ALL_SYNONYMS
) s

-- Add the synthetic PUBLIC TABLE_OWNER as well
WHERE      s.TABLE_OWNER IN (
  'PLAYGROUND', 'PUBLIC'
)
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME
START WITH EXISTS (
  SELECT 1
  FROM   ALL_OBJECTS
  WHERE  s.TABLE_OWNER           = ALL_OBJECTS.OWNER
  AND    s.TABLE_NAME            = ALL_OBJECTS.OBJECT_NAME
  AND    ALL_OBJECTS.OWNER       = 'PLAYGROUND'
  AND    ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM'
)

There it is, our missing record!

OWNER       SYNONYM_NAME    TABLE_OWNER  TABLE_NAME
---------------------------------------------------
PLAYGROUND  MY_TABLE_OLD    PLAYGROUND   MY_TABLE
PLAYGROUND  MY_TABLE_BAK    PLAYGROUND   MY_TABLE
PUBLIC      MY_TABLE_BAK2   PLAYGROUND   MY_TABLE
PLAYGROUND  BAK_BACKUP_OLD  PLAYGROUND   MY_TABLE <-- Yep!

Displaying the hierarchy

There is also a quirky function called SYS_CONNECT_BY_PATH, which can be used to actually display the whole hierarchy in a string form (VARCHAR2, with max 4000 characters!). Here’s how:

SELECT 

-- Magic function
  SUBSTR(
    sys_connect_by_path(
         s.TABLE_OWNER
      || '.'
      || s.TABLE_NAME, ' <- '
    ) || ' <- '
      || s.OWNER
      || '.'
      || s.SYNONYM_NAME, 5
  )
FROM (
  SELECT OWNER, SYNONYM_NAME, TABLE_OWNER, TABLE_NAME
  FROM ALL_SYNONYMS
  UNION ALL
  SELECT OWNER, SYNONYM_NAME, 'PUBLIC', TABLE_NAME
  FROM ALL_SYNONYMS
) s
WHERE      s.TABLE_OWNER IN (
  'PLAYGROUND', 'PUBLIC'
)
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME
START WITH EXISTS (
  SELECT 1
  FROM   ALL_OBJECTS
  WHERE  s.TABLE_OWNER           = ALL_OBJECTS.OWNER
  AND    s.TABLE_NAME            = ALL_OBJECTS.OBJECT_NAME
  AND    ALL_OBJECTS.OWNER       = 'PLAYGROUND'
  AND    ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM'
)

The above query will now output the following records:

PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD
PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD <- PLAYGROUND.MY_TABLE_BAK
PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD <- PLAYGROUND.MY_TABLE_BAK <- PUBLIC.MY_TABLE_BAK2
PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD <- PLAYGROUND.MY_TABLE_BAK <- PUBLIC.MY_TABLE_BAK2 <- PLAYGROUND.BAK_BACKUP_OLD

Impressive, eh?

Remark: In case you have stale synonyms

If you have “stale” synonyms, i.e. synonyms that point to nowhere, Oracle may report them to be pointing to themselves. That’s unfortunate and creates a CYCLE in CONNECT BY. To prevent this from happening, simply add another predicate like so:

SELECT 
  SUBSTR(
    sys_connect_by_path(
         s.TABLE_OWNER
      || '.'
      || s.TABLE_NAME, ' <- '
    ) || ' <- '
      || s.OWNER
      || '.'
      || s.SYNONYM_NAME, 5
  )
FROM (
  SELECT * FROM (
    SELECT OWNER, SYNONYM_NAME, TABLE_OWNER, TABLE_NAME
    FROM ALL_SYNONYMS
    UNION ALL
    SELECT OWNER, SYNONYM_NAME, 'PUBLIC', TABLE_NAME
    FROM ALL_SYNONYMS
  ) s

  -- Add this predicate to prevent cycles
  WHERE (s.OWNER       , s.SYNONYM_NAME)
    != ((s.TABLE_OWNER , s.TABLE_NAME))
) s
CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER
AND        s.TABLE_NAME  = PRIOR s.SYNONYM_NAME
START WITH EXISTS (
  SELECT 1
  FROM   ALL_OBJECTS
  WHERE  s.TABLE_OWNER           = ALL_OBJECTS.OWNER
  AND    s.TABLE_NAME            = ALL_OBJECTS.OBJECT_NAME
  AND    ALL_OBJECTS.OWNER       = 'PLAYGROUND'
  AND    ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM'
)

Can the above query be written in jOOQ?

Yes of course. In jOOQ, pretty much everything is possible, if you can write it in SQL. Here’s how we use a query similar to the above to resolve Oracle Synonmys in the jOOQ code generator:

// Some reusable variables
AllObjects o   = ALL_OBJECTS;
AllSynonyms s1 = ALL_SYNONYMS;
AllSynonyms s2 = ALL_SYNONYMS.as("s2");
AllSynonyms s3 = ALL_SYNONYMS.as("s3");

Field<String> dot = inline(".");
String arr = " <- ";

// The actual qeury
DSL
.using(configuration)
.select(
  s3.OWNER,
  s3.SYNONYM_NAME,
  connectByRoot(s3.TABLE_OWNER).as("TABLE_OWNER"),
  connectByRoot(s3.TABLE_NAME).as("TABLE_NAME"),
  substring(
    sysConnectByPath(
      s3.TABLE_OWNER.concat(dot)
                    .concat(s3.TABLE_NAME), 
      arr
    )
    .concat(arr)
    .concat(s3.OWNER)
    .concat(dot)
    .concat(s3.SYNONYM_NAME), 
    5
  ))
.from(
  select()
  .from(
    select(
      s1.OWNER, s1.SYNONYM_NAME, 
      s1.TABLE_OWNER, s1.TABLE_NAME)
    .from(s1)
    .union(
    select(
      s1.OWNER, s1.SYNONYM_NAME, 
      inline("PUBLIC"), s1.TABLE_NAME)
    .from(s1))
    .asTable("s2"))
  .where(row(s2.OWNER, s2.SYNONYM_NAME)
         .ne(s2.TABLE_OWNER, s2.TABLE_NAME))
  .asTable("s3"))
.connectBy(s3.TABLE_OWNER.eq(prior(s3.OWNER)))
.and(s3.TABLE_NAME.eq(prior(s3.SYNONYM_NAME)))
.startWith(exists(
  selectOne()
  .from(o)
  .where(s3.TABLE_OWNER.eq(o.OWNER))
  .and(s3.TABLE_NAME.eq(o.OBJECT_NAME))
  .and(o.OBJECT_TYPE.ne("SYNONYM"))
  .and(o.OWNER.in(getInputSchemata()))
))
.fetch();

Download jOOQ today and try it yourself!

Conclusion

If you have an intrinsically hierarchical data set, then you will be very unhappy with these simplistic hierarchical SQL features (also with commont table expressions). They don’t perform very well, and they’re very hard to express if hierarchies get more complex. So you may as well consider using an actual graph database like Neo4j

But every now and then, a little hierarchy may sneak into your otherwise “standard” relational data model. When it does, be sure to have this useful CONNECT BY clause ready for action.

CONNECT BY is supported by (at least):

  • CUBRID
  • Informix
  • Oracle

Recursive common table expressions (the SQL standard’s counterpart for CONNECT BY are supported by (at least):

  • DB2
  • Firebird
  • HSQLDB
  • Oracle
  • PostgreSQL
  • SQL Server
  • Sybase SQL Anywhere

and…

  • H2 has some experimental support

In a future post, we’re going to be looking into how to do the same thing with recursive CTE.

jOOQ Newsletter: August 15, 2014 – jOOQ 3.5 Outlook


Subscribe to this newsletter here

jOOQ 3.5 Outlook

We’re working hard on the next release. Already 90 issues for jOOQ 3.5 are closed and counting! Today, we’re going to look at the highlights of what will be implemented in the next, exciting minor release, due for Q4 2014:

  • Support for new databases

    Our customers have been asking us for support of the Informix and Oracle TimesTen databases. While Informix is a very popular (and also old!) database, still widely used in the Industry, Oracle TimesTen is a promising new in-memory database with a very similar syntax to that of Oracle.

    With these two new additions, jOOQ will now support 18 RDBMS!

  • File-based code generation support

    This has been on our roadmap for a very long time, and finally we’re tackling it! If your development workflow prevents you from accessing a database during code generation, you can now also supply database meta information in XML format. We chose XML over any other format as it will be very easy to transform arbitrary pre-existing formats using XSLT (e.g. Hibernate hbm.xml, or ERD tools like Vertabelo‘s export format).

    We’re really looking forward to going live with this awesome feature, and in seeing a variety of community-contributed XSLT pop up, to help you integrate jOOQ with your favourite database schema definition format.

  • TypeProviders

    Sophisticated databases like PostgreSQL ship with a large variety of vendor-specific data types. It’s hard for jOOQ to support them all natively, but why add native support, when we can add another awesome SPI?

    TypeProviders will allow for abstracting over the “<T>” type, jOOQ’s column type. This will go far beyond data type conversion, it will allow you to specify how jOOQ will bind your user type to JDBC completely transparently.

These are just a few major features that we’ll be including in jOOQ 3.5, along with a lot of minor ones – so stay tuned for more jOOQ goodness.

Data Geekery Partner Network

Isaac Newton coined it like no one else:

If I have seen further it is by standing on ye sholders of Giants.

At Data Geekery, we’re looking into seeing further with jOOQ as well as we are now starting to offer and recommend services to the jOOQ ecosystem through our trusted integration partners. Today, we’re happy to recommend to you:


Germany based UWS Software Service (UWS) specialises in custom software development, application modernisation and outsourcing with a distinct focus on the Java Enterprise ecosystem.

UWS has successfully integrated the jOOQ Open-Source Edition with a variety of enterprise software projects. Their offering include custom jOOQ integrations into your system landscape and migration solutions from JDBC and/or JPA to jOOQ. UWS further offers development of custom enterprise applications using jOOQ.


“Almost every performance problem is caused by excessive use of ORM tools or improper indexing.”

Markus Winand specialises in these topics and provides SQL training and tuning services for developers. “It is difficult to tell Java developers to use SQL when Hibernate is not the right tool for a particular query” Winand said, and continued “JDBC is just too cumbersome and dangerous. jOOQ makes SQL in Java simple and safe—now I can show people how to get best of both worlds.”


 

Tweet of the Day

Our customers, users, and followers are sharing their love for jOOQ with the world and we can hardly catch up with them! Here are:

Majid Azimi, who is writing SQL like a boss with jOOQ

Christoph Henkelmann, who Has found the most awesome of all stacks to build great web applications. And that consists of Ninjaframework, jOOQ, BoneCP – Slim, Fast, Reliable. We couldn’t have said it any better, ourselves.

Nat Pryce, who simply loves doing SQL queries with jOOQ in Java 8.

Thanks for the shouts, guys! You make the jOOQ experience rock!

SQL Zone – Keyset Pagination

Markus Winand, author of Use The Index, Luke! has recently started a promotion against OFFSET pagination, in favour of keyset pagination, which he called #NoOffset.

We’ve blogged about this ourselves, before. Most people make use of OFFSET pagination because it is the default that is supported by almost all RDBMS.

In many cases, however, you do not need to paginate using OFFSETs, which can turn out to be very slow for large results and large offsets. Keyset pagination is very useful when you want to implement infinite scrolling, like Twitter, Facebook, etc.

jOOQ is one of the few APIs, and the only Java API that natively support keyset pagination.

SQL Zone – PIVOT your data

Every now and then, you have one of those fancy reporting problems where SQL just fits in perfectly. We’ve blogged about it: Are You Using PIVOT Yet?

With the Oracle and SQL Server PIVOT clause, it is very easy to flip rows and columns in a table. Imagine you have a table like this:

+------+----------------+-------------------+
| dnId |  propertyName  |   propertyValue   |
+------+----------------+-------------------+
|    1 | objectsid      | S-1-5-32-548      |
|    1 | _objectclass   | group             |
|    1 | cn             | Account Operators |
|    1 | samaccountname | Account Operators |
|    1 | name           | Account Operators |
|    2 | objectsid      | S-1-5-32-544      |
|    2 | _objectclass   | group             |
|    2 | cn             | Administrators    |
|    2 | samaccountname | Administrators    |
|    2 | name           | Administrators    |
|    3 | objectsid      | S-1-5-32-551      |
|    3 | _objectclass   | group             |
|    3 | cn             | Backup Operators  |
|    3 | samaccountname | Backup Operators  |
|    3 | name           | Backup Operators  |
+------+----------------+-------------------+

And now, you’d like to transform this table to the below:

+------+--------------+--------------+-------------------+-----
| dnId |  objectsid   | _objectclass |        cn         | ... 
+------+--------------+--------------+-------------------+-----
|    1 | S-1-5-32-548 | group        | Account Operators | ... 
|    2 | S-1-5-32-544 | group        | Administrators    | ... 
|    3 | S-1-5-32-551 | group        | Backup Operators  | ... 
+------+--------------+--------------+-------------------+-----

This is a piece of cake using PIVOT:

SELECT p.*
FROM (
  SELECT dnId, propertyName, propertyValue
  FROM myTable
) AS t
PIVOT(
  MAX(propertyValue)
  FOR propertyName IN (
    objectsid, 
    _objectclass, 
    cn, 
    samaccountname, 
    name
  )
) AS p;

jOOQ natively supports the PIVOT clause, which is definitely one of those tools to have on every reporting SQL developer’s tool chain. Read more about it here and here (original source on Stack Overflow).

Follow

Get every new post delivered to your Inbox.

Join 1,955 other followers

%d bloggers like this: