An Ingenious Workaround to Emulate an Application of Union Types in Java

Before I move on with the actual article, I’d like to give credit to Daniel Dietrich, author of the awesome Javaslang library, who has had the idea before me:

Contravariant Generic Bounds

It all started with a tweet:

I wanted to do something like pattern-matching a common super type of a set of types, along the lines of:

<T super T1 | T2 | ... | TN>

Note that what I really wanted is support for union types, not intersection types as I originally claimed.

Why did I want to do that? Because it would be a nice addition to the jOOλ library, which features typesafe tuples for Java:

class Tuple3<T1, T2, T3> {
    final T1 v1;
    final T2 v2;
    final T3 v3;

    // Lots of useful stuff here
}

What would be nice in a tuple is something like a forEach() method that iterates over all attributes:

tuple(1, "a", null).forEach(System.out::println);

The above would simply yield:

1
a
null

Now, what would this forEach() method’s argument type be? It would have to look like this:

class Tuple3<T1, T2, T3> {
    void forEach(Consumer<? super T1 | T2 | T3> c) {}
}

The consumer would receive an object that is of type T1 or T2 or T3. But a consumer that accepts a common super type of the previous three types is OK as well. For example, if we have:

Tuple2<Integer, Long> tuple = tuple(1, 2L);
tuple.forEach(v -> 
    System.out.println(v.doubleValue()));

The above would compile, because Number (or, more formally, Number & Comparable<?> is a common super type of Integer and Long, and it contains a doubleValue() method.

As soon as we’re adding e.g. a String to the tuple, the following will no longer compile:

Tuple3<Integer, Long, String> tuple = 
    tuple(1, 2L, "A");

// Doesn't compile
tuple.forEach((Number v) -> 
    System.out.println(v.doubleValue()));

Unfortunately, this is not possible in Java

Java currently supports union types (see also algebraic data types) only for exception catch blocks, where you can write things like:

interface X {
    default void print() {}
}
class X1 extends RuntimeException implements X {}
class X2 extends RuntimeException implements X {}

// With the above
try {
    ...
}
catch (X1 | X2 e) {
    // This compiles for the same reasons!
    e.print();
}

But unfortunately, catch blocks are the only place in Java that allows for using properties of union types.

This is where Daniel’s clever and cunning workaround comes into play. We can write a static method that performs some “pattern-matching” (if you squint) using generics, the other way round:

static <
    T, 
    T1 extends T, 
    T2 extends T, 
    T3 extends T
> 
void forEach(
    Tuple3<T1, T2, T3> tuple, 
    Consumer<? super T> consumer
) {
    consumer.accept(tuple.v1);
    consumer.accept(tuple.v2);
    consumer.accept(tuple.v3);
}

The above can now be used typesafely to infer the common super type(s) of T1, T2, and T3:

Tuple2<Integer, Long> t = tuple(1, 2L);
forEach(t, c -> {
    System.out.println(c.doubleValue());
});

yielding, as expected:

1.0
2.0

It makes sense, because the generic type constraints are simply specified “the other way round”, i.e. when T1 extends T, forcibly, T super T1

If you squint really hard ;-)

This technique is supposedly used by Daniel in Javaslang’s upcoming pattern matching API. We’re looking forward to seeing that in action!

Did you like this article?

Read also the following ones:

How to Pattern-Match Files and Display Adjacent Lines in Java

Recently, we’ve published our article about the awesome window function support in jOOλ 0.9.9, which I believe is some of the best additions to the library that we’ve ever done.

Today, we’ll look into an awesome application of window functions in a use-case that is inspired by this Stack Overflow question Sean Nguyen:

How to get lines before and after matching from java 8 stream like grep?

I have a text files that have a lot of string lines in there. If I want to find lines before and after a matching in grep, I will do like this:

grep -A 10 -B 10 "ABC" myfile.txt

How can I implement the equivalent in java 8 using streams?

So the question is:

How can I implement the equivalent in Java 8 using streams?

jOOλ - The Missing Parts in Java 8 jOOλ improves the JDK libraries in areas where the Expert Group's focus was elsewhere.Well, the unix shell and its various “pipable” commands are about the only thing that are even more awesome (and mysterious) than window functions. Being able to grep for a certain string in a file, and then display a “window” of a couple of lines is quite useful.

With jOOλ 0.9.9, however, we can do that very easily in Java 8 as well. Consider this little snippet:

Seq.seq(Files.readAllLines(Paths.get(
        new File("/path/to/Example.java").toURI())))
   .window()
   .filter(w -> w.value().contains("ABC"))
   .forEach(w -> {
       System.out.println();
       System.out.println("-1:" + w.lag().orElse(""));
       System.out.println(" 0:" + w.value());
       System.out.println("+1:" + w.lead().orElse(""));
       // ABC: Just checking
   });

This program will output:

-1: .window()
 0: .filter(w -> w.value().contains("ABC"))
+1: .forEach(w -> {

-1:     System.out.println("+1:" + w.lead().orElse(""));
 0:     // ABC: Just checking
+1: });

So, I’ve run the program on itself and I’ve found all the lines that match “ABC”, plus the previous lines (“lagging” / lag()) and the following lines (leading / lead()). These lead() and lag() functions work just like their SQL equivalents.

But unlike SQL, composing functions in Java (or other general purpose languages) is a bit simpler as there is less syntax clutter involved. We can easily do aggregations over a window frame to collect a generic amount of lines “lagging” and “leading” a match. Consider the following alternative:

int lower = -5;
int upper =  5;
        
Seq.seq(Files.readAllLines(Paths.get(
        new File("/path/to/Example.java").toURI())))
   .window(lower, upper)
   .filter(w -> w.value().contains("ABC"))
   .map(w -> w.window()
              .zipWithIndex()
              .map(t -> tuple(t.v1, t.v2 + lower))
              .map(t -> (t.v2 > 0 
                       ? "+" 
                       : t.v2 == 0 
                       ? " " : "") 
                       + t.v2 + ":" + t.v1)
              .toString("\n"))

And the output that we’re getting is this:

-5:int upper =  5;
-4:        
-3:Seq.seq(Files.readAllLines(Paths.get(
-2:        new File("/path/to/Example.java").toURI())))
-1:   .window(lower, upper)
 0:   .filter(w -> w.value().contains("ABC"))
+1:   .map(w -> w.window()
+2:              .zipWithIndex()
+3:              .map(t -> tuple(t.v1, t.v2 + lower))
+4:              .map(t -> (t.v2 > 0 
+5:                       ? "+" 

Could it get any more concise? I don’t think so. Most of the logic above was just generating the index next to the line.

Conclusion

Window functions are extremely powerful. The recent discussion on reddit about our previous article on jOOλ’s window function support has shown that other languages also support primitives to build similar functionality. But usually, these building blocks aren’t as concise as the ones exposed in jOOλ, which are inspired by SQL.

With jOOλ mimicking SQL’s window functions, there is only little cognitive friction when composing powerful operations on in memory data streams.

Learn more about window functions in these articles here:

2016 Will be the Year Remembered as When Java Finally Had Window Functions!

You heard right. Up until now, the awesome window functions were a feature uniquely reserved to SQL. Even sophisticated functional programming languages still seem to lack this beautiful functionality (correct me if I’m wrong, Haskell folks).

We’ve written tons of blog posts about window functions, evangelising them to our audience, in articles like:

One of my favourite example use-cases for window functions is the running total. I.e. to get from the following bank account transaction table:

| ID   | VALUE_DATE | AMOUNT |
|------|------------|--------|
| 9997 | 2014-03-18 |  99.17 |
| 9981 | 2014-03-16 |  71.44 |
| 9979 | 2014-03-16 | -94.60 |
| 9977 | 2014-03-16 |  -6.96 |
| 9971 | 2014-03-15 | -65.95 |

… to this one, with a calculated balance:

| ID   | VALUE_DATE | AMOUNT |  BALANCE |
|------|------------|--------|----------|
| 9997 | 2014-03-18 |  99.17 | 19985.81 |
| 9981 | 2014-03-16 |  71.44 | 19886.64 |
| 9979 | 2014-03-16 | -94.60 | 19815.20 |
| 9977 | 2014-03-16 |  -6.96 | 19909.80 |
| 9971 | 2014-03-15 | -65.95 | 19916.76 |

With SQL, this is a piece of cake. Observe the usage of SUM(t.amount) OVER(...):

SELECT
  t.*,
  t.current_balance - NVL(
    SUM(t.amount) OVER (
      PARTITION BY t.account_id
      ORDER BY     t.value_date DESC,
                   t.id         DESC
      ROWS BETWEEN UNBOUNDED PRECEDING
           AND     1         PRECEDING
    ),
  0) AS balance
FROM     v_transactions t
WHERE    t.account_id = 1
ORDER BY t.value_date DESC,
         t.id         DESC

How do window functions work?

(don’t forget to book our SQL Masterclass to learn about window functions, and much more!)

Despite the sometimes a bit scary syntax, window functions are really very easy to understand. Windows are “views” of the data produced in your FROM / WHERE / GROUP BY / HAVING clauses. They allow you to access all the other rows relative to the current row, while you calculate something in your SELECT clause (or rarely, in your ORDER BY clause). What the above statement really does is this:

| ID   | VALUE_DATE |  AMOUNT |  BALANCE |
|------|------------|---------|----------|
| 9997 | 2014-03-18 | -(99.17)|+19985.81 |
| 9981 | 2014-03-16 | -(71.44)| 19886.64 |
| 9979 | 2014-03-16 |-(-94.60)| 19815.20 |
| 9977 | 2014-03-16 |   -6.96 |=19909.80 |
| 9971 | 2014-03-15 |  -65.95 | 19916.76 |

I.e. for any given balance, subtract from the current balance the SUM()OVER()” the window of all the rows that are in the same partition as the current row (same bank account), and that are strictly “above” the current row.

Or, in detail:

  • PARTITION BY specifies “OVER()” which rows the window spans
  • ORDER BY specifies how the window is ordered
  • ROWS specifies what ordered row indexes should be considered

Can we do this with Java collections?

jOOλ - The Missing Parts in Java 8 jOOλ improves the JDK libraries in areas where the Expert Group's focus was elsewhere.Yes we can! If you’re using jOOλ: A completely free Open Source, Apache 2.0 licensed library that we designed because we thought that the JDK 8 Stream and Collector APIs just don’t do it.

When Java 8 was designed, a lot of focus went into supporting parallel streams. That’s nice but certainly not the only useful area where functional programming can be applied. We’ve created jOOλ to fill this gap – without implementing an all new, alternative collections API, such as Javaslang or functional java have.

jOOλ already provides:

  1. Tuple types
  2. More useful stuff for ordered, sequential-only streams

With the recently released jOOλ 0.9.9, we’ve added two main new features:

  1. Tons of new Collectors
  2. Window functions

The many missing collectors in the JDK

The JDK ships with a couple of collectors, but they do seem awkward and verbose, and no one really appreciates writing collectors like the ones exposed in this Stack Overflow question (and many others).

But the use case exposed in the linked question is a very valid one. You want to aggregate several things from a list of person:

public class Person {
    private String firstName;
    private String lastName;
    private int age;
    private double height;
    private double weight;
    // getters / setters

Assuming you have this list:

List<Person> personsList = new ArrayList<Person>();

personsList.add(new Person("John", "Doe", 25, 1.80, 80));
personsList.add(new Person("Jane", "Doe", 30, 1.69, 60));
personsList.add(new Person("John", "Smith", 35, 174, 70));

You now want to get the following aggregations:

  • Number of people
  • Max age
  • Min height
  • Avg weight

This is a ridiculous problem for anyone used to writing SQL:

SELECT count(*), max(age), min(height), avg(weight)
FROM person

Done. How hard can it be in Java? It turns out that a lot of glue code needs to be written with vanilla JDK 8 API. Consider the sophisticated answers given

With jOOλ 0.9.9, solving this problem becomes ridiculously trivial again, and it reads almost like SQL:

Tuple result =
Seq.seq(personsList)
   .collect(
       count(),
       max(Person::getAge),
       min(Person::getHeight),
       avg(Person::getWeight)
   );

System.out.println(result);

And the result yields:

(3, Optional[35], Optional[1.69], Optional[70.0])

Note that this isn’t running a query against a SQL database (that’s what jOOQ is for). We’re running this “query” against an in-memory Java collection.

OK ok, that’s already awesome. Now what about window functions?

Right, the title of this article didn’t promise trivial aggregation stuff. It promised the awesome window functions.

Yet, window functions are nothing else than aggregations (or rankings) on a subset of your data stream. Instead of aggregating all of the stream (or table) into a single record, you want to maintain the original records, and provide the aggregation on each individual record directly.

A nice introductory example for window functions is the one provided in this article that explains the difference between ROW_NUMBER(), RANK(), and DENSE_RANK(). Consider the following PostgreSQL query:

SELECT
  v, 
  ROW_NUMBER() OVER(w),
  RANK()       OVER(w),
  DENSE_RANK() OVER(w)
FROM (
  VALUES('a'),('a'),('a'),('b'),
        ('c'),('c'),('d'),('e')
) t(v)
WINDOW w AS (ORDER BY v);

It yields:

| V | ROW_NUMBER | RANK | DENSE_RANK |
|---|------------|------|------------|
| a |          1 |    1 |          1 |
| a |          2 |    1 |          1 |
| a |          3 |    1 |          1 |
| b |          4 |    4 |          2 |
| c |          5 |    5 |          3 |
| c |          6 |    5 |          3 |
| d |          7 |    7 |          4 |
| e |          8 |    8 |          5 |

The same can be done in Java 8 using jOOλ 0.9.9

System.out.println(
    Seq.of("a", "a", "a", "b", "c", "c", "d", "e")
       .window(naturalOrder())
       .map(w -> tuple(
            w.value(),
            w.rowNumber(),
            w.rank(),
            w.denseRank()
       ))
       .format()
);

Yielding…

+----+----+----+----+
| v0 | v1 | v2 | v3 |
+----+----+----+----+
| a  |  0 |  0 |  0 |
| a  |  1 |  0 |  0 |
| a  |  2 |  0 |  0 |
| b  |  3 |  3 |  1 |
| c  |  4 |  4 |  2 |
| c  |  5 |  4 |  2 |
| d  |  6 |  6 |  3 |
| e  |  7 |  7 |  4 |
+----+----+----+----+

Again, do note that we’re not running any queries against a database. Everything is done in memory.

Notice two things:

  • jOOλ’s window functions return 0-based ranks, as is expected for Java APIs, as opposed to SQL, which is all 1-based.
  • In Java, it is not possible to construct ad-hoc records with named columns. That’s unfortunate, and I do hope a future Java will provide support for such language features.

Let’s review what happens exactly in the code:

System.out.println(

    // This is just enumerating our values
    Seq.of("a", "a", "a", "b", "c", "c", "d", "e")

    // Here, we specify a single window to be
    // ordered by the value T in the stream, in
    // natural order
       .window(naturalOrder())

    // The above window clause produces a Window<T>
    // object (the w here), which exposes...
       .map(w -> tuple(

    // ... the current value itself, of type String...
            w.value(),

    // ... or various rankings or aggregations on
    // the above window.
            w.rowNumber(),
            w.rank(),
            w.denseRank()
       ))

    // Just some nice formatting to produce the table
       .format()
);

That’s it! Easy, isn’t it?

We can do more! Check this out:

System.out.println(
    Seq.of("a", "a", "a", "b", "c", "c", "d", "e")
       .window(naturalOrder())
       .map(w -> tuple(
            w.value(),   // v0 
            w.count(),   // v1
            w.median(),  // v2
            w.lead(),    // v3
            w.lag(),     // v4
            w.toString() // v5
       ))
       .format()
);

What does the above yield?

+----+----+----+---------+---------+----------+
| v0 | v1 | v2 | v3      | v4      | v5       |
+----+----+----+---------+---------+----------+
| a  |  1 | a  | a       | {empty} | a        |
| a  |  2 | a  | a       | a       | aa       |
| a  |  3 | a  | b       | a       | aaa      |
| b  |  4 | a  | c       | a       | aaab     |
| c  |  5 | a  | c       | b       | aaabc    |
| c  |  6 | a  | d       | c       | aaabcc   |
| d  |  7 | b  | e       | c       | aaabccd  |
| e  |  8 | b  | {empty} | d       | aaabccde |
+----+----+----+---------+---------+----------+

Your analytics heart should be jumping, now.

4376565[1]

Wait a second. Can we do frames, too, as in SQL? Yes, we can. Just as in SQL, when we omit the frame clause on a window definition (but we do specify an ORDER BY clause), then the following is applied by default:

RANGE BETWEEN UNBOUNDED PRECEDING
  AND CURRENT ROW

We’ve done this in the previous examples. It can be seen in column v5, where we aggregate the string from the very first value up until the current value. So, let’s specify the frame then:

System.out.println(
    Seq.of("a", "a", "a", "b", "c", "c", "d", "e")
       .window(naturalOrder(), -1, 1) // frame here
       .map(w -> tuple(
            w.value(),   // v0
            w.count(),   // v1
            w.median(),  // v2
            w.lead(),    // v3
            w.lag(),     // v4
            w.toString() // v5
       ))
       .format()
);

And the result is, trivially:

+----+----+----+---------+---------+-----+
| v0 | v1 | v2 | v3      | v4      | v5  |
+----+----+----+---------+---------+-----+
| a  |  2 | a  | a       | {empty} | aa  |
| a  |  3 | a  | a       | a       | aaa |
| a  |  3 | a  | b       | a       | aab |
| b  |  3 | b  | c       | a       | abc |
| c  |  3 | c  | c       | b       | bcc |
| c  |  3 | c  | d       | c       | ccd |
| d  |  3 | d  | e       | c       | cde |
| e  |  2 | d  | {empty} | d       | de  |
+----+----+----+---------+---------+-----+

As expected, lead() and lag() are unaffected, as opposed to count(), median(), and toString()

Awesome! Now, let’s review the running total.

Often, you don’t calculate window functions on the scalar value of the stream itself, as that value is usually not a scalar value but a tuple (or a POJO in Java-speak). Instead, you extract values from the tuple (or POJO) and perform the aggregation on that. So, again, when calculating the BALANCE, we need to extract the AMOUNT first.

| ID   | VALUE_DATE |  AMOUNT |  BALANCE |
|------|------------|---------|----------|
| 9997 | 2014-03-18 | -(99.17)|+19985.81 |
| 9981 | 2014-03-16 | -(71.44)| 19886.64 |
| 9979 | 2014-03-16 |-(-94.60)| 19815.20 |
| 9977 | 2014-03-16 |   -6.96 |=19909.80 |
| 9971 | 2014-03-15 |  -65.95 | 19916.76 |

Here’s how you would write the running total with Java 8 and jOOλ 0.9.9

BigDecimal currentBalance = new BigDecimal("19985.81");

Seq.of(
    tuple(9997, "2014-03-18", new BigDecimal("99.17")),
    tuple(9981, "2014-03-16", new BigDecimal("71.44")),
    tuple(9979, "2014-03-16", new BigDecimal("-94.60")),
    tuple(9977, "2014-03-16", new BigDecimal("-6.96")),
    tuple(9971, "2014-03-15", new BigDecimal("-65.95")))
.window(Comparator
    .comparing((Tuple3<Integer, String, BigDecimal> t) 
        -> t.v1, reverseOrder())
    .thenComparing(t -> t.v2), Long.MIN_VALUE, -1)
.map(w -> w.value().concat(
     currentBalance.subtract(w.sum(t -> t.v3)
                              .orElse(BigDecimal.ZERO))
));

Yielding

+------+------------+--------+----------+
|   v0 | v1         |     v2 |       v3 |
+------+------------+--------+----------+
| 9997 | 2014-03-18 |  99.17 | 19985.81 |
| 9981 | 2014-03-16 |  71.44 | 19886.64 |
| 9979 | 2014-03-16 | -94.60 | 19815.20 |
| 9977 | 2014-03-16 |  -6.96 | 19909.80 |
| 9971 | 2014-03-15 | -65.95 | 19916.76 |
+------+------------+--------+----------+

A couple of things have changed here:

  • The comparator now takes two comparisons into account. Unforunately JEP-101 wasn’t entirely implemented, which is why we need to help the compiler with type inference here.
  • The Window.value() is now a tuple, not a single value. So we need to extract the interesting column from it, the AMOUNT (via t -> t.v3). On the other hand, we can simply concat() that additional value to the tuple

But that’s already it. Apart from the verbosity of the comparator (which we’ll certainly address in a future jOOλ version), writing a window function is a piece of cake.

What else can we do?

This article is not a complete description of all we can do with the new API. We’ll soon write a follow-up blog post with additional examples. For instance:

  • The partition by clause wasn’t described, but is available too
  • You can specify many more windows than the single window exposed here, each with individual PARTITION BY, ORDER BY and frame specifications

Also, the current implementation is rather canonical, i.e. it doesn’t (yet) cache aggregations:

  • For unordered / unframed windows (same value for all the partition)
  • Strictly ascendingly framed windows (aggregation can be based on previous value, for associative collectors like SUM(), or toString())

That’s it from our part. Download jOOλ, play around with it and enjoy the fact that the most awesome SQL feature is now available for all of you Java 8 developers!
https://github.com/jOOQ/jOOL

The Danger of Subtype Polymorphism Applied to Tuples

Java 8 has lambdas and streams, but no tuples, which is a shame. This is why we have implemented tuples in jOOλ – Java 8’s missing parts. Tuples are really boring value type containers. Essentially, they’re just an enumeration of types like these:

public class Tuple2<T1, T2> {
    public final T1 v1;
    public final T2 v2;

    public Tuple2(T1 v1, T2 v2) {
        this.v1 = v1;
        this.v2 = v2;
    }

    // [...]
}


public class Tuple3<T1, T2, T3> {
    public final T1 v1;
    public final T2 v2;
    public final T3 v3;

    public Tuple3(T1 v1, T2 v2, T3 v3) {
        this.v1 = v1;
        this.v2 = v2;
        this.v3 = v3;
    }

    // [...]
}

Writing tuple classes is a very boring task, and it’s best done using a source code generator.

Tuples in other languages and APIs

jOOλ‘s current version features tuples of degrees 0 – 16. C# and other .NET languages have tuple types between 1 – 8. There’s a special library just for tuples called Javatuples with tuples between degrees 1 and 10, and the authors went the extra mile and gave the tuples individual English names:

Unit<A> // (1 element)
Pair<A,B> // (2 elements)
Triplet<A,B,C> // (3 elements)
Quartet<A,B,C,D> // (4 elements)
Quintet<A,B,C,D,E> // (5 elements)
Sextet<A,B,C,D,E,F> // (6 elements)
Septet<A,B,C,D,E,F,G> // (7 elements)
Octet<A,B,C,D,E,F,G,H> // (8 elements)
Ennead<A,B,C,D,E,F,G,H,I> // (9 elements)
Decade<A,B,C,D,E,F,G,H,I,J> // (10 elements)

Why?

because Ennead really rings that sweet bell when I see it

Last, but not least, jOOQ also has a built-in tuple-like type, the org.jooq.Record, which serves as a base type for nice subtypes like Record7<T1, T2, T3, T4, T5, T6, T7>. jOOQ follows Scala and defines records up to a degree of 22.

Watch out when defining tuple type hierarchies

As we have seen in the previous example, Tuple3 has much code in common with Tuple2.

As we’re all massively brain-damaged by decades of object orientation and polymorphic design anti-patters, we might think that it would be a good idea to let Tuple3<T1, T2, T3> extend Tuple2<T1, T2>, as Tuple3 just adds one more attribute to the right of Tuple2, right? So…

public class Tuple3<T1, T2, T3> extends Tuple2<T1, T2> {
    public final T3 v3;

    public Tuple3(T1 v1, T2 v2, T3 v3) {
        super(v1, v2);
        this.v3 = v3;
    }

    // [...]
}

The truth is: That’s about the worst thing you could do, for several reasons. First off, yes. Both Tuple2 and Tuple3 are tuples, so they do have some common features. It’s not a bad idea to group those features in a common super type, such as:

public class Tuple2<T1, T2> implements Tuple {
    // [...]
}

But the degree is not one of those things. Here’s why:

Permutations

Think about all the possible tuples that you can form. If you let tuples extend each other, then a Tuple5 would also be assignment-compatible with a Tuple2, for instance. The following would compile perfectly:

Tuple2<String, Integer> t2 = tuple("A", 1, 2, 3, "B");

When letting Tuple3 extend Tuple2, it may have seemed like a good default choice to just drop the right-most attribute from the tuple in the extension chain.

But in the above example, why don’t I want to re-assign (v2, v4) such that the result is (1, 3), or maybe (v1, v3), such that the result is ("A", 2)?

There are a tremendous amount of permutations of possible attributes that could be of interest when “reducing” a higher degree tuple to a lower degree one. No way a default of dropping the right-most attribute will be sufficiently general for all use-cases

Type systems

Much worse than the above, there would be drastic implications for the type system, if Tuple3 extended Tuple2. Check out the jOOQ API, for instance. In jOOQ, you can safely assume the following:

// Compiles:
TABLE1.COL1.in(select(TABLE2.COL1).from(TABLE2))

// Must not compile:
TABLE1.COL1.in(select(TABLE2.COL1, TABLE2.COL2).from(TABLE2))

The first IN predicate is correct. The left hand side of the predicate has a single column (as opposed to being a row value expression). This means that the right hand side of the predicate must also operate on single-column expressions, e.g. a SELECT subquery that selects a single column (of the same type).

The second example selects too many columns, and the jOOQ API will tell the Java compiler that this is wrong.

This is guaranteed by jOOQ via the Field.in(Select) method, whose signature reads:

public interface Field<T> {
    ...
    Condition in(Select<? extends Record1<T>> select);
    ...
}

So, you can provide a SELECT statement that produces any subtype of the Record1<T> type.

Luckily, Record2 does not extend Record1

If now Record2 extended Record1, which might have seemed like a good idea at first, the second query would suddenly compile:

// This would now compile
TABLE1.COL1.in(select(TABLE2.COL1, TABLE2.COL2).from(TABLE2))

… even if it forms an invalid SQL statement. It would compile because it would generate a Select<Record2<Type1, Type2>> type, which would be a subtype of the expected Select<Record1<Type1>> from the Field.in(Select) method.

Conclusion

Tuple2 and Tuple5 types are fundamentally incompatible types. In strong type systems, you mustn’t be lured into thinking that similar types, or related types should also be compatible types.

Type hierarchies are something very object-oriented, and by object-oriented, I mean the flawed and over-engineered notion of object orientation that we’re still suffering from since the 90s. Even in “the Enterprise”, most people have learned to favour Composition over Inheritance. Composition in the case of tuples means that you can well transform a Tuple5 to a Tuple2. But you cannot assign it.

In jOOλ, such a transformation can be done very easily as follows:

// Produces (1, 3)
Tuple2<String, Integer> t2_4 = 
    tuple("A", 1, 2, 3, "B")
    .map((v1, v2, v3, v4, v5) -> tuple(v2, v4));

// Produces ("A", 2)
Tuple2<String, Integer> t1_3 = 
    tuple("A", 1, 2, 3, "B")
    .map((v1, v2, v3, v4, v5) -> tuple(v1, v3));

The idea is that you operate on immutable values, and that you can easily extract parts of those values and map / recombine them to new values.

Read more

If you’ve enjoyed reading this article, you might also like to learn why recursive generics are a terrible idea (in many situations).