Using jOOλ to Combine Several Java 8 Collectors into One

With Java 8 being mainstream now, people start using Streams for everything, even in cases where that’s a bit exaggerated (a.k.a. completely nuts, if you were expecting a hyperbole here). For instance, take mykong’s article here, showing how to collect a Map’s entry set stream into a list of keys and a list of values:

http://www.mkyong.com/java8/java-8-convert-map-to-list

The code posted on mykong.com does it in two steps:

package com.mkyong.example;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ConvertMapToList {
    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        map.put(10, "apple");
        map.put(20, "orange");
        map.put(30, "banana");
        map.put(40, "watermelon");
        map.put(50, "dragonfruit");

        System.out.println("\n1. Export Map Key to List...");

        List<Integer> result = map.entrySet().stream()
                .map(x -> x.getKey())
                .collect(Collectors.toList());

        result.forEach(System.out::println);

        System.out.println("\n2. Export Map Value to List...");

        List<String> result2 = map.entrySet().stream()
                .map(x -> x.getValue())
                .collect(Collectors.toList());

        result2.forEach(System.out::println);
    }
}

This is probably not what you should do in your own code. First off, if you’re OK with iterating the map twice, the simplest way to collect a map’s keys and values would be this:

List<Integer> result1 = new ArrayList<>(map.keySet());
List<String> result2 = new ArrayList<>(map.values());

There’s absolutely no need to resort to Java 8 streams for this particular example. The above is about as simple and speedy as it gets.

Don’t shoehorn Java 8 Streams into every problem

But if you really want to use streams, then I would personally prefer a solution where you do it in one go. There’s no need to iterate the Map twice in this particular case. For instance, you could do it by using jOOλ’s Tuple.collectors() method, a method that combines two collectors into a new collector that returns a tuple of the individual collections.

Code speaks for itself more clearly than the above description. Mykong.com’s code could be replaced by this:

Tuple2<List<Integer>, List<String>> result = 
map.entrySet()
    .stream()
    .collect(Tuple.collectors(
        Collectors.mapping(Entry::getKey, Collectors.toList()),
        Collectors.mapping(Entry::getValue, Collectors.toList())
    ));

The only jOOλ code put in place here is the call to Tuple.collectors(), which combines the standard JDK collectors that apply mapping on the Map entries before collecting keys and values into lists.

When printing the above result, you’ll get:

([50, 20, 40, 10, 30], [dragonfruit, orange, watermelon, apple, banana])

i.e. a tuple containing the two resulting lists.

Even simpler, don’t use the Java 8 Stream API, use jOOλ’s Seq (for sequential stream) and write this shorty instead:

Tuple2<List<Integer>, List<String>> result = 
Seq.seq(map)
   .collect(
        Collectors.mapping(Tuple2::v1, Collectors.toList()),
        Collectors.mapping(Tuple2::v2, Collectors.toList())
   );

Where Collectable.collect(Collector, Collector) provides awesome syntax sugar over the previous example

Convinced? Get jOOλ here: https://github.com/jOOQ/jOOL

Let’s Stream a Map in Java 8 with jOOλ

I wanted to find an easy way to stream a Map in Java 8. Guess what? There isn’t!

What I would’ve expected for convenience is the following method:

public interface Map<K, V> {

    default Stream<Entry<K, V>> stream() {
        return entrySet().stream();
    }    
}

But there’s no such method. There are probably a variety of reasons why such a method shouldn’t exist, e.g.:

  • There’s no “clear” preference for entrySet() being chosen over keySet() or values(), as a stream source
  • Map isn’t really a collection. It’s not even an Iterable
  • That wasn’t the design goal
  • The EG didn’t have enough time

Well, there is a very compelling reason for Map to have been retrofitted to provide both an entrySet().stream() and to finally implement Iterable<Entry<K, V>>. And that reason is the fact that we now have Map.forEach():

default void forEach(
        BiConsumer<? super K, ? super V> action) {
    Objects.requireNonNull(action);
    for (Map.Entry<K, V> entry : entrySet()) {
        K k;
        V v;
        try {
            k = entry.getKey();
            v = entry.getValue();
        } catch(IllegalStateException ise) {
            // this usually means the entry is no longer in the map.
            throw new ConcurrentModificationException(ise);
        }
        action.accept(k, v);
    }
}

forEach() in this case accepts a BiConsumer that really consumes entries in the map. If you search through JDK source code, there are really very few references to the BiConsumer type outside of Map.forEach() and perhaps a couple of CompletableFuture methods and a couple of streams collection methods.

So, one could almost assume that BiConsumer was strongly driven by the needs of this forEach() method, which would be a strong case for making Map.Entry a more important type throughout the collections API (we would have preferred the type Tuple2, of course).

Let’s continue this line of thought. There is also Iterable.forEach():

public interface Iterable<T> {
    default void forEach(Consumer<? super T> action) {
        Objects.requireNonNull(action);
        for (T t : this) {
            action.accept(t);
        }
    }
}

Both Map.forEach() and Iterable.forEach() intuitively iterate the “entries” of their respective collection model, although there is a subtle difference:

  • Iterable.forEach() expects a Consumer taking a single value
  • Map.forEach() expects a BiConsumer taking two values: the key and the value (NOT a Map.Entry!)

Think about it this way:

This makes the two methods incompatible in a “duck typing sense”, which makes the two types even more different

Bummer!

Improving Map with jOOλ

We find that quirky and counter-intuitive. forEach() is really not the only use-case of Map traversal and transformation. We’d love to have a Stream<Entry<K, V>>, or even better, a Stream<Tuple2<T1, T2>>. So we implemented that in jOOλ, a library which we’ve developed for our integration tests at jOOQ. With jOOλ, you can now wrap a Map in a Seq type (“Seq” for sequential stream, a stream with many more functional features):

Map<Integer, String> map = new LinkedHashMap<>();
map.put(1, "a");
map.put(2, "b");
map.put(3, "c");

assertEquals(
  Arrays.asList(
    tuple(1, "a"), 
    tuple(2, "b"), 
    tuple(3, "c")
  ),

  Seq.seq(map).toList()
);

What you can do with it? How about creating a new Map, swapping keys and values in one go:

System.out.println(
  Seq.seq(map)
     .map(Tuple2::swap)
     .toMap(Tuple2::v1, Tuple2::v2)
);

System.out.println(
  Seq.seq(map)
     .toMap(Tuple2::v2, Tuple2::v1)
);

Both of the above will yield:

{a=1, b=2, c=3}

Just for the record, here’s how to swap keys and values with standard JDK API:

System.out.println(
  map.entrySet()
     .stream()
     .collect(Collectors.toMap(
         Map.Entry::getValue, 
         Map.Entry::getKey
     ))
);

It can be done, but the every day verbosity of standard Java API makes things a bit hard to read / write

Java 8 Friday Goodies: Easy-as-Pie Local Caching

At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub. tweet this

Java 8 Goodie: Easy-as-Pie Local Caching

Now get ready for one of the most awesome revelations in this series so far. We’ll show you an easy-as-pie way to implement a local cache using the good old HashMap and lambda expressions. Because, Map now has a new way of a atomically calculating a new value in case a key is absent. Perfect for caches. Let’s delve into code:

public static void main(String[] args) {
    for (int i = 0; i < 10; i++)
        System.out.println(
            "f(" + i + ") = " + fibonacci(i));
}

static int fibonacci(int i) {
    if (i == 0)
        return i;

    if (i == 1)
        return 1;

    System.out.println("Calculating f(" + i + ")");
    return fibonacci(i - 2) + fibonacci(i - 1);
}

Yes. That’s the naive way of doing things. Even for small numbers like fibonacci(5), the above algorithm will print out a huge amount of lines, as we’re repeating the same calculations exponentially:

Calculating f(6)
Calculating f(4)
Calculating f(2)
Calculating f(3)
Calculating f(2)
Calculating f(5)
Calculating f(3)
Calculating f(2)
Calculating f(4)
Calculating f(2)
Calculating f(3)
Calculating f(2)
f(6) = 8

What we want to do is build a cache of previously calculated fibonacci numbers. The most straightforward technique is to memoize all values in a cache. Here’s how we build a cache:

static Map<Integer, Integer> cache = new HashMap<>();

Important side note: A previous version of the blog post used a ConcurrentHashMap here. DO NOT USE ConcurrentHashMaps when you recursively calculate values using computeIfAbsent(). Details here:

http://stackoverflow.com/q/28840047/521799

Done! As mentioned before, we’re using the newly added Map.computeIfAbsent() method to calculate a new value from a source function only if we don’t already have a value for a given key. Caching! And since this method is guaranteed to execute atomically, and since we’re using a ConcurrentHashMap, this cache is even thread-safe without resorting to manually applying synchronized anywhere. And it can be reused for stuff other than calculating fibonacci numbers. But let’s first apply this cache to our fibonacci() function.

static int fibonacci(int i) {
    if (i == 0)
        return i;

    if (i == 1)
        return 1;

    return cache.computeIfAbsent(i, (key) ->
                 fibonacci(i - 2)
               + fibonacci(i - 1));
}

That’s it. It can’t get any simpler than this! Want proof? We’ll log a message on the console every time we actually evaluate a new value:

static int fibonacci(int i) {
    if (i == 0)
        return i;

    if (i == 1)
        return 1;

    return cache.computeIfAbsent(i, (key) -> {
        System.out.println(
            "Slow calculation of " + key);

        return fibonacci(i - 2) + fibonacci(i - 1);
    });
}

The above program will print

f(0) = 0
f(1) = 1
Slow calculation of 2
f(2) = 1
Slow calculation of 3
f(3) = 2
Slow calculation of 4
f(4) = 3
Slow calculation of 5
f(5) = 5
Slow calculation of 6
f(6) = 8
Slow calculation of 7
f(7) = 13
Slow calculation of 8
f(8) = 21
Slow calculation of 9
f(9) = 34

How would we have done it in Java 7?

Good question. With lots of code. We’d probably write something like this using double-checked locking:

static int fibonacciJava7(int i) {
    if (i == 0)
        return i;

    if (i == 1)
        return 1;

    Integer result = cache.get(i);
    if (result == null) {
        synchronized (cache) {
            result = cache.get(i);

            if (result == null) {
                System.out.println(
                    "Slow calculation of " + i);

                result = fibonacci(i - 2) 
                       + fibonacci(i - 1);
                cache.put(i, result);
            }
        }
    }

    return result;
}

Convinced?

Note, that your actual solution would probably make use of Guava Caches.

Conclusion

Lambdas are only one part of Java 8. An important part, but let’s not forget all the new features that were added to the libraries and that can be leveraged with lambdas now!

This is really exciting and…

We can greatly improve our code bases without resorting to new libraries. All of the above can be run with the JDK’s libraries only. tweet this

Next week in this blog series, we’re going to look at how Java 8 will improve on the existing and new concurrency APIs, so stay tuned!

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

Java 8 Friday Goodies: Map Enhancements

At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub. tweet this

Java 8 Goodie: Map Enhancements

In previous posts, we’ve already dealt with a couple of new Streams features, for instance when sorting. Most API improvements are indeed part of the new Streams API. But a few nice methods were also added to java.util.List and most importantly, to java.util.Map. If you want a quick overview of feature additions, go to the JDK8 Javadoc and click on the new “Default Methods” tab:

java.util.Map default methods

java.util.Map default methods

For backwards-compatibility reasons, all new methods added to Java interfaces are in fact default methods. So we have a couple of exciting new additions!

compute() methods

Often, we fetch a value from a map, make some calculations on it and put it back into the map. This can be verbose and hard to get right if concurrency is involved. With Java 8, we can pass a BiFunction to the new compute(), computeIfAbsent(), or computeIfPresent() methods and have the Map implementation handle the semantics of replacing a value.

The following example shows how this works:

// We'll be using this simple map
// Unfortunately, still no map literals in Java 8..
Map<String, Integer> map = new HashMap<>();
map.put("A", 1);
map.put("B", 2);
map.put("C", 3);

// Compute a new value for the existing key
System.out.println(map.compute("A", 
    (k, v) -> v == null ? 42 : v + 41));
System.out.println(map);

// This will add a new (key, value) pair
System.out.println(map.compute("X", 
    (k, v) -> v == null ? 42 : v + 41));
System.out.println(map);

The output of the above program is this:

42
{A=42, B=2, C=3}
42
{A=42, B=2, C=3, X=42}

This is really useful for ConcurrentHashMap, which ships with the following guarantee:

The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this Map.

forEach() method

This is a really nice goodie which lets you pass a method reference or a lambda to receive (key, value) pairs one by one. A trivial example would be this:

map.forEach((k, v) -> 
    System.out.println(k + "=" + v));

Its output being:

A=1
B=2
C=3

merge() method

Now this one is really not so easy to understand. The Javadoc uses this example here:

map.merge(key, msg, String::concat)

Given the following contract:

If the specified key is not already associated with a value or is associated with null, associates it with the given value. Otherwise, replaces the value with the results of the given remapping function, or removes if the result is null.

So, the above code translates to the following atomic operation:

String value = map.get(key);
if (value == null)
    map.put(key, msg);
else
    map.put(key, value.concat(msg));

This is certainly not an everyday functionality and might just have leaked from an implementation to the top-level API. Additionally, if the map already contains null (so, null values are OK), and your remappingFunction returns null, then the entry is removed. That’s quite unexpected. Consider the following program:

map.put("X", null);
System.out.println(map.merge(
    "X", null, (v1, v2) -> null));
System.out.println(map);

Its output is:

null
{A=1, B=2, C=3}

Update: I first wrote the above code first with JDK 8 build 116. With build 129, things have changed completely again. First off, the value passed to merge() is not allowed to be null. Secondly. nullvalues are treated by merge() just like absent values. To produce the same output, we’ll write:

map.put("X", 1);
System.out.println(map.merge(
    "X", 1, (v1, v2) -> null));
System.out.println(map);

This merge() operation has thus removed a value from the map. That’s probably OK because the semantics of “merge” is often a combination of INSERT, UPDATE, and DELETE if we’re using SQL-speak. And a somewhat reasonable way to indicate that a value should be removed is to return null from such a function.

But the map is allowed to contain null values, which can never be inserted into the map using merge(). tweet this

getOrDefault()

This is a no-brainer. Right? Right! Wrong!

Unfortunately, there are two types of Maps. Those supporting null keys and/or values and those who don’t support nulls. While the previous merge() method didn’t distinguish between a map not containing a key and a map containing a key with a null value, this new getOrDefault() only returns the default when the key is not contained. It won’t protect you from a NullPointerException:

map.put("X", null);
try {
  System.out.println(map.getOrDefault("X", 21) + 21);
}
catch (NullPointerException nope) {
  nope.printStackTrace();
}

That’s quite a bummer. In general, it can be said the Map API has become even more complex with respect to nulls. tweet this

Trivial additions

There are a few more methods, like putIfAbsent() (pulled up from ConcurrentHashMap, remove() (with key and value arguments), replace().

Conclusion

All in all, it can be said that a lot of atomic operations have made it to the top-level Map API, which is good. But then again, the pre-existing confusion related to the semantics of null in maps has deepened. The terminologies “present” vs. “absent”, “contains”, “default” don’t necessarily help clarifying these things, which is surprisingly against the rules of keeping an API consistent and most importantly, regular. Thus as a consumer of this API, ideally, you should keep null out of maps, both as keys and as values!

Next week in this blog series, we’re going to look at how Java 8 will allow you to define local transactional scope very easily, so stay tuned!

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page