Java 8 Friday Goodies: Easy-as-Pie Local Caching

At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub. tweet this

Java 8 Goodie: Easy-as-Pie Local Caching

Now get ready for one of the most awesome revelations in this series so far. We’ll show you an easy-as-pie way to implement a local cache using the good old HashMap and lambda expressions. Because, Map now has a new way of a atomically calculating a new value in case a key is absent. Perfect for caches. Let’s delve into code:

public static void main(String[] args) {
    for (int i = 0; i < 10; i++)
        System.out.println(
            "f(" + i + ") = " + fibonacci(i));
}

static int fibonacci(int i) {
    if (i == 0)
        return i;

    if (i == 1)
        return 1;

    System.out.println("Calculating f(" + i + ")");
    return fibonacci(i - 2) + fibonacci(i - 1);
}

Yes. That’s the naive way of doing things. Even for small numbers like fibonacci(5), the above algorithm will print out a huge amount of lines, as we’re repeating the same calculations exponentially:

Calculating f(6)
Calculating f(4)
Calculating f(2)
Calculating f(3)
Calculating f(2)
Calculating f(5)
Calculating f(3)
Calculating f(2)
Calculating f(4)
Calculating f(2)
Calculating f(3)
Calculating f(2)
f(6) = 8

What we want to do is build a cache of previously calculated fibonacci numbers. The most straightforward technique is to memoize all values in a cache. Here’s how we build a cache:

static Map<Integer, Integer> cache = new HashMap<>();

Important side note: A previous version of the blog post used a ConcurrentHashMap here. DO NOT USE ConcurrentHashMaps when you recursively calculate values using computeIfAbsent(). Details here:

https://stackoverflow.com/q/28840047/521799

Done! As mentioned before, we’re using the newly added Map.computeIfAbsent() method to calculate a new value from a source function only if we don’t already have a value for a given key. Caching! And since this method is guaranteed to execute atomically, and since we’re using a ConcurrentHashMap, this cache is even thread-safe without resorting to manually applying synchronized anywhere. And it can be reused for stuff other than calculating fibonacci numbers. But let’s first apply this cache to our fibonacci() function.

static int fibonacci(int i) {
    if (i == 0)
        return i;

    if (i == 1)
        return 1;

    return cache.computeIfAbsent(i, (key) ->
                 fibonacci(i - 2)
               + fibonacci(i - 1));
}

That’s it. It can’t get any simpler than this! Want proof? We’ll log a message on the console every time we actually evaluate a new value:

static int fibonacci(int i) {
    if (i == 0)
        return i;

    if (i == 1)
        return 1;

    return cache.computeIfAbsent(i, (key) -> {
        System.out.println(
            "Slow calculation of " + key);

        return fibonacci(i - 2) + fibonacci(i - 1);
    });
}

The above program will print

f(0) = 0
f(1) = 1
Slow calculation of 2
f(2) = 1
Slow calculation of 3
f(3) = 2
Slow calculation of 4
f(4) = 3
Slow calculation of 5
f(5) = 5
Slow calculation of 6
f(6) = 8
Slow calculation of 7
f(7) = 13
Slow calculation of 8
f(8) = 21
Slow calculation of 9
f(9) = 34

How would we have done it in Java 7?

Good question. With lots of code. We’d probably write something like this using double-checked locking:

static int fibonacciJava7(int i) {
    if (i == 0)
        return i;

    if (i == 1)
        return 1;

    Integer result = cache.get(i);
    if (result == null) {
        synchronized (cache) {
            result = cache.get(i);

            if (result == null) {
                System.out.println(
                    "Slow calculation of " + i);

                result = fibonacci(i - 2) 
                       + fibonacci(i - 1);
                cache.put(i, result);
            }
        }
    }

    return result;
}

Convinced?

Note, that your actual solution would probably make use of Guava Caches.

Conclusion

Lambdas are only one part of Java 8. An important part, but let’s not forget all the new features that were added to the libraries and that can be leveraged with lambdas now!

 

Next week in this blog series, we’re going to look at how Java 8 will improve on the existing and new concurrency APIs, so stay tuned!

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

13 thoughts on “Java 8 Friday Goodies: Easy-as-Pie Local Caching

  1. If a constructor of ConcurrentHashMap could take a lambda to be called when a value is not present, then you could do away with fibonacci() entirely and just reference the map.

  2. An alternate implementation:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    
    public interface LocalCaching {
      public static final Map cache =
        new ConcurrentHashMap()
      ;
    
      public static int fibonacciIfAbsent(int i) {
        Logger.println("Slow calculation of f(" + i + ")");
        return fibonacci(i - 2) + fibonacci(i - 1);
      }
    
      public static int fibonacci(int i) {
        if (i == 0) {
          return i;
        }
     
        if (i == 1) {
          return 1;
        }
     
        return cache.computeIfAbsent(
          i,
          LocalCaching::fibonacciIfAbsent
        );
      }
    
      public static void main(String... arguments) {
        for (int i = 0; i < 10; i++) {
          Logger.println(
            "f(" + i + ") = " + fibonacci(i)
          );
        }
      }
    }
    
  3. Quoting the documentation (http://download.java.net/jdk8/docs/api/java/util/Map.html#computeIfAbsent-K-java.util.function.Function-):

    “The default implementation makes no guarantees about synchronization or atomicity properties of this method. Any implementation providing atomicity guarantees must override this method and document its concurrency properties. In particular, all implementations of subinterface ConcurrentMap must document whether the function is applied once atomically only if the value is not present.”

    And if we go to ConcurrentMap (http://download.java.net/jdk8/docs/api/java/util/concurrent/ConcurrentMap.html#computeIfAbsent-K-java.util.function.Function-) we get:

    “The default implementation may retry these steps when multiple threads attempt updates including potentially calling the mapping function multiple times.”

    And if we look into the implementation (https://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/3c891a39428a/src/share/classes/java/util/concurrent/ConcurrentHashMap.java#l1643) we see that the mapping function is called multiple times releasing/reaquiring the lock.

    So no, you can’t suppose that your code will be called under a lock and this interface (nor the implementation in CHM) doesn’t protect against the “dogpile effect” (http://hype-free.blogspot.ro/2008/05/avoiding-dogpile-effect.html).

    Of course this doesn’t matter for the example since the function is trivial and doesn’t need synchronization anyway :-)

      1. Well, shame on me then for not taking the investigation to the logical end and looking up the CHM documentation not just CM :-)

        The documentation for CHM goes guarantee the “at most once” semantics you talk about in your article.

        I’m still not sure if the source for CHM implements that correctly though (on a glance it *can* call the mapping function multiple times but it’s dense enough to need more than a glance to confirm this). Just to be on the safe side I still would write code to conform to the guarantees in Map not the ones from CHM (it’s too easy to use the mapping function on a different map) or even better, make it stateless and forget about all these concerns.

        1. No shame here :-) This whole API was a moving target up until recently. For instance, null behaviour changed incompatibly and significantly in the last three months. I wouldn’t be surprised if these new Collections API parts are not entirely production-ready, yet, or if some implementation is not entirely correct.

          Just to be on the safe side I still would write code to conform to the guarantees in Map not the ones from CHM

          This is one of the hardest things in Java. While you may think that it is safe to conform to higher-level API like Map, you can get things quite wrong if you use an actual implementation. An example for this is the equals() / hashCode() contract imposed by Map, which is “completely violated” by SortedMap, which bases equality on compareTo() rather than equals() / hashCode()

  4. Hi, I know I am nearly a year late on this, but I just recently started to look in Java 8. So I stumbled across this example and tried it out. I started by calling fibonacci(7), which worked fine. But when I call fibonacci(13) or higher the program gets stuck and does not return. I removed the whole caching part and it returns again. When I change the cache to a normal Hashmap, the method returns. Has somebody else noticed this? Is there a bug in the ConcurrentHashMap? I used the exact same code that was posted in this article.

    1. I don’t experience any issues… calling this program:

      public class Test {
          static Map<Integer, Integer> cache = new ConcurrentHashMap<>();
      
          public static void main(String[] args) {
                  System.out.println(
                      "f(" + 13 + ") = " + fibonacci(13));
          }
      
          static int fibonacci(int i) {
              if (i == 0)
                  return i;
      
              if (i == 1)
                  return 1;
      
              return cache.computeIfAbsent(i, (key) -> {
                  System.out.println(
                      "Slow calculation of " + key);
      
                  return fibonacci(i - 2) + fibonacci(i - 1);
              });
          }
      }
      

      yields…

      Slow calculation of 13
      Slow calculation of 11
      Slow calculation of 9
      Slow calculation of 7
      Slow calculation of 5
      Slow calculation of 3
      Slow calculation of 2
      Slow calculation of 4
      Slow calculation of 6
      Slow calculation of 8
      Slow calculation of 10
      Slow calculation of 12
      f(13) = 233
      
      1. That may sound really stupid, but have you tried it with a number above 13? I again c&p your code and it ran just fine. Then I put in a 25 and got this:

            Slow calculation of 25
            Slow calculation of 23
            Slow calculation of 21
            Slow calculation of 19
            Slow calculation of 17
            Slow calculation of 15
            Slow calculation of 13
            Slow calculation of 11
        

        Changing

             static Map cache = new ConcurrentHashMap();
        

        to

             static Map cache = new HashMap();
        

        executes fine again:

            Slow calculation of 25
            Slow calculation of 23
            Slow calculation of 21
            Slow calculation of 19
            Slow calculation of 17
            Slow calculation of 15
            Slow calculation of 13
            Slow calculation of 11
            Slow calculation of 9
            Slow calculation of 7
            Slow calculation of 5
            Slow calculation of 3
            Slow calculation of 2
            Slow calculation of 4
            Slow calculation of 6
            Slow calculation of 8
            Slow calculation of 10
            Slow calculation of 12
            Slow calculation of 14
            Slow calculation of 16
            Slow calculation of 18
            Slow calculation of 17
            Slow calculation of 20
            Slow calculation of 19
            Slow calculation of 22
            Slow calculation of 21
            Slow calculation of 24
            Slow calculation of 23
            f(25) = 75025
        

        I should try that on another computer. :)

        1. I’ll be damned! I can reproduce this. Either, this is a bug or I violated the computeIfAbsent() contract by recursing. I’m afraid it’s a very subtle contract difference between Map / HashMap and ConcurrentHashMap.

          I’ve documented this on this Stack Overflow question as a warning to everyone:

          https://stackoverflow.com/q/28840047/521799

          Thanks a lot for pointing this out! Important discovery. I’ll fix the blog post

          1. Oh nice. Glad to help, kind of. ;) And of course, thanks for looking into it. I’ll follow the SO post for further updates.

Leave a Reply