A Common Mistake Developers Make When Caching Nullable Values

Caching is hard in various ways. Whenever you’re caching things, you have to at least think of:
  • Memory consumption
  • Invalidation
In this article, I want to show a flaw that often sneaks into custom cache implementations, making them inefficient for some execution paths. I’ve encountered this flaw in Eclipse, recently.

What did Eclipse do wrong?

I periodically profile Eclipse using Java Mission Control (JMC) when I discover a performance issue in the compiler (and I’ve discovered a few). Just recently, I’ve found a new regression that must have been introduced with the new Java 9 module support in Eclipse 4.7.1a:
Luckily, the issue has already been fixed for 4.7.2 (https://bugs.eclipse.org/bugs/show_bug.cgi?id=526209). What happened? In that profiling session, I’ve found an awful lot of accesses to java.util.zip.ZipFile whenever I used the “content assist” feature (auto completion). This was the top stack trace in the profiler:
int java.util.zip.ZipFile$Source.hashN(byte[], int, int)
void java.util.zip.ZipFile$Source.initCEN(int)
void java.util.zip.ZipFile$Source.(ZipFile$Source$Key, boolean)
ZipFile$Source java.util.zip.ZipFile$Source.get(File, boolean)
void java.util.zip.ZipFile.(File, int, Charset)
void java.util.zip.ZipFile.(File, int)
void java.util.zip.ZipFile.(File)
ZipFile org.eclipse.jdt.internal.core.JavaModelManager.getZipFile(IPath, boolean)
ZipFile org.eclipse.jdt.internal.core.JavaModelManager.getZipFile(IPath)
ZipFile org.eclipse.jdt.internal.core.JarPackageFragmentRoot.getJar()
byte[] org.eclipse.jdt.internal.core.AbstractClassFile.getClassFileContent(JarPackageFragmentRoot, String)
IBinaryModule org.eclipse.jdt.internal.core.ModularClassFile.getJarBinaryModuleInfo()
IBinaryModule org.eclipse.jdt.internal.core.ModularClassFile.getBinaryModuleInfo()
boolean org.eclipse.jdt.internal.core.ModularClassFile.buildStructure(...)
void org.eclipse.jdt.internal.core.Openable.generateInfos(Object, HashMap, IProgressMonitor)
Object org.eclipse.jdt.internal.core.JavaElement.openWhenClosed(Object, boolean, IProgressMonitor)
Object org.eclipse.jdt.internal.core.JavaElement.getElementInfo(IProgressMonitor)
Object org.eclipse.jdt.internal.core.JavaElement.getElementInfo()
boolean org.eclipse.jdt.internal.core.JavaElement.exists()
boolean org.eclipse.jdt.internal.core.Openable.exists()
IModuleDescription org.eclipse.jdt.internal.core.PackageFragmentRoot.getModuleDescription()
IModuleDescription org.eclipse.jdt.internal.core.NameLookup.getModuleDescription(IPackageFragmentRoot, Map, Function)
...
In fact, the profiling session doesn’t show the exact number of accesses, but the number of stack trace samples that contained the specific method(s) which corresponds to the time spent inside of a method, not the number of calls (which is less relevant). Clearly, accessing zip files shouldn’t be the thing that Eclipse should be doing most of the time, when auto completing my code. So, why did it do it anyway? It turns out, the problem was in the method getModuleDescription(), which can be summarised as follows:

static IModuleDescription getModuleDescription(
    IPackageFragmentRoot root, 
    Map<IPackageFragmentRoot,IModuleDescription> cache, 
    Function<IPackageFragmentRoot,IClasspathEntry> rootToEntry
) {
    IModuleDescription module = cache.get(root);
    if (module != null)
        return module;

    ...
    // Expensive call to open a Zip File in these calls:
    if (root.getKind() == IPackageFragmentRoot.K_SOURCE)
        module = root.getJavaProject().getModuleDescription();
    else
        module = root.getModuleDescription();

    if (module == null) {
        ...
    }

    if (module != null)
        cache.put(root, module);
    return module;
}

The ZipFile access is hidden inside the getModuleDescription() call. A debugger revealed that the JDK’s rt.jar file was opened quite a few times to look for a module-info.class file. Can you spot the mistake in the code? The method gets an external cache that may already contain the method’s result. But the method may also return null in case there is no module description. Which there isn’t. jOOQ has not yet been modularised, and most libraries on which jOOQ depends haven’t been modularised either, nor has the JDK been modularised using which jOOQ is currently built (JDK 8). So, this method always returns null for non-modular stuff. But if it returns null, it won’t put anything in the cache:

    if (module != null)
        cache.put(root, module);
    return module;
}

… which means the next time it is called, there’s a cache miss:

    IModuleDescription module = cache.get(root);
    if (module != null)
        return module;

… and the expensive logic involving the ZipFile call is invoked again. In other words, it is invoked all the time (for us).

Caching optional values

This is an important thing to always remember, and it is not easy to remember. Why? Because the developer who implemented this cache implemented it for the “happy path” (from the perspective of someone working with modules). They probably tried their code with a modular project, in case of which the cache worked perfectly. But they didn’t check if the code still works for everyone else. And in fact, it does work. The logic isn’t wrong. It’s just not optimal. The solution to these things is simple. If the value null encodes a cache miss, we need another “PSEUDO_NULL” to encode the actual null value, or in this case something like NO_MODULE. So, the method can be rewritten as:

static IModuleDescription getModuleDescription(
    IPackageFragmentRoot root, 
    Map<IPackageFragmentRoot,IModuleDescription> cache, 
    Function<IPackageFragmentRoot,IClasspathEntry> rootToEntry
) {
    IModuleDescription module = cache.get(root);

    // Decode encoded NO_MODULE value:
    if (module == NO_MODULE)
        return null;
    if (module != null)
        return module;

    module = ...

    if (module != null)
        cache.put(root, module);

    // Encode null value:
    else
        cache.put(root, NO_MODULE);

    return module;
}

… where this NO_MODULE can be a simple java.lang.Object if you don’t care about generics, or a dummy IModuleDescription in our case:

static final IModuleDescription NO_MODULE = 
  new IModuleDescription() { ... };

Since it will be a singleton instance, we can use identity comparisons in our method.

Conclusion

When caching method results, always check if null is a valid result for the method. If it is, and if your cache is a simple Map, then you have to encode the null value with some sort of NO_MODULE value for the cache to work properly. Otherwise, you won’t be able to distinguish Map.get(key) == null for the cases:
  • Cache miss and Map returns null
  • Cache hit and the value is null
Update after some useful reddit / DZone comments As /u/RayFowler pointed out on this article’s reddit discussion, the concept illustrated here is called “negative caching” Something that is often forgotten when performing negative caching is the fact that exceptions are also a result, as pointed out by /u/zombifai in the same reddit discussion. The fix in Eclipse correctly took this into account as can be seen here: https://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?id=addfd789e17dbb99af0304912ef45e4ae72c0605 While a Map.containsKey() based solution would work in a similar way and would have the advantage of not needing a “dummy” / sentinel value, it is not a good approach in situations where performance really matters – remember that in this case, we’re talking about an Eclipse compiler optimisation where we really don’t want two Map lookups where one would suffice. This is a generally interesting thought for caches, which are introduced after all to improve performance!

4 thoughts on “A Common Mistake Developers Make When Caching Nullable Values

  1. Pretty good tip! I’ll pay more atention next time I implement a kind of cache in my code! Thanks!

  2. When your cache is a simple Map you can just use containsKey(), which according to its Javadoc “Returns true if this map contains a mapping for the specified key” independently of whether that value is null or not.

    1. Sure, that’s logically correct. But since you’re using a cache already, which is supposed to be a speed-up, it’s probably much faster to use a sentinel value, rather than two consecutive lookups. I’ve been hit by caches that turned into bottlenecks in the past, so in this particular case (Eclipse compiler), I think the mentioned approach in this article is really superior, see e.g. https://bugs.eclipse.org/bugs/show_bug.cgi?id=528761

      The ideal solution of course would be a map that returns a union type Value|Null|NotFound

Leave a Reply to rponteCancel reply