Java 8 Friday Goodies: Lambdas and XML

At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

Java 8 Goodie: Lambdas and XML

There isn’t too much that Java 8 can do to the existing SAX and DOM APIs. The SAX ContentHandler has too many abstract methods to qualify as a @FunctionalInterface, and DOM is a huge, verbose API specified by w3c, with little chance of adding new extension methods.
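To make the SAX point concrete: a lambda can only target an interface with a single abstract method, so SAX needs a small adapter before a lambda can be passed to it. The following is a minimal sketch using only the JDK. `StartElementHandler`, `onStartElement` and `adapt` are hypothetical names made up for this example; they are not part of SAX or jOOX.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxLambdaSketch {

    // Hypothetical single-method interface, so that a lambda can be used
    @FunctionalInterface
    interface StartElementHandler {
        void onStartElement(String qName, Attributes attributes);
    }

    // Adapts the lambda to SAX by overriding just one DefaultHandler callback
    static DefaultHandler adapt(StartElementHandler handler) {
        return new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                handler.onStartElement(qName, attributes);
            }
        };
    }

    // Collects the names of all elements in document order
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            adapt((qName, attributes) -> names.add(qName)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
            elementNames("<books><book id=\"1\"/><book id=\"2\"/></books>"));
    }
}
```

The adapter only covers one callback. Anything that needs several SAX events at once still has to fall back to a regular handler class.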

Luckily, there is a small Open Source library called jOOX that allows for processing the w3c standard DOM API through a wrapper API that mimics the popular jQuery library. jQuery leverages JavaScript’s language features by allowing users to pass functions to the API for DOM traversal. The same is the case with jOOX. Let’s have a closer look:

Assume that we’re using the following pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.jooq</groupId>
  <artifactId>java8-goodies</artifactId>
  <version>1.0-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>org.jooq</groupId>
      <artifactId>joox</artifactId>
      <version>1.2.0</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.2</version>
        <configuration>
          <fork>true</fork>
          <maxmem>512m</maxmem>
          <meminitial>256m</meminitial>
          <encoding>UTF-8</encoding>
          <source>1.8</source>
          <target>1.8</target>
          <debug>true</debug>
          <debuglevel>lines,vars,source</debuglevel>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

Let’s assume we wanted to know all the involved artifacts in Maven’s groupId:artifactId:version notation. Here’s how we can do that with jOOX and lambda expressions:

$(new File("./pom.xml")).find("groupId")
                        .each(ctx -> {
    System.out.println(
        $(ctx).text() + ":" +
        $(ctx).siblings("artifactId").text() + ":" +
        $(ctx).siblings("version").text()
    );
});

Executing the above yields:

org.jooq:java8-goodies:1.0-SNAPSHOT
org.jooq:joox:1.2.0
org.apache.maven.plugins:maven-compiler-plugin:2.3.2

Let’s assume we only wanted to display those artifacts that don’t have SNAPSHOT in their version numbers. Simply add a filter:

$(new File("./pom.xml"))
    .find("groupId")
    .filter(ctx -> $(ctx).siblings("version")
                         .matchText(".*-SNAPSHOT")
                         .isEmpty())
    .each(ctx -> {
        System.out.println(
        $(ctx).text() + ":" +
        $(ctx).siblings("artifactId").text() + ":" +
        $(ctx).siblings("version").text());
    });

This will now yield

org.jooq:joox:1.2.0
org.apache.maven.plugins:maven-compiler-plugin:2.3.2

We can also transform the XML content. For instance, if the target document doesn’t need to be a POM, we could replace the matched groupId elements by an artificial artifact element that contains the artifact name in Maven notation. Here’s how to do this:

$(new File("./pom.xml"))
    .find("groupId")
    .filter(ctx -> $(ctx).siblings("version")
                         .matchText(".*-SNAPSHOT")
                         .isEmpty())
    .content(ctx ->
        $(ctx).text() + ":" +
        $(ctx).siblings("artifactId").text() + ":" +
        $(ctx).siblings("version").text()
    )
    .rename("artifact")
    .each(ctx -> System.out.println(ctx));

The above puts new content in place of the previous one through .content(), and then renames the groupId elements to artifact, before printing out the element. The result is:

<artifact>org.jooq:joox:1.2.0</artifact>
<artifact>org.apache.maven.plugins:maven-compiler-plugin:2.3.2</artifact>

More goodies next week

What becomes immediately obvious is the fact that the lambda expert group’s choice of making all SAMs (Single Abstract Method interfaces) eligible for use with lambda expressions adds great value to pre-existing APIs. Quite a clever move.

But there are also new APIs. Last week, we discussed how the existing JDK 1.2 File API can be improved through the use of lambdas. Some of our readers have expressed their concerns that the java.io API has been largely replaced by java.nio (nio as in New I/O). Next week, we’ll have a look at Java 8’s java.nnio API (for new-new I/O ;-) ) and how it relates to the Java 8 Streams API.

More on Java 8

In the meantime, have a look at Eugen Paraschiv’s awesome Java 8 resources page

A jOOX First-Time Experience Article

Here’s some nice first-time user experience about jOOX, my lesser-known product:
http://www.kubrynski.com/2013/03/as-developer-i-want-to-use-xml.html

As a reminder, here’s what jOOX is all about:

jOOX stands for Java Object Oriented XML. It is a simple wrapper for the org.w3c.dom package, to allow for fluent XML document creation and manipulation where DOM is required but too verbose. jOOX only wraps the underlying document and can be used to enhance DOM, not as an alternative.

Unlike other, similar tools that mimic jQuery (e.g. jsoup, jerry, gwtquery), jOOX really aims to leverage standard w3c DOM usage, which isn’t such a bad thing after all, with its performant, standard Xerces implementation.

Some simple example code:

// Find the order at index 4 and 
// add an element "paid"
$(document).find("orders")
           .children().eq(4)
           .append("<paid>true</paid>");

// Find those orders that are paid 
// and flag them as "settled"
$(document).find("orders")
           .children().find("paid")
           .after("<settled>true</settled>");

jOOX and XSLT. An XML love story, continued

The somewhat functional way of thinking involved with jOOX’s XML manipulation cries for an additional API enhancement simply supporting XSLT. XSL transformation has become quite a standard way of transforming large amounts of XML into other structures, where normal DOM manipulation (or jOOX manipulation) becomes too tedious. Let’s have a look at how things are done in standard Java.

Example input:

<books>
  <book id="1"/>
  <book id="2"/>
</books>

Example XSL:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Match all books and increment their IDs -->
    <xsl:template match="book">
        <book id="{@id + 1}">
            <xsl:apply-templates/>
        </book>
    </xsl:template>

    <!-- Identity-transform all the other elements and attributes -->
    <xsl:template match="@*|*">
        <xsl:copy>
            <xsl:apply-templates select="*|@*"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Verboseness of XSL transformation in Java

The standard way of doing XSL transformation in Java is pretty verbose, like just about anything XML-related in standard Java. See an example of how to apply the above transformation:

Source source = new StreamSource(new File("increment.xsl"));
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(source);
DOMResult result = new DOMResult();
transformer.transform(new DOMSource(document), result);

Node output = result.getNode();
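The snippet above assumes an increment.xsl file and an existing document variable. For readers who want to run it as is, here is a self-contained sketch that inlines the stylesheet and the input as strings and serializes the result to a String instead of a DOMResult. The class name `IncrementBookIds` and the `transform` helper are made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class IncrementBookIds {

    // The example stylesheet from above, inlined as a string
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"book\">"
      + "<book id=\"{@id + 1}\"><xsl:apply-templates/></book>"
      + "</xsl:template>"
      + "<xsl:template match=\"@*|*\">"
      + "<xsl:copy><xsl:apply-templates select=\"*|@*\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the serialized result
    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
            transform("<books><book id=\"1\"/><book id=\"2\"/></books>"));
    }
}
```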

Drastically decrease verbosity with jOOX

With jOOX, you can write exactly the same in much less code:

Apply transformation:
// Applies transformation to the document element:
$(document).transform("increment.xsl");

// Applies transformation to every book element:
$(document).find("book").transform("increment.xsl");

The result in both cases is:

<books>
  <book id="2"/>
  <book id="3"/>
</books>

CSS selectors in Java

CSS selectors are a nice and intuitive alternative to XPath for DOM navigation. While XPath is more complete and has more functionality, CSS selectors were tailored for HTML DOM, where the document content is usually less structured than in XML.

Here are some examples of CSS selector and equivalent XPath expressions:

CSS:   document > library > books > book
XPath: //document/library/books/book

CSS:   document book
XPath: //document//book

CSS:   document book#id3
XPath: //document//book[@id='id3']

CSS:   document book[title='CSS for dummies']
XPath: //document//book[@title='CSS for dummies']

 

This becomes more interesting when implementing pseudo-selectors in XPath:

CSS:   book:first-child
XPath: //book[not(preceding-sibling::*)]

CSS:   book:empty
XPath: //book[not(*|@*|node())]
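These XPath equivalents can be tried out with the JDK's own javax.xml.xpath API. The following is a small sketch under a few assumptions: the sample document and the names `PseudoSelectorXPath` and `texts` are invented for illustration, and the books deliberately carry no attributes, because the :empty expression above also tests for attributes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PseudoSelectorXPath {

    // Made-up sample: the second book has no children and no attributes
    static final String XML =
        "<books>"
      + "<book><title>A</title></book>"
      + "<book/>"
      + "<book><title>C</title></book>"
      + "</books>";

    // Evaluates the expression and returns the text content of each match
    static List<String> texts(String xml, String expression) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expression, document, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // book:first-child
        System.out.println(texts(XML, "//book[not(preceding-sibling::*)]"));
        // book:empty
        System.out.println(texts(XML, "//book[not(*|@*|node())]"));
    }
}
```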

 

A very nice library that allows for parsing selector expressions according to the w3c specification is “css-selectors” by Christer Sandberg:

https://github.com/chrsan/css-selectors

The next version of jOOX will include the css-selectors parser for simpler DOM navigation. The following two expressions will yield the same result:

Match match1 = $(document).find("book:empty");
Match match2 = $(document).xpath("//book[not(*|@*|node())]");

Use Xalan’s extension functions natively in jOOX

jOOX aims at increased ease of use when dealing with Java’s rather complex XML APIs. One example of such a complex API is Xalan, which has a lot of nice functionality, such as its extension namespaces. When you use Xalan, you may have heard of those extensions as documented here:

http://exslt.org

These extensions can typically be used in XSLT. An example is the math:max function:

<!-- Source -->
<values>
   <value>7</value>
   <value>11</value>
   <value>8</value>
   <value>4</value>
</values>

<!-- Stylesheet -->
<xsl:template match="values">
   <result>
      <xsl:text>Maximum: </xsl:text>
      <xsl:value-of select="math:max(value)" />
   </result>
</xsl:template>

<!-- Result -->
<result>Maximum: 11</result>

But in fact, math:max can be used in any type of XPath expression, including those created directly in Java. Here’s how you can do this:

Document document = // ... this is the DOM document

// Create an XPath object
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();

// Initialise Xalan extensions on the XPath object
xpath.setNamespaceContext(
  new org.apache.xalan.extensions.ExtensionNamespaceContext());
xpath.setXPathFunctionResolver(
  new org.apache.xalan.extensions.XPathFunctionResolverImpl());

// Evaluate an expression using an extension function
XPathExpression expression = xpath.compile(
  "//value[number(.) = math:max(//value)]");
NodeList result = (NodeList) expression.evaluate(
  document, XPathConstants.NODESET);

// Iterate over results
for (int i = 0; i < result.getLength(); i++) {
  System.out.println(result.item(i).getTextContent());
}
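To show what setNamespaceContext() and setXPathFunctionResolver() do under the hood, here is a sketch that wires up a hand-written math:max with nothing but the JDK. This is not Xalan's implementation: the class `CustomMathMax`, the `MAX` function and the `maxValue` helper are stand-ins invented for this example, and only the EXSLT math namespace URI is taken from the EXSLT specification. We assume the JDK's default XPath implementation with its default (non-secure-processing) settings, under which custom functions are invoked.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFunction;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CustomMathMax {

    // Namespace URI of the EXSLT math module
    static final String MATH_NS = "http://exslt.org/math";

    // Hand-written stand-in for math:max: largest number in a node set
    static final XPathFunction MAX = args -> {
        NodeList nodes = (NodeList) args.get(0);
        double max = Double.NaN;
        for (int i = 0; i < nodes.getLength(); i++) {
            double value = Double.parseDouble(
                nodes.item(i).getTextContent().trim());
            if (Double.isNaN(max) || value > max) {
                max = value;
            }
        }
        return max;
    };

    // Binds the "math" prefix to the EXSLT namespace
    static final NamespaceContext NAMESPACES = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "math".equals(prefix) ? MATH_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String uri) {
            return MATH_NS.equals(uri) ? "math" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String uri) {
            String prefix = getPrefix(uri);
            return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singleton(prefix).iterator();
        }
    };

    // Returns the text of the first value element holding the maximum
    static String maxValue(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(NAMESPACES);
        xpath.setXPathFunctionResolver((QName name, int arity) ->
            MATH_NS.equals(name.getNamespaceURI())
                && "max".equals(name.getLocalPart())
                && arity == 1 ? MAX : null);

        return xpath.evaluate(
            "//value[number(.) = math:max(//value)]", document);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(maxValue(
            "<values><value>7</value><value>11</value>"
          + "<value>8</value><value>4</value></values>"));
    }
}
```

Xalan's ExtensionNamespaceContext and XPathFunctionResolverImpl do the same kind of wiring for a whole family of extension functions, which is why the two setter calls above are all that is needed with Xalan on the classpath.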

jOOX is much more convenient

The above is pretty verbose. With jOOX, you can do exactly the same, but with a lot less code:

Document document = // ... this is the DOM document

// jOOX's xpath method already supports Xalan extensions
for (Match value : $(document).xpath(
    "//value[number(.) = math:max(//value)]").each()) {
  System.out.println(value.text());
}

jOOX answers many Stack Overflow questions

When you search for Stack Overflow questions regarding XML, DOM, XPath, JAXB, etc., you could very often answer them simply with an example involving jOOX. Take this question extract for example:

Goal

My goal is to achieve following from this ex xml file :

<root>
    <elemA>one</elemA>
    <elemA attribute1='first' attribute2='second'>two</elemA>
    <elemB>three</elemB>
    <elemA>four</elemA>
    <elemC>
        <elemB>five</elemB>
    </elemC>
</root>

to produce the following :

//root[1]/elemA[1]='one'
//root[1]/elemA[2]='two'
//root[1]/elemA[2][@attribute1='first']
//root[1]/elemA[2][@attribute2='second']
//root[1]/elemB[1]='three'
//root[1]/elemA[3]='four'
//root[1]/elemC[1]/elemB[1]='five'

jOOX answer

The above can be achieved quite simply with jOOX (compared to the other answers to the question involving XSLT, SAX, DOM, etc):

List<String> list = $(document).xpath("//*[not(*)]").map(new Mapper<String>() {
  public String map(Context context) {
    return $(context).xpath() + "='" + $(context).text() + "'";
  }
});

This will produce

/root[1]/elemA[1]='one'
/root[1]/elemA[2]='two'
/root[1]/elemB[1]='three'
/root[1]/elemA[3]='four'
/root[1]/elemC[1]/elemB[1]='five'

It is an “almost” solution to the OP’s problem, because jOOX does not (yet) support matching/mapping attributes. Hence, attributes will not produce any output. This will be implemented in the near future, though.
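Until then, the attribute lines can be produced with a bit of plain DOM code. The following is only a sketch with made-up names (`XPathLister`, `position`, `path`, `collect`). It is not how jOOX computes its xpath() values, and it emits single-slash paths like the jOOX output above rather than the OP's double-slash form.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XPathLister {

    // 1-based position among preceding siblings with the same element name
    static int position(Element element) {
        int position = 1;
        for (Node n = element.getPreviousSibling();
             n != null; n = n.getPreviousSibling()) {
            if (n instanceof Element
                    && n.getNodeName().equals(element.getNodeName())) {
                position++;
            }
        }
        return position;
    }

    // Absolute path such as /root[1]/elemA[2]
    static String path(Element element) {
        Node parent = element.getParentNode();
        String prefix = parent instanceof Element ? path((Element) parent) : "";
        return prefix + "/" + element.getNodeName()
            + "[" + position(element) + "]";
    }

    // Emits one line per leaf element and one line per attribute
    static void collect(Element element, List<String> out) {
        List<Element> children = new ArrayList<>();
        for (Node n = element.getFirstChild();
             n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add((Element) n);
            }
        }
        if (children.isEmpty()) {
            out.add(path(element) + "='"
                + element.getTextContent().trim() + "'");
        }
        // Attribute order depends on the DOM implementation
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node a = attributes.item(i);
            out.add(path(element) + "[@" + a.getNodeName()
                + "='" + a.getNodeValue() + "']");
        }
        for (Element child : children) {
            collect(child, out);
        }
    }

    static List<String> list(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        list("<root><elemA>one</elemA>"
           + "<elemA attribute1='first'>two</elemA>"
           + "<elemC><elemB>five</elemB></elemC></root>")
            .forEach(System.out::println);
    }
}
```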

See the full answer and question here:

http://stackoverflow.com/questions/4746299/generate-get-xpath-from-xml-node-java/8943144#8943144

jOOX and JAXB

jOOX has been awfully quiet lately due to increased development focus in jOOQ. Nevertheless, the jOOX feature roadmap is full of promising new features. Unlike its inspiration jQuery, jOOX is positioning itself in the Java world, where many XML APIs already exist. One of the most important XML APIs in Java is JAXB, a very simple means of mapping XML to Java through annotations (see also my blog stream on the subject of Annotatiomania™).

Let’s have a look at this small XML document

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<customer id="13">
    <age>30</age>
    <name>Lukas</name>
</customer>

Typically, we would write a Java class like this to map to the above XML document:

@XmlRootElement
public class Customer {
    String name;
    int age;
    int id;

    public String getName() {
        return name;
    }

    @XmlElement
    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    @XmlElement
    public void setAge(int age) {
        this.age = age;
    }

    public int getId() {
        return id;
    }

    @XmlAttribute
    public void setId(int id) {
        this.id = id;
    }
}

And then, we would marshal / unmarshal the above using the following code snippet:

JAXB.marshal(new Customer(), System.out);
Customer c = JAXB.unmarshal(xml, Customer.class);

JAXB and jOOX

This is very neat and convenient. But it gets even better when JAXB is used along with jOOX. Have a look at the following piece of code:

// Use the $ method to wrap a JAXB-annotated object:
$(new Customer());

// Navigate to customer elements in XML:
String id   = $(new Customer()).id();
String name = $(new Customer()).find("name").text();

// Modify the XML structure, and unmarshal it again into 
// a JAXB-annotated object:
Match match = $(new Customer());
match.find("name").text("Peter");
Customer modified = match.unmarshalOne(Customer.class);

Check back soon on jOOX for new features!