CSS selectors in Java

CSS selectors are a nice and intuitive alternative to XPath for DOM navigation. While XPath is more complete and has more functionality, CSS selectors were tailored for HTML DOM, where the document content is usually less structured than in XML.

Here are some examples of CSS selector and equivalent XPath expressions:

CSS:   document > library > books > book
XPath: //document/library/books/book

CSS:   document book
XPath: //document//book

CSS:   document book#id3
XPath: //document//book[@id='id3']

CSS:   document book[title='CSS for dummies']
XPath: //document//book[@title='CSS for dummies']

 

This becomes more interesting when implementing pseudo-selectors in XPath:

CSS:   book:first-child
XPath: //book[not(preceding-sibling::*)]

CSS:   book:empty
XPath: //book[not(*|text())]
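To check such equivalences yourself, the XPath side can be evaluated with nothing but the JDK. The following is a minimal sketch, not part of jOOX or css-selectors: the sample document, the class name and the ids helper are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SelectorXPathDemo {

    // Hypothetical sample document, made up for this illustration
    static final String XML =
          "<document><library><books>"
        + "<book id=\"b1\" title=\"CSS for dummies\"/>"
        + "<book id=\"b2\" title=\"XPath for dummies\"/>"
        + "</books></library></document>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates an XPath expression and collects the id attribute
    // of every matched element, in document order
    static List<String> ids(Document document, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expression, document, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(((Element) nodes.item(i)).getAttribute("id"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(XML);

        // CSS: document > library > books > book
        System.out.println(ids(document, "//document/library/books/book"));

        // CSS: document book[title='CSS for dummies']
        System.out.println(ids(document, "//document//book[@title='CSS for dummies']"));

        // CSS: book:first-child
        System.out.println(ids(document, "//book[not(preceding-sibling::*)]"));
    }
}
```

With this sample document, the first expression matches both books, while the other two match only the first one.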

 

A very nice library for parsing selector expressions according to the W3C specification is “css-selectors” by Christer Sandberg:

https://github.com/chrsan/css-selectors

The next version of jOOX will include the css-selectors parser for simpler DOM navigation. The following two expressions will yield the same result:

Match match1 = $(document).find("book:empty");
Match match2 = $(document).xpath("//book[not(*|text())]");

Use Xalan’s extension functions natively in jOOX

jOOX, a jQuery port to Java, aims to make Java’s rather complex XML APIs easier to use. One example of such a complex API is Xalan, which has a lot of nice functionality, such as its extension namespaces. If you use Xalan, you may have come across these extensions, which are documented here:

http://exslt.org

These extensions can typically be used in XSLT. An example is the math:max function:

<!-- Source -->
<values>
   <value>7</value>
   <value>11</value>
   <value>8</value>
   <value>4</value>
</values>

<!-- Stylesheet -->
<xsl:template match="values">
   <result>
      <xsl:text>Maximum: </xsl:text>
      <xsl:value-of select="math:max(value)" />
   </result>
</xsl:template>

<!-- Result -->
<result>Maximum: 11</result>

But in fact, math:max can be used in any type of XPath expression, including those created directly in Java. Here’s how you can do this:

Document document = // ... this is the DOM document

// Create an XPath object
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();

// Initialise Xalan extensions on the XPath object
xpath.setNamespaceContext(
  new org.apache.xalan.extensions.ExtensionNamespaceContext());
xpath.setXPathFunctionResolver(
  new org.apache.xalan.extensions.XPathFunctionResolverImpl());

// Evaluate an expression using an extension function
XPathExpression expression = xpath.compile(
  "//value[number(.) = math:max(//value)]");
NodeList result = (NodeList) expression.evaluate(
  document, XPathConstants.NODESET);

// Iterate over results
for (int i = 0; i < result.getLength(); i++) {
  System.out.println(result.item(i).getTextContent());
}
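If Xalan is not on your classpath, the same two hooks (a NamespaceContext and an XPathFunctionResolver) can be filled in by hand. The following is a rough sketch that runs on the JDK alone. The max function in it is my own stand-in and not Xalan's EXSLT implementation, and the class and constant names are made up for this illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFunction;
import javax.xml.xpath.XPathFunctionResolver;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MathMaxResolverDemo {

    static final String MATH_NS = "http://exslt.org/math";

    static final String XML =
        "<values><value>7</value><value>11</value>"
      + "<value>8</value><value>4</value></values>";

    // Binds the "math" prefix used in the expression to the EXSLT math namespace
    static final NamespaceContext NAMESPACES = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "math".equals(prefix) ? MATH_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return MATH_NS.equals(namespaceURI) ? "math" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singletonList(prefix).iterator();
        }
    };

    // Hand-written stand-in for math:max (an assumption of this sketch):
    // returns the largest numeric value in the node set passed as first argument
    static final XPathFunction MAX = args -> {
        NodeList nodes = (NodeList) args.get(0);
        double max = Double.NaN;
        for (int i = 0; i < nodes.getLength(); i++) {
            double value = Double.parseDouble(nodes.item(i).getTextContent().trim());
            if (Double.isNaN(max) || value > max) {
                max = value;
            }
        }
        return max;
    };

    // Resolves math:max with one argument, and nothing else
    static final XPathFunctionResolver RESOLVER = (name, arity) ->
        MATH_NS.equals(name.getNamespaceURI())
            && "max".equals(name.getLocalPart())
            && arity == 1 ? MAX : null;

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the first <value> element holding the maximum
    static String maxValue(Document document) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(NAMESPACES);
        xpath.setXPathFunctionResolver(RESOLVER);
        return xpath.evaluate("//value[number(.) = math:max(//value)]", document);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(maxValue(parse(XML)));
    }
}
```

This shows what the two Xalan classes contribute: one maps prefixes to namespaces, the other maps qualified function names to implementations.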

jOOX is much more convenient

The above is pretty verbose. With jOOX, you can do exactly the same, but with a lot less code:

Document document = // ... this is the DOM document

// jOOX's xpath method already supports Xalan extensions
for (Match value : $(document).xpath(
    "//value[number(.) = math:max(//value)]").each()) {
  System.out.println(value.text());
}

jOOX answers many Stack Overflow questions

When you search Stack Overflow for questions regarding XML, DOM, XPath, JAXB, etc., you can very often answer them simply with an example involving jOOX. Take this question extract, for example:

Goal

My goal is to achieve following from this ex xml file :

<root>
    <elemA>one</elemA>
    <elemA attribute1='first' attribute2='second'>two</elemA>
    <elemB>three</elemB>
    <elemA>four</elemA>
    <elemC>
        <elemB>five</elemB>
    </elemC>
</root>

to produce the following :

//root[1]/elemA[1]='one'
//root[1]/elemA[2]='two'
//root[1]/elemA[2][@attribute1='first']
//root[1]/elemA[2][@attribute2='second']
//root[1]/elemB[1]='three'
//root[1]/elemA[3]='four'
//root[1]/elemC[1]/elemB[1]='five'

jOOX answer

The above can be achieved quite simply with jOOX (compared to the other answers to the question involving XSLT, SAX, DOM, etc):

List<String> list = $(document).xpath("//*[not(*)]").map(new Mapper<String>() {
  public String map(Context context) {
    return $(context).xpath() + "='" + $(context).text() + "'";
  }
});

This will produce

/root[1]/elemA[1]='one'
/root[1]/elemA[2]='two'
/root[1]/elemB[1]='three'
/root[1]/elemA[3]='four'
/root[1]/elemC[1]/elemB[1]='five'

It is an “almost” solution to the OP’s problem, because jOOX does not (yet) support matching/mapping attributes. Hence, attributes will not produce any output. This will be implemented in the near future, though.
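For comparison, here is a rough sketch of how the complete list, attributes included, could be produced with the plain JDK DOM API. This is not jOOX code; the class and method names are made up for this illustration, and values containing single quotes are not escaped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathLister {

    static final String XML =
          "<root>"
        + "<elemA>one</elemA>"
        + "<elemA attribute1='first' attribute2='second'>two</elemA>"
        + "<elemB>three</elemB>"
        + "<elemA>four</elemA>"
        + "<elemC><elemB>five</elemB></elemC>"
        + "</root>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Builds an absolute path such as /root[1]/elemA[2] by walking up to the
    // document node and counting preceding siblings with the same name
    static String path(Element element) {
        StringBuilder sb = new StringBuilder();
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            int index = 1;
            for (Node s = n.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
                if (s instanceof Element && s.getNodeName().equals(n.getNodeName())) {
                    index++;
                }
            }
            sb.insert(0, "/" + n.getNodeName() + "[" + index + "]");
        }
        return sb.toString();
    }

    static boolean hasChildElements(Element element) {
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                return true;
            }
        }
        return false;
    }

    // One entry per leaf element text, followed by one entry per attribute
    static List<String> list(Document document) {
        List<String> result = new ArrayList<>();
        NodeList all = document.getElementsByTagName("*"); // document order

        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            String path = path(element);

            if (!hasChildElements(element)) {
                result.add(path + "='" + element.getTextContent() + "'");
            }

            // NamedNodeMap guarantees no order, so sort attributes by name
            Map<String, String> sorted = new TreeMap<>();
            NamedNodeMap attributes = element.getAttributes();
            for (int j = 0; j < attributes.getLength(); j++) {
                Node a = attributes.item(j);
                sorted.put(a.getNodeName(), a.getNodeValue());
            }
            for (Map.Entry<String, String> a : sorted.entrySet()) {
                result.add(path + "[@" + a.getKey() + "='" + a.getValue() + "']");
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (String line : list(parse(XML))) {
            System.out.println(line);
        }
    }
}
```

Compared with the jOOX one-liner, most of this code is spent on walking siblings and parents by hand, which is exactly the kind of work a fluent wrapper can hide.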

See the full answer and question here:

http://stackoverflow.com/questions/4746299/generate-get-xpath-from-xml-node-java/8943144#8943144

jOOX and JAXB

jOOX has been awfully quiet lately due to increased development focus on jOOQ. Nevertheless, the jOOX feature roadmap is full of promising new features. Unlike its inspiration jQuery, jOOX is positioning itself in the Java world, where many XML APIs already exist. One of the most important XML APIs in Java is JAXB, a very simple means of mapping XML to Java through annotations (see also my blog stream on the subject of Annotatiomania™).

Let’s have a look at this small XML document

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<customer id="13">
    <age>30</age>
    <name>Lukas</name>
</customer>

Typically, we would write a Java class like this to map to the above XML document:

@XmlRootElement
public class Customer {
    String name;
    int age;
    int id;

    public String getName() {
        return name;
    }

    @XmlElement
    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    @XmlElement
    public void setAge(int age) {
        this.age = age;
    }

    public int getId() {
        return id;
    }

    @XmlAttribute
    public void setId(int id) {
        this.id = id;
    }
}

And then, we would marshal / unmarshal the above using the following code snippet:

JAXB.marshal(new Customer(), System.out);
Customer c = JAXB.unmarshal(xml, Customer.class);

JAXB and jOOX

This is very neat and convenient. But it gets even better when JAXB is used along with jOOX. Have a look at the following piece of code:

// Use the $ method to wrap a JAXB-annotated object:
$(new Customer());

// Navigate to customer elements in XML:
String id   = $(new Customer()).id();
String name = $(new Customer()).find("name").text();

// Modify the XML structure, and unmarshal it again into 
// a JAXB-annotated object:
Match match = $(new Customer());
match.find("name").text("Peter");
Customer modified = match.unmarshalOne(Customer.class);

Check back soon on jOOX for new features!

Loading CSV data with jOOQ

After the recent efforts made in jOOX, development of jOOQ has resumed. The main new feature of the upcoming release 1.6.5 is support for loading CSV data. The jOOQ Factory will now provide access to a dedicated fluent API for loading CSV files into generated tables, specifying a field mapping and various other parameters related to the batch processing of bulk loads. Here is some sample code showing what the API might look like:

// The typical jOOQ factory
Factory create = new Factory(connection, SQLDialect.ORACLE);

// Configure and execute a Loader object
Loader<TAuthor> loader =
create.loadInto(AUTHOR)
      .onDuplicateKeyError()
      .onErrorAbort()
      .commitAll()
      .loadCSV("1;'Kafka'\n" +
               "2;Frisch")
      .fields(Author.ID, Author.LAST_NAME)
      .quote('\'')
      .separator(';')
      .ignoreRows(0)
      .execute();

// The resulting Loader object then holds various
// information about the loading process:

// The number of processed rows
int processed = loader.processed();

// The number of stored rows (INSERT or UPDATE)
int stored = loader.stored();

// The number of ignored rows (due to errors, or duplicate rule)
int ignored = loader.ignored();

// The errors that may have occurred during loading
List<LoaderError> errors = loader.errors();
LoaderError error = errors.get(0);

// The exception that caused the error
SQLException exception = error.exception();

// The row that caused the error
int rowIndex = error.rowIndex();
String[] row = error.row();

// The query that caused the error
Query query = error.query();

Along with the previously implemented export API, it is easy to export results from org.jooq.Result into CSV, let users modify them in Excel or any other office software, and upload the CSV again. Other ideas for future versions of jOOQ will also include loading data from XML and JSON data sources, “merging” data (i.e. including DELETE operations), etc.
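To make the meaning of the separator, quote and ignoreRows parameters concrete, here is a small stand-alone sketch of how such input might be split into rows. It is not jOOQ's loader implementation; the class and method names are made up, and it assumes that quoted fields do not contain line breaks.

```java
import java.util.ArrayList;
import java.util.List;

class CsvSketch {

    // Splits CSV text into rows, skipping the first ignoreRows lines.
    // Assumption of this sketch: no line breaks inside quoted fields.
    static List<String[]> parse(String csv, char separator, char quote, int ignoreRows) {
        List<String[]> rows = new ArrayList<>();
        String[] lines = csv.split("\r?\n");
        for (int i = ignoreRows; i < lines.length; i++) {
            rows.add(splitLine(lines[i], separator, quote));
        }
        return rows;
    }

    // Splits a single line, treating separators inside quotes as data
    static String[] splitLine(String line, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean quoted = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);

            if (c == quote) {
                if (quoted && i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    // A doubled quote inside a quoted field is a literal quote
                    field.append(quote);
                    i++;
                } else {
                    quoted = !quoted;
                }
            } else if (c == separator && !quoted) {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }

        fields.add(field.toString());
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String[]> rows = parse("1;'Kafka'\n2;Frisch", ';', '\'', 0);
        for (String[] row : rows) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Each resulting String[] corresponds to what the loader reports as a row, for instance in LoaderError.row().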

Feedback is very welcome.

FluentDOM, another mimic of jQuery DOM manipulation, in PHP

jQuery’s API style seems to be winning over other XML APIs in many languages. Here is another example of a nice jQuery port, this time to PHP: FluentDOM.

http://fluentdom.github.com/

Similar to jOOX, FluentDOM aims to combine a jQuery-like fluent API with XPath and general DOM XML manipulation. Here are some simple examples taken from the FluentDOM documentation:

// read a file and set the message tag's content
echo FluentDOM($xmlFile)
  ->find('/message')
  ->text('Hello World!');

// Find the <root> first then the second element in it
var_dump($fd->find('/root')->find('*[2]')->item(0)->textContent);

// Append elements to an object
$menu
  ->append('<li/>')
  ->append('<a/>')
  ->attr('href', '/sample.php')
  ->text('Sample');

I’m in contact with the developers of FluentDOM. As always with OSS, there is great potential for synergy, which in the end will make both products better. For jOOX, this means that loading of files/streams is going to be a nice plus. XPath is already implemented in the upcoming release 0.9.2. On the other hand, maybe FluentDOM can get inspiration from jOOX’s document creation syntax (which isn’t part of jQuery):

$("root",
  $("element",
    $("child", "text"),
    $("child", "more text")));

… which will create

<root>
  <element>
    <child>text</child>
    <child>more text</child>
  </element>
</root>
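For readers wondering how such nested creation calls could work, here is a minimal sketch on top of the JDK’s DOM API. It is not jOOX’s implementation: the El class and the toDocument helper are hypothetical names invented for this illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NestedBuilderSketch {

    // Hypothetical description of an element: a name plus either
    // text content or a list of child descriptions
    static final class El {
        final String name;
        final String text; // null when the element has children instead
        final El[] children;

        El(String name, String text, El[] children) {
            this.name = name;
            this.text = text;
            this.children = children;
        }
    }

    // $("child", "text") describes an element with text content
    static El $(String name, String text) {
        return new El(name, text, new El[0]);
    }

    // $("root", $(...), $(...)) describes an element with child elements
    static El $(String name, El... children) {
        return new El(name, null, children);
    }

    // Turns the description tree into a real DOM document
    static Document toDocument(El root) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .newDocument();
        document.appendChild(build(document, root));
        return document;
    }

    static Element build(Document document, El el) {
        Element element = document.createElement(el.name);
        if (el.text != null) {
            element.setTextContent(el.text);
        }
        for (El child : el.children) {
            element.appendChild(build(document, child));
        }
        return element;
    }

    public static void main(String[] args) throws Exception {
        Document document = toDocument(
            $("root",
              $("element",
                $("child", "text"),
                $("child", "more text"))));

        System.out.println(document.getElementsByTagName("child").getLength());
    }
}
```

The nesting of the method calls mirrors the nesting of the resulting elements, which is what makes this style pleasant to read.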

Excited as always, let’s get back to hacking! :-)

See the latest progress here: http://code.google.com/p/joox/

Another Fluent API: jOOX. Porting jQuery to Java

Recently, in my everyday programming madness, I really felt the urge to kill someone involved with the formal specification of DOM. The beloved Document Object Model. While everyone understands that this API is complete in functionality and scope, that it’s a standard, and that it’s almost the same in every language... well, it’s incredibly verbose. Manipulating XML is about as fun and exciting as doing the dishes after a 2000-person Indian wedding.

And then, suddenly, I remembered that this is how I had felt about Java’s support for advanced SQL, and how JPA/CriteriaQuery had made me feel like that poor dishwasher before. And I wondered whether someone else had felt the same way. So I asked this question on Stack Overflow:

http://stackoverflow.com/questions/6996013/a-nice-java-xml-dom-utility

And I got the expected answers about JDOM and dom4j. Two dinosaur projects that are neither sexier nor more efficient than the standard itself (e.g. Xerces). See this answer about performance:

http://stackoverflow.com/questions/6996013/a-nice-java-xml-dom-utility#6998870

I had also found one project that has a somewhat fluent approach:

http://code.google.com/p/xmltool/

It looks quite nice, actually, although it is a bit biased towards DOM creation, not navigation. And then, it struck me like lightning: “Why hasn’t anyone ported jQuery to Java yet??” jQuery is exactly how an XML API should be: Awesome. Fluent. Fun, and efficient to use. So I tried to hack something together that looks like jQuery, and that’s the beginning of another product in the “jOO-Star suite”: jOOX, with X for XML! I wanted this to be fluent, fun, and efficient to use. Like jOOQ. So jOOX will be an attempt at doing precisely that. Here’s an example of what jOOX code looks like:

// Find the order at index 4 and add an element "paid"
joox(document).find("orders")
              .children()
              .eq(4)
              .append("<paid>true</paid>");

// Find those orders that are paid and flag them as "settled"
joox(document).find("orders")
              .children()
              .find("paid")
              .after("<settled>true</settled>");

This rapid prototype of a jQuery port looks very promising to me, even if the most important features aren’t there yet (e.g. navigation with expression languages, selectors, etc). With Java’s static typing and without all the browser-related issues and JavaScript event handling and CSS and all that, pure DOM navigation and manipulation is actually not that hard to wrap. In any case, I have now even more respect for the jQuery guys, as I’m just touching the tip of the iceberg.

So in the future, I will also post one or two entries about jOOX on this blog. Looking forward to feedback!

Download jOOX from Google Code:

http://code.google.com/p/joox/