One of the first things you'll probably want to do is parse an XML document. This is easy to do in <dom4j>, as the following code demonstrates.
import java.net.URL;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class Foo {

    public Document parse(URL url) throws DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(url);
        return document;
    }
}
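SAXReader.read() is overloaded for sources other than a URL. The following is a minimal sketch, assuming the XML is already held in memory, that parses from a java.io.Reader instead. The class name ParseFromString and the sample markup are illustrative and not part of <dom4j>.

```java
import java.io.StringReader;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class ParseFromString {

    // Parses XML held in a String and returns the name of its root element.
    public static String rootName(String xml) throws DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new StringReader(xml));
        return document.getRootElement().getName();
    }

    public static void main(String[] args) throws DocumentException {
        // Illustrative input; any well-formed XML works here.
        System.out.println(rootName("<root><foo/></root>"));
    }
}
```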
A document can be navigated using a variety of methods that return standard Java Iterators. For example
public void bar(Document document) throws DocumentException {
    Element root = document.getRootElement();

    // iterate through child elements of root
    for (Iterator<Element> it = root.elementIterator(); it.hasNext();) {
        Element element = it.next();
        // do something
    }

    // iterate through child elements of root with element name "foo"
    for (Iterator<Element> it = root.elementIterator("foo"); it.hasNext();) {
        Element foo = it.next();
        // do something
    }

    // iterate through attributes of root
    for (Iterator<Attribute> it = root.attributeIterator(); it.hasNext();) {
        Attribute attribute = it.next();
        // do something
    }
}
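To make the "do something" placeholders concrete, here is a small sketch that uses a named element iterator to collect an attribute value from each matching child. The class name IteratorExample, the method authorNames and the author/name markup are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class IteratorExample {

    // Collects the "name" attribute of every <author> child of the root element.
    public static List<String> authorNames(String xml) throws DocumentException {
        Document document = DocumentHelper.parseText(xml);
        Element root = document.getRootElement();

        List<String> names = new ArrayList<>();
        for (Iterator<Element> it = root.elementIterator("author"); it.hasNext();) {
            Element author = it.next();
            names.add(author.attributeValue("name"));
        }
        return names;
    }

    public static void main(String[] args) throws DocumentException {
        // The <editor> child is skipped because only "author" elements are iterated.
        String xml = "<root><author name='James'/><editor name='Ann'/><author name='Bob'/></root>";
        System.out.println(authorNames(xml));
    }
}
```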
In <dom4j> XPath expressions can be evaluated on the Document or on any Node in the tree (such as Attribute, Element or ProcessingInstruction). This allows complex navigation throughout the document with a single line of code.
For example
public void bar(Document document) {
    List<Node> list = document.selectNodes("//foo/bar");

    Node node = document.selectSingleNode("//foo/bar/author");

    String name = node.valueOf("@name");
}
For example if you wish to find all the hypertext links in an XHTML document the following code would do the trick.
public void findLinks(Document document) throws DocumentException {
    List<Node> list = document.selectNodes("//a/@href");

    for (Iterator<Node> iter = list.iterator(); iter.hasNext();) {
        Attribute attribute = (Attribute) iter.next();
        String url = attribute.getValue();
    }
}
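The link-finding approach can be sketched as a self-contained class that returns the URLs rather than discarding them. This assumes the Jaxen library, which <dom4j> relies on for XPath evaluation, is on the classpath. The class name LinkFinder and the sample markup are illustrative. The sample deliberately uses no namespace: in a real XHTML document the elements are normally in the XHTML namespace, and an unprefixed //a would then not match them.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;

public class LinkFinder {

    // Returns the value of every href attribute on an <a> element, in document order.
    public static List<String> findLinks(String markup) throws DocumentException {
        Document document = DocumentHelper.parseText(markup);

        List<String> urls = new ArrayList<>();
        for (Node node : document.selectNodes("//a/@href")) {
            // The XPath selects attribute nodes, so the cast is safe here.
            Attribute attribute = (Attribute) node;
            urls.add(attribute.getValue());
        }
        return urls;
    }

    public static void main(String[] args) throws DocumentException {
        String markup = "<html><body>"
                + "<a href='http://example.com/a'>A</a>"
                + "<p><a href='/b'>B</a></p>"
                + "<a>no link</a>"
                + "</body></html>";
        System.out.println(findLinks(markup));
    }
}
```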
If you need any help learning the XPath language we highly recommend the Zvon tutorial which allows you to learn by example.
If you ever have to walk a large XML document tree then for performance we recommend you use the fast looping method which avoids the cost of creating an Iterator object for each loop. For example
public void treeWalk(Document document) {
    treeWalk(document.getRootElement());
}

public void treeWalk(Element element) {
    for (int i = 0, size = element.nodeCount(); i < size; i++) {
        Node node = element.node(i);
        if (node instanceof Element) {
            treeWalk((Element) node);
        } else {
            // do something…
        }
    }
}
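As a concrete use of the index-based walk, the sketch below counts every element in a tree using the same nodeCount() and node(int) calls. The class name TreeWalkCount and the method countElements are hypothetical names introduced for this example.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;

public class TreeWalkCount {

    // Counts the given element plus all of its descendant elements,
    // using index-based access instead of creating Iterator objects.
    public static int countElements(Element element) {
        int count = 1; // the element itself
        for (int i = 0, size = element.nodeCount(); i < size; i++) {
            Node node = element.node(i);
            if (node instanceof Element) {
                count += countElements((Element) node);
            }
            // text, comments and other node types are ignored
        }
        return count;
    }

    public static void main(String[] args) throws DocumentException {
        Document document = DocumentHelper.parseText("<root><a><b/></a><c/>text</root>");
        // root, a, b and c are counted; the text node is not
        System.out.println(countElements(document.getRootElement())); // prints 4
    }
}
```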
Often in <dom4j> you will need to create a new document from scratch. Here's an example of doing that.
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class Foo {

    public Document createDocument() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement("root");

        Element author1 = root.addElement("author")
            .addAttribute("name", "James")
            .addAttribute("location", "UK")
            .addText("James Strachan");

        Element author2 = root.addElement("author")
            .addAttribute("name", "Bob")
            .addAttribute("location", "US")
            .addText("Bob McWhirter");

        return document;
    }
}
A quick and easy way to write a Document (or any Node) to a Writer is via the write() method.
FileWriter out = new FileWriter("foo.xml");
document.write(out);
out.close();
If you want to be able to change the format of the output, such as pretty printing or a compact format, or you want to be able to work with Writer objects or OutputStream objects as the destination, then you can use the XMLWriter class.
import java.io.FileWriter;
import java.io.IOException;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class Foo {

    public void write(Document document) throws IOException {
        // let's write to a file
        try (FileWriter fileWriter = new FileWriter("output.xml")) {
            XMLWriter writer = new XMLWriter(fileWriter);
            writer.write(document);
            writer.close();
        }

        // Pretty print the document to System.out
        // (declared here because the writer above is scoped to the try block)
        OutputFormat format = OutputFormat.createPrettyPrint();
        XMLWriter writer = new XMLWriter(System.out, format);
        writer.write(document);

        // Compact format to System.out
        format = OutputFormat.createCompactFormat();
        writer = new XMLWriter(System.out, format);
        writer.write(document);
        // flush rather than close, so that System.out stays usable
        writer.flush();
    }
}
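Because XMLWriter accepts any Writer, the same approach can render a document into a String, which is handy for logging or tests. The following is a sketch only; the class name PrettyPrintExample, its method names and the sample document are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class PrettyPrintExample {

    // Renders the document as indented XML text.
    public static String toPrettyString(Document document) throws IOException {
        StringWriter out = new StringWriter();
        XMLWriter writer = new XMLWriter(out, OutputFormat.createPrettyPrint());
        writer.write(document);
        writer.close(); // closing a StringWriter has no effect on its contents
        return out.toString();
    }

    // Builds a small sample document and pretty prints it.
    public static String demo() throws IOException {
        Document document = DocumentHelper.createDocument();
        document.addElement("root")
            .addElement("author")
            .addAttribute("name", "James")
            .addText("James Strachan");
        return toPrettyString(document);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```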
If you have a reference to a Document or any other Node such as an Attribute or Element, you can turn it into the default XML text via the asXML() method.
Document document = …;
String text = document.asXML();
If you have some XML as a String you can parse it back into a Document again using the helper method DocumentHelper.parseText():
String text = "<person> <name>James</name> </person>";
Document document = DocumentHelper.parseText(text);
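The two conversions can be combined into a round trip. The sketch below parses a String, serializes it with asXML(), parses the result again and reads a value back out. The class name RoundTrip and the method nameAfterRoundTrip are illustrative names, not part of <dom4j>.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

public class RoundTrip {

    // Parses the text, serializes it, parses it again and returns
    // the text of the root element's <name> child.
    public static String nameAfterRoundTrip(String xml) throws DocumentException {
        Document first = DocumentHelper.parseText(xml);
        String serialized = first.asXML();
        Document second = DocumentHelper.parseText(serialized);
        return second.getRootElement().elementText("name");
    }

    public static void main(String[] args) throws DocumentException {
        System.out.println(nameAfterRoundTrip("<person> <name>James</name> </person>"));
    }
}
```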
Styling a Document with XSLT
Applying XSLT to a Document is quite straightforward using the JAXP API from Oracle. This allows you to work against any XSLT engine such as Xalan or Saxon. Here is an example of using JAXP to create a transformer and then applying it to a Document.
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

import org.dom4j.Document;
import org.dom4j.io.DocumentResult;
import org.dom4j.io.DocumentSource;

public class Foo {

    public Document styleDocument(Document document, String stylesheet) throws Exception {
        // load the transformer using JAXP
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(new StreamSource(stylesheet));

        // now let's style the given document
        DocumentSource source = new DocumentSource(document);
        DocumentResult result = new DocumentResult();
        transformer.transform(source, result);

        // return the transformed document
        Document transformedDoc = result.getDocument();
        return transformedDoc;
    }
}
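Note that the StreamSource(String) constructor treats its argument as a system ID (a URI pointing at the stylesheet), not as the stylesheet text. As a sketch of the alternative, assuming the stylesheet is held in memory, it can be wrapped in a StringReader instead. The class name InMemoryStyle, the stylesheet and the sample input are all invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.DocumentResult;
import org.dom4j.io.DocumentSource;

public class InMemoryStyle {

    // Illustrative stylesheet: turns each <author name="..."/> into <name>...</name>.
    private static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"/root\">"
        + "<names><xsl:for-each select=\"author\">"
        + "<name><xsl:value-of select=\"@name\"/></name>"
        + "</xsl:for-each></names>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the in-memory stylesheet to a dom4j Document.
    public static Document style(Document document) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(STYLESHEET)));

        DocumentSource source = new DocumentSource(document);
        DocumentResult result = new DocumentResult();
        transformer.transform(source, result);
        return result.getDocument();
    }

    // Transforms a sample document and returns the resulting names, comma-separated.
    public static String demo() throws Exception {
        Document input = DocumentHelper.parseText(
            "<root><author name='James'/><author name='Bob'/></root>");
        Document output = style(input);

        List<String> names = new ArrayList<>();
        for (Element name : output.getRootElement().elements("name")) {
            names.add(name.getText());
        }
        return String.join(",", names);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```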
All artifacts in the Maven repository are signed with the following PGP key:
-----BEGIN PGP PUBLIC KEY BLOCK-----
Comment: Thumbprint: 8F9A3C6D105B9F57844A721D79E193516BE7998F
mQENBFWdcSoBCADK8j+0eVZKUGctZo/VaJ/K2Wppx4jEFgih8xiIWREQ9B3QEugJ
mJMWZHhrnHB+sjVx5No482ch6sVhYmC+VMyTdzepItZ8beYa0pnNGJnrFT+HcTOS
g21Ef5e6BRORNho2j9YTvxvjof29XxU4SJFVgffs48jGeJzN1EDmOz4OlZupKGU+
98o+kMKCiFjcf6Vu03asuml97b2fMOJ09n+UQVlZbBR/Yo407ZLkL2Elx47Fz+82
iO+M8w2qNnxT4PA/TLgaVzkVHaR/JIDlQQ4DfuyloQI1hBpMB8f60oukVr5dBGuS
1dPZ1H7td975sLegWoj7CCOFZXrDzYUXwwXPABEBAAG0IEZpbGlwIEppcnPDoWsg
PGZpbGlwQGppcnNhay5vcmc+iQE5BBMBCAAjBQJVnXEqAhsDBwsJCAcDAgEGFQgC
CQoLBBYCAwECHgECF4AACgkQeeGTUWvnmY/Z6wgArX9fzySIVWcqFuhaIlRlib9j
1qE3sSiFVENV4NrCYv+4ZUQUEUvqwX0F4ij+Au9fzvaWb0gT02ErHYJ9UowUgUYb
IdHsifoGh05jZdiClXJutcQHddM+P+ReIAS4/JDlXza1kqa1RRvDh+OtsrDkL1MU
a5T6/KbCWzAj9+96vqa2dLO0mhyrPpVX/hF4tPY6ltGYEXA9N3c83rFmaCZTNM0t
sEQniQMICOMZul2dKJ4Tev12/G9sd4owtlHtAtv0+tFPDMPQAXjToUo36q9MIzKE
Cyz5sX64QRablAJc7QD4MFI/7J6eQdpSKM77QaL48kcUAK1j9nlXv+oj/1d437kB
DQRVnXEqAQgAonYI6XgMnKL5jj1n/3kVxKA+4m0znSoMutK3B2D3geqTzFWlDIWU
EOEE00U2mBMPUibQ9orbu5IYrbXLR6t0QORJiHudP3LxdtjIqXCagdzCewJ0Kfvd
pR/a65dsULLu4+v8R7KBH+lBVs0aN0z8e539ZaoGPCVaWliybbHwcry4tOMu9wyB
dPlt0pkqQ7y+YerXgHO+hc9urQVY9zHVBRe1J2vqzFONitFlD5BoT386pz8tBi0W
32J46nTgReukzJWLbtV53fxYAFUroA7Ydy2xYKQ2yVqBq9NraUNqbdtlEhJRDS3W
eQs4ittg+oyMumIdNjSbUlbDX0O7EP16KQARAQABiQEfBBgBCAAJBQJVnXEqAhsM
AAoJEHnhk1Fr55mPAAUH/itFMvGq/ri1alRXhLbhx8/HmwBBkgS8wCu/oIIPEZ4W
jRB8EfEYAMbmqtmbGFc/lL2QSxvqAcsUGFlVqRe+Ux9LilQx/84zvD6aG90eTzfF
pNUHkgBOS7poRbDggVaCSuDYKiyTc07hHNl4iZON3VSiOaXf/4rzbIzv0n0swc0s
00N1IcwI/pP+74t+tmfH4PUjZwUC6cXHMHSfvImAO2hPMAbd3rJ/ZO/ZVwjNocjR
5fQj/MSOgl5hiXEkuBdoqoD0lTJMYCwPgwPGNcBr2xeXOKxeIlbYGwh/j3AsK0Op
uqUJfZ5wvADbdmco+6Piann1q0WvhfmRaie7IPG2tB0=
=ZbfA
-----END PGP PUBLIC KEY BLOCK-----