
XML Namespaces

In order to “play well with others” in the XML world, you need to understand and use XML Namespaces. The good news is that namespaces are easy to use. XML Namespaces is among the shortest XML-related specifications and can be read in about fifteen minutes. I only recommend reading most of the XML-related specifications, however, if you are having trouble sleeping.

Namespaces have a reputation for being confusing and frustrating. If you use namespaces incorrectly, you will indeed be signing yourself up for some frustration. And there are some esoteric points with namespaces that are difficult to understand. Most of the frustration comes from the fact that you can use many XML technologies successfully without using namespaces, but once namespaces are on the scene, some of your coding needs to be adjusted to account for them.

You don’t have to do much to support namespaces, so it’s best to use namespaces from the start and avoid code changes later. In this essay we’ll learn the basics of namespaces and key programming areas affected by their use. In particular, we’ll look at the namespace impact on XPath because it is central to both DOM and XSL processing.

The following XML sample has no namespace defined.

<List name="Fruit List">
   <Item>Apple</Item>
   <Item>Banana</Item>
</List>

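Here is the same sample with a default namespace set. All the elements in this document are in the same namespace. A default namespace applies only to elements: an unprefixed attribute such as name is not placed in the default namespace and belongs to no namespace.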

<List xmlns="http://liquidhub.com/SimpleList" name="Fruit List">
   <Item>Apple</Item>
   <Item>Banana</Item>
</List>
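
To see what the parser makes of a default namespace, the following minimal sketch loads the sample with the .NET XmlDocument class and reports the namespace URI of each element. The class name DefaultNamespaceSketch and its helper method are illustrative names, not part of any library.

```csharp
using System;
using System.Xml;

public static class DefaultNamespaceSketch
{
    // The sample from the text, held in a string for convenience.
    public const string Xml =
        "<List xmlns=\"http://liquidhub.com/SimpleList\" name=\"Fruit List\">" +
        "<Item>Apple</Item><Item>Banana</Item></List>";

    // Returns the namespace URI the parser reports for the first Item element.
    public static string FirstItemNamespace()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc.DocumentElement.FirstChild.NamespaceURI;
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        // The root element and every child element report the same namespace URI,
        // even though only the root carries the xmlns attribute.
        Console.WriteLine(doc.DocumentElement.LocalName + " -> " + doc.DocumentElement.NamespaceURI);
        foreach (XmlNode child in doc.DocumentElement.ChildNodes)
        {
            Console.WriteLine(child.LocalName + " -> " + child.NamespaceURI);
        }
    }
}
```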

IRI, URI, We’re All Al-RI-ght for Namespaces

The value of the xmlns namespace attribute is a uniform resource identifier (URI). The URI should be unique and persistent for all time. A uniform resource locator (URL) is often used as the base for namespace URIs because the domain name registration system provides a recognized central authority for unique name ownership. A URL is a URI that identifies a resource by its location, typically through HTTP or another location-dependent Internet protocol.

The Resource Directory Description Language (RDDL) is a standard way of describing information about a resource. It’s based on XHTML and XLink and is ideally suited for providing additional information pertaining to an XML Namespace. RDDL is also a great example of building an XML application from standard XML components.

For namespaces, the location aspect of URIs is not significant. No document has to reside at the location specified—though sometimes a DTD, XML Schema or other information can be found at the location. It’s not bad practice to serve up some helpful documentation or links from the URL associated with a namespace.
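
Because the namespace URI is only a name, a parser never fetches it, and two namespace names match only when they are identical strings. The sketch below illustrates this with XmlDocument.GetElementsByTagName; the class name and the lower-cased URI are made up for the demonstration.

```csharp
using System;
using System.Xml;

public static class NamespaceNameSketch
{
    public const string Xml =
        "<List xmlns=\"http://liquidhub.com/SimpleList\">" +
        "<Item>Apple</Item><Item>Banana</Item></List>";

    // Counts Item elements that belong to the given namespace name.
    public static int CountItems(string namespaceUri)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml); // parsing does not dereference the namespace URI
        return doc.GetElementsByTagName("Item", namespaceUri).Count;
    }

    public static void Main()
    {
        // Exact match on the namespace name.
        Console.WriteLine(CountItems("http://liquidhub.com/SimpleList")); // 2

        // Same "location" to a web server, perhaps, but a different string,
        // so it names a different namespace.
        Console.WriteLine(CountItems("http://liquidhub.com/simplelist")); // 0
    }
}
```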

A uniform resource name (URN) is also commonly used for namespace URIs. URN was developed as a more constrained form of URI that you can use to define name-based, location-independent identifiers. Some examples of URN use are the ISBN for books (urn:isbn:0-345-39180-2), ISSN for serial publications (urn:issn:1075-2838), and the IETF requests for comments documents (urn:ietf:rfc:3986).

The internationalized resource identifier (IRI) is simply the evolution of the URI into an international character set. URIs only allow a subset of US-ASCII characters in their names. IRIs expand URIs to allow characters from the universal character set (Unicode/ISO 10646), of which UTF-8 is one encoding. Recently updated XML specifications favor IRIs over URIs.

IETF RFC 3986 is a good place to start finding out more about URIs. IRIs are specified in IETF RFC 3987. Tim Berners-Lee wrote an excellent article on the design of URIs called “Cool URIs Don’t Change”, which is definitely worth a quick read.

Namespace Prefixes

Prefixes are simply shorthand for the full namespace URI. The following sample demonstrates a namespace prefix lh used for the namespace declaration and also prefixes all the element names in the “http://liquidhub.com/SimpleList” namespace with the lh: prefix:

<lh:List xmlns:lh="http://liquidhub.com/SimpleList"
   name="Fruit List">
   <lh:Item>Apple</lh:Item>
   <lh:Item>Banana</lh:Item>
</lh:List>

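Every Item element in this sample must carry the lh: prefix, because the prefix on the List element does not cascade to the contained elements; an unprefixed Item here would belong to no namespace. What does cascade is the namespace declaration: the xmlns:lh binding declared on List is in scope for all of its descendants, so each element can use the prefix without repeating the declaration attribute. A default namespace declaration goes one step further and places the declaring element and all of its unprefixed descendant elements in the namespace. Choosing to prefix all elements or to rely on a cascading default namespace is a style decision; both methods accomplish the same result. The web log sample below gives some insight into namespace prefix usage tradeoffs.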

Most namespaces have a generally accepted prefix associated with them (xsl: for style sheet transforms, xlink: for the XML Linking Language), but it is not necessary to always use the same prefix. The URI must remain constant, but the same document with pig: as the prefix instead of lh: would carry the same namespace information.
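
A quick way to convince yourself that the prefix carries no meaning is to compare what the parser reports for the same element under two prefixes. This is a minimal sketch; the class name PrefixSketch and the Describe helper are illustrative only.

```csharp
using System;
using System.Xml;

public static class PrefixSketch
{
    // Returns the root element's expanded name: {namespace URI}local name.
    public static string Describe(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlElement root = doc.DocumentElement;
        return "{" + root.NamespaceURI + "}" + root.LocalName;
    }

    public static void Main()
    {
        string withLh = "<lh:List xmlns:lh=\"http://liquidhub.com/SimpleList\"/>";
        string withPig = "<pig:List xmlns:pig=\"http://liquidhub.com/SimpleList\"/>";

        Console.WriteLine(Describe(withLh));
        Console.WriteLine(Describe(withPig));

        // The prefixes differ, but the namespace URI and local name are identical.
        Console.WriteLine(Describe(withLh) == Describe(withPig)); // True
    }
}
```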

Namespaces in Use

If you were designing an XML application to store a web log, parts of your markup would be dedicated to metadata for each web log post and parts for the post content itself. Metadata elements like Author, Category and PostedDate would likely belong in the same namespace. For the post content of each web log entry you may choose to leverage XHTML formatting elements from the XHTML namespace within your markup. Namespaces enable multiple markup languages to be assembled into new markup languages without name collisions. Namespaces enable modular XML.

The following sample shows elements from two namespaces combined in our hypothetical web log XML format:

<blog:Blog xmlns:blog="http://liquidhub.com/Blog">
   <blog:Entry>
         <blog:Author>Sam Page</blog:Author>
         <blog:Category>General</blog:Category>
         <blog:PostedDate>20050301</blog:PostedDate>
         <body xmlns="http://www.w3.org/1999/xhtml">
               <p>Today is a <b>good</b> day.</p>
               <ul>
                     <li>Read a book</li>
                     <li>Listen to music</li>
               </ul>
         </body>
   </blog:Entry>
</blog:Blog>

The sample avoids prefixing every XHTML element within each post by declaring XHTML as the default namespace on the body element. The default namespace declaration cascades to the unprefixed elements inside body, so p, b, ul and li all belong to the XHTML namespace. Because other namespaces are not going to be mixed into the XHTML content, cascading is a reasonable approach.

*Dublin Core Metadata
The Dublin Core metadata element set is a common set of elements for providing bibliographical information about a document. Title, Creator, Subject, Date, and Language are some of the Dublin Core metadata element names. Without namespaces, you can see how Dublin Core would be a lot less useful because of inevitable name collisions with common element names like Title. Dublin Core demonstrates how namespaces enable modular XML.

All the elements in the blog namespace are prefixed in the sample because it’s clearer when mixing namespaces among child elements. In this case, it’s not entirely necessary to prefix, but if we were to add additional namespaces, say Dublin Core* metadata elements, the clarity advantage of prefixing would be more apparent.

Another reason to favor prefixes is that all the blog elements, whether explicitly prefixed or covered by a cascading default namespace declaration, belong to the blog namespace and must therefore be referenced with the namespace in code. The prefix in the XML document serves as a reminder to include the namespace in code references.
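
As a rough illustration of referencing both namespaces in code, the sketch below queries a condensed version of the web log document in which the XHTML content sits under a body element that declares XHTML as its default namespace. It uses the XmlNamespaceManager class covered in the next section; the class name BlogSketch and the prefix x are arbitrary choices made for this example.

```csharp
using System;
using System.Xml;

public static class BlogSketch
{
    // Condensed web log document: blog elements are prefixed,
    // XHTML elements rely on a default namespace declared on body.
    public const string Xml =
        "<blog:Blog xmlns:blog=\"http://liquidhub.com/Blog\">" +
        "<blog:Entry>" +
        "<blog:Author>Sam Page</blog:Author>" +
        "<body xmlns=\"http://www.w3.org/1999/xhtml\">" +
        "<ul><li>Read a book</li><li>Listen to music</li></ul>" +
        "</body>" +
        "</blog:Entry>" +
        "</blog:Blog>";

    // Runs an XPath query with both namespaces mapped to prefixes.
    public static XmlNodeList Select(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("blog", "http://liquidhub.com/Blog");
        // The document uses no prefix for XHTML, but XPath still needs one.
        nsmgr.AddNamespace("x", "http://www.w3.org/1999/xhtml");

        return doc.SelectNodes(xpath, nsmgr);
    }

    public static void Main()
    {
        Console.WriteLine(Select("//blog:Author")[0].InnerText);      // Sam Page
        Console.WriteLine(Select("//blog:Entry/x:body//x:li").Count); // 2
    }
}
```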

Namespaces and XPath

XPath is the basis for querying and manipulating XML DOM trees and also the pattern language used in XSL transforms. Because these technologies are fundamental to working with XML documents, every developer should be familiar with them.

Namespaces add some minor complications to how XPath expressions work because of the many valid ways that namespaces may be declared or prefixed. All XPath implementations that support namespaces have a method for managing namespaces similar to the one described below for the Microsoft .NET platform.

XPath Namespace Handling in Microsoft .NET

Every DOM and XPath implementation requires a method to communicate the namespace URI and prefix mapping for XPath expressions. In Microsoft’s .NET implementation, the XmlNamespaceManager provides this mapping. The following two samples show typical namespace uses in .NET.

1 |// using XmlNode.SelectNodes method with XmlDocument
2 |XmlNamespaceManager nsmgr =
3 |   new XmlNamespaceManager( doc.NameTable );
4 |nsmgr.AddNamespace( "lh", "http://liquidhub.com/SimpleList" );
5 |XmlNodeList nodeList = doc.SelectNodes( "//lh:Item", nsmgr );

Note that at the end of line five, the namespace prefix lh: is used in the XPath expression “//lh:Item”. This only works because we associated the prefix lh: with the namespace URI in line four. The source document could have used a default namespace or a different prefix than the one used in our code, but we avoid having to code for the many possible legal namespace declarations by mapping a single prefix that will be consistently used in our code. This leaves source documents free to use whatever method they want to declare the namespaces.
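
To make that point concrete, here is a self-contained sketch that runs the same XPath expression against three documents that declare the namespace in different ways. The class name SelectNodesSketch and the CountItems helper are illustrative, and the three document strings are simply variations on the fruit list sample.

```csharp
using System;
using System.Xml;

public static class SelectNodesSketch
{
    // Counts Item elements in the SimpleList namespace,
    // always using the prefix lh in the XPath expression.
    public static int CountItems(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("lh", "http://liquidhub.com/SimpleList");

        return doc.SelectNodes("//lh:Item", nsmgr).Count;
    }

    public static void Main()
    {
        string ns = "http://liquidhub.com/SimpleList";

        string defaultNs =
            "<List xmlns=\"" + ns + "\"><Item>Apple</Item><Item>Banana</Item></List>";
        string lhPrefix =
            "<lh:List xmlns:lh=\"" + ns + "\"><lh:Item>Apple</lh:Item><lh:Item>Banana</lh:Item></lh:List>";
        string pigPrefix =
            "<pig:List xmlns:pig=\"" + ns + "\"><pig:Item>Apple</pig:Item><pig:Item>Banana</pig:Item></pig:List>";

        // The prefix in the XPath is resolved through nsmgr, not through the document,
        // so all three documents give the same answer.
        Console.WriteLine(CountItems(defaultNs)); // 2
        Console.WriteLine(CountItems(lhPrefix));  // 2
        Console.WriteLine(CountItems(pigPrefix)); // 2
    }
}
```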

The next code sample accomplishes the same prefix mapping with the more granular interfaces in the Microsoft .NET XML services.

// using XPathNavigator and XPathExpression
XmlNamespaceManager nsmgr =
   new XmlNamespaceManager( nav.NameTable );
nsmgr.AddNamespace( "lh", "http://liquidhub.com/SimpleList" );
XPathExpression expr;
expr = nav.Compile( "//lh:Item" );
expr.SetContext( nsmgr );
XPathNodeIterator iterator = nav.Select( expr );
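
The fragment above assumes a navigator named nav already exists. One way to fill in the surrounding code, sketched here under the assumption that the navigator is created from an XmlDocument, is to walk the iterator and collect the text of each match. The class name NavigatorSketch and the ItemValues helper are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class NavigatorSketch
{
    // Returns the text of every Item element in the SimpleList namespace.
    public static List<string> ItemValues(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XPathNavigator nav = doc.CreateNavigator();

        XmlNamespaceManager nsmgr = new XmlNamespaceManager(nav.NameTable);
        nsmgr.AddNamespace("lh", "http://liquidhub.com/SimpleList");

        // Compile once, then attach the prefix mapping to the expression.
        XPathExpression expr = nav.Compile("//lh:Item");
        expr.SetContext(nsmgr);

        List<string> values = new List<string>();
        XPathNodeIterator iterator = nav.Select(expr);
        while (iterator.MoveNext())
        {
            values.Add(iterator.Current.Value);
        }
        return values;
    }

    public static void Main()
    {
        string xml =
            "<List xmlns=\"http://liquidhub.com/SimpleList\">" +
            "<Item>Apple</Item><Item>Banana</Item></List>";

        Console.WriteLine(string.Join(", ", ItemValues(xml))); // Apple, Banana
    }
}
```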

Because the .NET XML classes provide versions of the SelectSingleNode and SelectNodes methods that don’t require a namespace manager, it’s easy to write XPath expressions and code that are not namespace aware. An unprefixed name in an XPath expression matches only elements that belong to no namespace, so an expression such as “//Item” silently stops finding anything once the source document declares a namespace. If you don’t account for namespaces in your XPath expressions from the start, you’ll be faced with the painful task of re-coding to use the namespace manager and prefixes in all of your XPath expressions later.
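
The failure mode is easy to reproduce. In this minimal sketch the same document is queried twice, once with a namespace-unaware expression and once with a namespace manager; the class and method names are made up for the example.

```csharp
using System;
using System.Xml;

public static class UnawareXPathSketch
{
    public const string Xml =
        "<List xmlns=\"http://liquidhub.com/SimpleList\">" +
        "<Item>Apple</Item><Item>Banana</Item></List>";

    // Namespace-unaware query: the unprefixed name Item only
    // matches elements that are in no namespace.
    public static int CountWithoutManager()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc.SelectNodes("//Item").Count;
    }

    // Namespace-aware query: the prefix lh is mapped to the document's namespace.
    public static int CountWithManager()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("lh", "http://liquidhub.com/SimpleList");
        return doc.SelectNodes("//lh:Item", nsmgr).Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithoutManager()); // 0
        Console.WriteLine(CountWithManager());    // 2
    }
}
```

No exception is thrown in the first case; the query simply returns an empty list, which is why this kind of bug tends to surface late.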

Namespaces can no doubt look ugly in an XML document. Many texts teaching basic XML leave namespaces out of examples to avoid complicating things. From a programming perspective, it’s better to account for namespaces from the beginning of a project to avoid messy problems later. Though you can get pretty far without dealing with namespaces, not being comfortable with namespaces is a distinct disadvantage. Namespaces are easy, so get on board!

References

Namespaces in XML 1.1
http://www.w3.org/TR/xml-names11
Uniform Resource Identifier (URI): Generic Syntax
http://www.ietf.org/rfc/rfc3986.txt
Internationalized Resource Identifiers (IRI)
http://www.ietf.org/rfc/rfc3987.txt
Hypertext Style: Cool URIs Don’t Change
http://www.w3.org/Provider/Style/URI.html
Dublin Core Metadata Elements Set, Version 1.1
http://dublincore.org/documents/dces
Resource Directory Description Language (RDDL)
http://www.rddl.org/