
XSL Code Generation

Code generation is more than just a technique to save you time with tedious programming tasks. Code generation is a way to sharpen your software design and abstraction skills and to capture programming know-how into a reusable toolset. Using XML and XSL together forms an ideal platform for code generation. Data and metadata captured in an XML structure drive XSL code generation that’s flexible enough for just about any need.

In an area like web development, where so many technologies come together to form a single solution, code generation provides a foundation for building deep solutions that cover much of the detail work effortlessly. For example, you can build a nice reusable JavaScript/HTML/CSS menu tool with code generation. A menu XML file holds just the content describing the menu. The XSL code produces the HTML and the necessary script and style hooks for the JavaScript and CSS code. Only the menu XML file changes with each project; the code generation and supporting files stay the same. In effect, you have created a ‘menu language’ with an accompanying code-generating interpreter.
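As a rough sketch of the menu idea, the style sheet below turns a small menu document (shown in the leading comment) into a nested HTML list. The element names menu and item, the label and href attributes, and the CSS class names are all assumptions made for illustration; they are not taken from any real tool.

```xml
<?xml version="1.0"?>
<!--
  Hypothetical input (menu.xml):
    <menu id="main">
      <item label="Home" href="/"/>
      <item label="Projects" href="/projects/">
        <item label="Tools" href="/projects/tools/"/>
      </item>
    </menu>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- The menu root becomes a list with id and class hooks for CSS and script. -->
  <xsl:template match="menu">
    <ul id="{@id}" class="menu">
      <xsl:apply-templates select="item"/>
    </ul>
  </xsl:template>

  <!-- Each item becomes a list entry; nested items become a nested list. -->
  <xsl:template match="item">
    <li class="menu-item">
      <a href="{@href}">
        <xsl:value-of select="@label"/>
      </a>
      <xsl:if test="item">
        <ul class="submenu">
          <xsl:apply-templates select="item"/>
        </ul>
      </xsl:if>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

Only the menu document would change from project to project; the style sheet and the CSS and JavaScript that key off the class names could stay the same.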

Code generation frees you to focus on a particular area of an implementation. For example, once you have a menu language, you can focus on making the menu’s appearance more easily adjustable. Perhaps by creating several style sheets, each with a different look-and-feel, you expand your menu language’s capabilities and therefore can bring more craft to a solution quickly. When you improve part of the implementation, capture the enhancements into your code generation process and strengthen your toolset.

Coping with new technology and switching development environments from one project to the next is another area where a code generation approach helps developers considerably. If you have a code generation solution for form validation in ASP and need one for Cold Fusion, extend your ASP solution to produce Cold Fusion code. Server-side form validation involves nothing that can’t easily be implemented on both platforms. By solving for both platforms from a common base, your code generation toolkit becomes that much more powerful.

Take this process a few steps further. Build a code generation solution that abstracts out the portions common to server-side form validation and encapsulates them neatly. This allows you to focus on just the differences particular to each target platform. Now adding JSP or PHP, each likely very similar to the other two solutions, is just a matter of adding the code particular to each framework. This kind of abstraction could equally be applied to generating database code for Oracle, SQL Server, and DB2 or to generating classes for .NET or Java.
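One way to picture this split, as a minimal sketch: the style sheet below keeps the platform-neutral walk over the form description in one template and isolates the platform-specific code in a named template selected by a parameter. The form and field elements, the required attribute, the target parameter, and the generated variable names are hypothetical.

```xml
<?xml version="1.0"?>
<!--
  Hypothetical input (form.xml):
    <form name="signup">
      <field name="email" required="true"/>
      <field name="nickname"/>
    </form>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumed parameter: which platform to generate for ('asp' or 'php'). -->
  <xsl:param name="target" select="'asp'"/>

  <!-- Common part: visit every required field, whatever the platform. -->
  <xsl:template match="/form">
    <xsl:for-each select="field[@required = 'true']">
      <xsl:call-template name="required-check">
        <xsl:with-param name="field" select="@name"/>
      </xsl:call-template>
    </xsl:for-each>
  </xsl:template>

  <!-- Platform-specific part: one branch per target. -->
  <xsl:template name="required-check">
    <xsl:param name="field"/>
    <xsl:choose>
      <xsl:when test="$target = 'php'">
        <xsl:text>if (empty($_POST['</xsl:text>
        <xsl:value-of select="$field"/>
        <xsl:text>'])) { $errors[] = '</xsl:text>
        <xsl:value-of select="$field"/>
        <xsl:text> is required'; }&#10;</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>If Request.Form("</xsl:text>
        <xsl:value-of select="$field"/>
        <xsl:text>") = "" Then errors = errors &amp; "</xsl:text>
        <xsl:value-of select="$field"/>
        <xsl:text> is required" &amp; vbCrLf&#10;</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Supporting another platform would then mean adding one more branch (or, in a larger toolset, one more imported style sheet) while the common template stays untouched.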

Incorporating new platforms and languages into your code generation toolset also provides a targeted goal for learning. It always helps to have a project to immerse yourself in while learning something new. Adapting your code generation solution to a new technology provides an interesting context for learning.

Domain Specific Languages

Create a framework for solving a problem first, and then solve the problem within that framework. Build libraries of code that raise the expressive power of your programming language, and then complex tasks become easy. When you create a domain specific language, you are creating a highly specialized way of talking about the solution to a class of related problems. The most powerful forms of code generation start with domain specific languages.

One way of thinking about XML is that XML is a language for creating simple languages. The languages you create with XML are not programming languages but rather domain specific languages. XML is not just a text file with angle brackets! The XML core technologies bring all the tools you need to load, parse, validate, and navigate an in-memory representation of your XML-based domain specific language. XSL adds a powerful transformation language for producing any text format you can imagine. This is all the machinery you need to develop custom code generation tools.

Example: A Pipeline Language

I use XSL pipeline processing in a lot of code generation solutions. I realized it would be helpful to have a simple language for specifying the transforms involved in the pipeline and to build a program that executes such a pipeline. (This is the XslPipe project that will appear on this site in the near future.)

Here is an example of the pipeline language I came up with:

<?xml version="1.0"?>
<pipeline xmlns="">
   <transform src="table.xsl"/>
   <transform src="sort.xsl">
      <param name="order">desc</param>
   </transform>
   <transform src="page.xsl">
      <param name="pageNum">1</param>
      <param name="pageSize">25</param>
   </transform>
   <transform src="htmlTable.xsl">
      <output filename="table.html"/>
   </transform>
</pipeline>

This pipeline produces a sorted and paged HTML table from a dataset provided in XML. Simple pipelines are just a few lines of code in this pipeline language. More complex pipelines with branches and multiple outputs are also possible. This language has fewer than a dozen elements and attributes, yet it provides a useful code generation tool.

For the implementation of this pipeline language, I use a combination of XML Schema and XSL to validate the pipeline documents and a DOM tree to hold them in memory. With a simple walk through the pipeline using the DOM API, loading style sheets and executing transforms along the way, I had a working utility. Once you’ve built the framework for a utility like XslPipe, it’s easy to adapt the approach to other mini languages.
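The XSL half of that validation might look something like the sketch below, which reports a few structural problems in a pipeline document as plain text. The specific checks and messages are illustrative assumptions, not the actual XslPipe rules, and the sketch assumes the pipeline elements are in no namespace.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Check the root element, then visit each transform in pipeline order. -->
  <xsl:template match="/">
    <xsl:if test="not(pipeline)">
      <xsl:text>error: the root element must be pipeline&#10;</xsl:text>
    </xsl:if>
    <xsl:apply-templates select="pipeline/transform"/>
  </xsl:template>

  <xsl:template match="transform">
    <!-- Position of this transform among the transforms selected above. -->
    <xsl:variable name="step" select="position()"/>

    <!-- normalize-space() of a missing attribute is the empty string,
         so this test catches both absent and blank src attributes. -->
    <xsl:if test="normalize-space(@src) = ''">
      <xsl:text>error: transform </xsl:text>
      <xsl:value-of select="$step"/>
      <xsl:text> has no src attribute&#10;</xsl:text>
    </xsl:if>

    <!-- Every param needs a name to be passed to the style sheet. -->
    <xsl:for-each select="param[normalize-space(@name) = '']">
      <xsl:text>error: transform </xsl:text>
      <xsl:value-of select="$step"/>
      <xsl:text> has a param without a name&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

An empty result means none of these checks failed. Rules that involve relationships between elements are often easier to express this way than in a schema.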

When designing a utility language, you typically begin by designing the XML. It’s best to start with several sample XML instances. Gradually you develop the validation code while you flesh out your XML data model. Once you’re mostly satisfied with your language, begin your implementation. If your implementation consists of running an input document through a series of XSL transforms to produce code, you’ll be done pretty quickly. This is a powerful programming pattern.


The remainder of this essay covers a few useful XSL code generation techniques.

Controlling Space

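In code generation applications in particular, being able to control white space is often critical to the generated code. The easiest way to handle white space in XSL is to avoid mixed content in your templates. The XSL processor strips text nodes that contain only white space from a style sheet, but when a text node contains any other character, as happens with mixed content, all of its white space is copied to the output.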

Here is an XSL template with mixed content:

1 |<!-- Template with mixed content -->
2 |<xsl:template match="Course">
3 |   <a>
4 |         <xsl:attribute name="href">
5 |               ../Course.aspx?courseID=
6 |               <xsl:value-of select="@id"/>
7 |         </xsl:attribute>
8 |         <xsl:value-of select="@name"/>
9 |   </a>
10|</xsl:template>

The problem with this template is that the mixed content on line five causes the XSL processor to include in the output all the white space from the end of line four to the xsl:value-of tag at the beginning of line six. In this case, it would create an HTML href attribute that is likely invalid, with tabs and new lines in the middle.

There are two ways to fix this problem: collapse all of the code on lines four through seven into one line, or use the xsl:text element to avoid mixed content. The first method is messy and would quickly become difficult to maintain. The second is clear and produces the correct output:

5 |               <xsl:text>../Course.aspx?courseID=</xsl:text>

Use xsl:text to avoid mixed content and to control white space, especially when you’re creating a text format output from your transform.
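For text output, the same guideline means that every literal character, including each line break and indent, sits inside an xsl:text element. The following is a minimal sketch; the class and field elements and their attributes are assumptions for illustration.

```xml
<?xml version="1.0"?>
<!--
  Hypothetical input:
    <class name="Course">
      <field name="id" type="int"/>
      <field name="name" type="String"/>
    </class>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="class">
    <xsl:text>public class </xsl:text>
    <xsl:value-of select="@name"/>
    <!-- &#10; is an explicit line break in the generated code. -->
    <xsl:text> {&#10;</xsl:text>
    <xsl:apply-templates select="field"/>
    <xsl:text>}&#10;</xsl:text>
  </xsl:template>

  <xsl:template match="field">
    <!-- The four-space indent is part of the literal text, not of the template layout. -->
    <xsl:text>    private </xsl:text>
    <xsl:value-of select="@type"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>;&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Because the templates contain no mixed content, the style sheet itself can be indented freely without changing the generated code. For the sample input, the output is a four-line class declaration with two private fields.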

Coming up with clear and brief samples is tougher than it looks!

By the way, a third (and even better) way to fix the code above is to use XSL’s curly brace attribute expansion:

3 |   <a href="../Course.aspx?courseID={@id}">
4 |         <xsl:value-of select="@name"/>
5 |   </a>

Avoiding mixed content is the best guideline for controlling white space during code generation. The XSL preserve-space and strip-space elements can also be used to control white space, but they are often too coarse-grained to be useful because they are global to the style sheet. I’ve never needed to use either of them for code generation.
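For reference, here is a small sketch of how the two elements are declared. They appear at the top level of the style sheet and apply to white-space-only text nodes in the source document, not to the templates. The choice of the pre element is an assumption for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Drop white-space-only text nodes from every element of the source document... -->
  <xsl:strip-space elements="*"/>
  <!-- ...except inside pre elements, where layout is significant. -->
  <xsl:preserve-space elements="pre"/>

  <!-- Identity transform: copy everything that remains. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```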


Decorating XML

Sometimes you need to add a little information to an XML document for driving your code generation. A little pre-processing of the XML, adding some information here and there, can simplify your code generation process considerably. Because code generation is a complex process, it’s a candidate for using an XSL pipeline processing approach. When you add information to your XML, you are decorating the XML. XSL identity transforms form the basis for easily implementing XML decoration.

In the following example, a new child BaseName element is created on each Person element in the output XML as part of an identity transform (template not shown):

<!-- Decorate Person elements with BaseName elements -->
<xsl:template match="Person">
   <xsl:copy>
      <!-- Copy attributes first; they cannot be added after a child element -->
      <xsl:apply-templates select="@*" />
      <BaseName>
         <xsl:value-of select="concat(
            translate(LastName,' ','_'),
            translate(FirstName,' ','_'))"/>
      </BaseName>
      <xsl:apply-templates select="node()" />
   </xsl:copy>
</xsl:template>

The BaseName element could be used to create file names for intermediate files produced during the code generation process or for generated class or variable names. Keeping this kind of pre-processing work out of your main code generation transform helps make the code generation easier to maintain.

Because decoration is such a useful pattern for code generation, let’s take this example a bit further by including namespaces. In practice, it’s useful to keep our XML decorations in their own namespace:

1 |<!-- Full decoration sample with namespace -->
2 |<xsl:stylesheet version="1.0"
3 |   xmlns:xsl=""
4 |   xmlns:lh="">
5 |
6 |<!-- IdentityTransform -->
7 |<xsl:template match="/ | @* | node()">
8 |   <xsl:copy>
9 |         <xsl:apply-templates select="@* | node()" />
10|   </xsl:copy>
11|</xsl:template>
12|<!-- Decorate Person elements with BaseName elements (attributes are copied first) -->
13|<xsl:template match="Person">
14|   <xsl:copy>
15|         <xsl:apply-templates select="@*" />
16|         <xsl:element name="lh:BaseName"
17|               namespace="">
18|               <xsl:value-of select="concat(
19|                     translate(LastName,' ','_'),
20|                     translate(FirstName,' ','_'))"/>
21|         </xsl:element>
22|         <xsl:apply-templates select="node()" />
23|   </xsl:copy>
24|</xsl:template>
25|</xsl:stylesheet>

The key adjustments for namespaces are simple. First, the decoration namespace is declared in the style sheet with the prefix lh on line four. Then, where we previously used a literal BaseName element, we use the xsl:element instruction with a namespace attribute (line 16). Note that the prefix is included in the name attribute.

By keeping your decorations in their own namespace, you can more easily write transformations that act on specific decorations tied to any XML. Such transforms might be useful across many code generation scenarios.
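As one example of such a transform, the sketch below removes every decoration from a document, which could serve as a clean-up step at the end of a pipeline. The namespace URI urn:example:decorations is a placeholder assumption; substitute whatever URI your decorations actually use.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:lh="urn:example:decorations">

  <!-- Identity transform: copy everything by default. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element or attribute in the decoration namespace produces no output. -->
  <xsl:template match="lh:* | @lh:*"/>

</xsl:stylesheet>
```

Because the empty template matches on the namespace rather than on particular element names, it works on any decorated document regardless of its vocabulary.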

Decorations can be helpful whether they’re automatic like the one above or manually added by the developer with a text editor. Adding documentation-producing decorations to an XML instance is a useful practice. Sometimes manual decoration is needed to hint a code generation process to produce different code under certain circumstances.


You don’t want my code generation tools, though I’d gladly share them with you. You must author your own code generation tools to get the most benefit. Code generation tools capture your knowledge, your approach, and your insight. Your toolset is always meant to be a work in progress that you keep extending and refactoring as you learn from experience.

I hope the XSL pipeline approach presented in this essay inspires you to give code generation with XML and XSL a try. I continue to experience a lot of success in my consulting engagements following these techniques, and I believe you will too.