Wednesday, August 17, 2005

XML and XSL Reuse: Leveraging XML XInclude with Xerces and Xalan

XInclude is a W3C Recommendation. It essentially provides an alternative to DTD external entity references. A good overview of the differences is provided on Elliotte Rusty Harold's blog. The most appealing reason for using XInclude is that included documents are themselves well-formed XML documents that can be processed individually.

Let's take a simple example. Assume that we have a ContactInfo.xml document:
<?xml version="1.0" encoding="UTF-8"?>
<hrxml:ContactInfo xml:lang="EN"
xmlns:hrxml="http://ns.hr-xml.org/2004-08-02"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="../schemas/hrxmlResume-2.3.xsd"
xmlns:xi="http://www.w3.org/2001/XInclude">

<xi:include href="contactinfo/PersonName.xml"/>
</hrxml:ContactInfo>
which includes a PersonName.xml:
<?xml version="1.0" encoding="UTF-8"?>
<hrxml:PersonName xml:lang="EN"
xmlns:hrxml="http://ns.hr-xml.org/2004-08-02"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="../../schemas/hrxmlResume-2.3.xsd"
xmlns:xi="http://www.w3.org/2001/XInclude">

<hrxml:GivenName>David</hrxml:GivenName>
<hrxml:FamilyName>Le Strat</hrxml:FamilyName>
</hrxml:PersonName>
Each document is fully well-formed and can be transformed individually. In this example, we can create an htmlPersonName.xslt stylesheet to format a person name:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:hrxml="http://ns.hr-xml.org/2004-08-02"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<xsl:template match="hrxml:PersonName">
<xsl:value-of select="hrxml:GivenName" />
<xsl:text> </xsl:text>
<xsl:value-of select="hrxml:FamilyName" />
</xsl:template>
</xsl:stylesheet>
Leveraging Xalan (2.7.0), we can apply the stylesheet to the well-formed PersonName.xml document.

To do so, we can use the Xalan command-line utility. In this particular example, we invoke the utility through an Ant script as described below:
<java classname="org.apache.xalan.xslt.Process" fork="true" dir="." >
<jvmarg value="-Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl"/>
<jvmarg value="-Djavax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl"/>
<jvmarg value="-Dorg.apache.xerces.xni.parser.XMLParserConfiguration=org.apache.xerces.parsers.XIncludeParserConfiguration"/>
<arg value="-IN"/>
<arg value="${home.dir}/components/contactinfo/${xml.name}.xml"/>
<arg value="-XSL"/>
<arg value="${home.dir}/styles/html${xml.name}.xslt"/>
<arg value="-OUT"/>
<arg value="${home.dir}/output/${xml.name}.html"/>
<arg value="-HTML"/>
<classpath>
...Your classpath...
</classpath>
</java>
The jvmarg elements are of particular interest, as they enable XInclude processing in the Xerces parser.
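
XInclude processing can also be switched on programmatically rather than through system properties. The following is a minimal sketch, assuming a JAXP 1.3 or later runtime (Java 5 and up), where DocumentBuilderFactory.setXIncludeAware is available. The class name XIncludeTransform and its transform method are hypothetical and are not part of the Ant setup above.

```java
import java.io.File;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// Hypothetical helper: parse a document with XInclude resolution, then apply a stylesheet.
public class XIncludeTransform {

    public static String transform(File xmlFile, File xsltFile) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XInclude processing requires namespace awareness
        factory.setXIncludeAware(true);  // resolve xi:include elements while parsing

        DocumentBuilder builder = factory.newDocumentBuilder();
        // Parsing from a File keeps the base URI, so relative hrefs resolve correctly.
        Document document = builder.parse(xmlFile);

        // StreamSource(File) sets the system ID, so xsl:include hrefs resolve
        // relative to the stylesheet location.
        Transformer transformer =
                TransformerFactory.newInstance().newTransformer(new StreamSource(xsltFile));

        StringWriter output = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(output));
        return output.toString();
    }
}
```

With this approach the include is already expanded by the time the stylesheet runs, so the templates only ever see the merged tree. Which parser and transformer implementations are picked up depends on the classpath and on the JAXP factory lookup.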

In addition, to render ContactInfo.xml, we can leverage XSL include as follows:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:hrxml="http://ns.hr-xml.org/2004-08-02"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<xsl:include href="htmlPersonName.xslt" />

<xsl:template match="hrxml:ContactInfo">
<xsl:apply-templates select="hrxml:PersonName" />
</xsl:template>
</xsl:stylesheet>
And voila, we have achieved a high level of reusability of both presentation and data components!

Thursday, August 04, 2005

Cost benefits of Unit Testing

Many definitions are available for unit testing but, in simple terms, a unit test is a method for testing the correctness of a particular module of source code. Sounds like something most software development organizations would want to do... In the real world, however, the issue of cost benefits often comes up. It is indeed very hard to talk about the benefits of unit testing without knowing the answer to the following question [1], [2]: How many defects did unit tests avoid, how much time was saved, and how much time and how many defects will be saved in the future?
Over the next few weeks, I will blog on the cost benefits of unit testing and how to quantify such benefits. Commonly quoted benefits of unit testing are [3], [4], [5], [6]:
  1. Problems are found early in the development cycle.
  2. Code that works now will work in the future.
  3. New features will not break existing functionality.
  4. Making changes becomes easier, as controls are in place.
  5. The development process becomes more flexible.
  6. Implementation design is improved as APIs are forced to be more flexible and unit-testable.
  7. Bringing new developers on board becomes easier and teamwork improves. Unit tests document the code.
  8. The need for manual testing is reduced.
  9. The development process becomes more predictable and repeatable.
However, quantifying such benefits is often challenging for organizations. Still, it is widely accepted that the cost of fixing bugs or changing software increases exponentially the later issues are uncovered in the software life cycle. In a development process where identifying bugs is the responsibility of the quality assurance (QA) group, the QA group risks running into "bug indigestion". Unit testing is critical to prevent unit-level issues from being uncovered late in the software development life cycle. Software methodologies that rely heavily on developer unit-level testing can therefore achieve a much lower cost of ownership throughout the software development life cycle.
Cost Of Change Curves

Enforcing unit-level testing throughout the development process should be a major focus of any management team involved in managing the delivery of a software product or solution. Identifying the optimal amount of unit-level testing, the amount that maximizes the benefits of writing unit tests, is hard. Best practices suggest that the ratio of test code to code under test required to achieve at least 90% code coverage is between 2/1 and 4/1. This means that thoroughly testing a 100-line Java class requires 200 to 400 lines of test code. The higher the unit-level test coverage, the better the quality. This is a best-case scenario, however, and does not necessarily maximize the return on writing unit tests. My experience suggests that the benefits of unit testing can be achieved much sooner, but also that there seems to be a threshold above which the benefits of unit testing are most felt.
Let's take as an example one of the projects I recently completed. As illustrated below, the initial phases of QA resulted in a flood of bugs. At the same time, the level of unit testing was insufficient. I would draw a first lesson from this observation:
To maximize the cost benefits of unit testing, start writing tests early. Focus on "quality" assertions, where assertions are performed against relevant data.
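
To make the idea of a "quality" assertion more concrete, here is a minimal sketch in plain Java, without a test framework. The PersonNameFormatter class and its format method are hypothetical, loosely modeled on the person name example from the XInclude post, and are not taken from the project discussed here. The sketch contrasts a weak assertion with one that checks the result against relevant data.

```java
// Hypothetical class under test: formats a person name for display.
public class PersonNameFormatter {

    /** Joins given and family name with a single space, trimming stray whitespace. */
    public static String format(String givenName, String familyName) {
        return givenName.trim() + " " + familyName.trim();
    }

    public static void main(String[] args) {
        String formatted = format(" David", "Le Strat ");

        // Weak assertion: passes for almost any implementation, so it catches little.
        check(formatted != null, "result should not be null");

        // "Quality" assertion: compares the actual value against relevant expected data.
        check("David Le Strat".equals(formatted),
                "expected 'David Le Strat' but got '" + formatted + "'");

        System.out.println("All checks passed");
    }

    // Minimal stand-in for a test framework's assertion method.
    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

The first check would still pass if the formatter dropped the family name or kept the stray whitespace; only the second would flag it.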

As testing went on, I believe that we can clearly identify the inflection point where the value of unit tests really shows. As illustrated below, as our level of unit testing improved, our development team was able to significantly improve its bug-closing rate without introducing new issues, and therefore avoid a bugterial infection. Hence the second key lesson:
Good unit-level test coverage allows projects to significantly shorten their bug-fixing cycle, resulting in direct cost benefits.

Bug Discovery Stats

Unit Testing Stats

Quantifying cost benefits depends on the type of project under scrutiny; however, simple empirical data clearly demonstrates such benefits. As a result, if you are a developer, you should be writing unit tests today; if you are a manager, you should be driving adoption of unit-level testing practices. The quality of your software and your ability to respond to your customers' demands will significantly improve. Be agile, today!