Feature
Multipass Validation with XSD and Schematron Part 2

By: David Kershaw; Eric Schwarzenbach
Aug. 6, 2003 12:00 AM

In Part 1 of this article (XML-J, Volume 4, issue 7) we outlined why a development group might consider alternative validation schemes. An example from our experience is applying work group rules to the process of XSD design. We said rules could take the form of a Schematron schema that would be applied when a developer validates an XSD against the schema for XSD. In our past work, a need existed for a productive way to put the alternative into play without losing familiar tools or disrupting current development patterns. For that reason we developed a simple multipass validation framework using XMLSPY's Scripting Environment. In this installment, we walk you through the scripting that remains to be set up and then look at how our framework can be a productive tool for your real-world use.

Finishing the Global Functions
At this point your "(GlobalDeclarations)" should include stubs for all the functions and global variables you have seen so far. They should have the same arguments and be in a form similar to what is shown in Listing 1.

Our implementation of these functions is uncomplicated, but there are a few areas to highlight, including where we use XMLSPY's object model and our use of the Windows Scripting Host.

We chose to read and write processing instructions holding validation commands because of the relative simplicity of using Altova's XMLData object; however, since most developers are more familiar with the DOM, let's take a closer look. Altova uses the XMLData interface instead of the DOM because the internal needs of developer's tools are unique and because XMLData can easily interoperate with the DOM (you can see examples of this in the XMLSPY Code Generator examples).

We factored the code for finding all PIs into a pair of functions. Notably, the public API (at least) lacks a count property reporting the number of children of an XMLData node. Until this oversight is addressed, the easiest workaround is to catch the exception XMLSPY throws at the end of the list (see Listing 2; Listings 2-7 can be found at www.sys-con.com/xml/sourcec.cfm).

"getAllPI" returns the set of PIs found at or below the node passed in. To get a reference to the root element of the current document you simply walk the object model starting at "Application".

var doc = Application.ActiveDocument;
var xml = doc.RootElement ;
var array = getAllPI(xml);
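
The catch-the-exception workaround can be sketched roughly as follows. This is not Listing 2 itself, only a minimal illustration that assumes the XMLData members "Kind", "HasChildren", "GetFirstChild" and "GetNextChild"; the names "collectPIs" and "mockNode" are hypothetical, and "mockNode" is a stand-in for XMLData so that the sketch can run outside XMLSPY.

```javascript
// Kind code for processing instructions in the XMLData model
// (the same value 9 that the writePI fragment passes to CreateChild).
var KIND_PI = 9;

// Collect every PI at or below `node`. GetNextChild() throws once the
// child list is exhausted, so the catch block marks the end of the loop.
function collectPIs(node, found) {
  found = found || [];
  if (node.Kind === KIND_PI) found.push(node);
  if (!node.HasChildren) return found;
  try {
    var child = node.GetFirstChild(-1); // -1 asks for children of any kind
    while (true) {
      collectPIs(child, found);
      child = node.GetNextChild();
    }
  } catch (e) {
    // End of the child list reached. Note that this sketch also
    // swallows any other error raised while iterating.
  }
  return found;
}

// Hypothetical stand-in for XMLData, for running outside XMLSPY only.
function mockNode(kind, name, children) {
  var kids = children || [], pos = 0;
  return {
    Kind: kind,
    Name: name,
    HasChildren: kids.length > 0,
    GetFirstChild: function () { pos = 0; return this.GetNextChild(); },
    GetNextChild: function () {
      if (pos >= kids.length) throw new Error("no more children");
      return kids[pos++];
    }
  };
}

var root = mockNode(4, "doc", [
  mockNode(9, "validate-a"),
  mockNode(4, "section", [mockNode(9, "validate-b")])
]);
var pis = collectPIs(root);
```

Inside XMLSPY the same function would be called with "doc.RootElement" instead of the mock tree.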

Another tricky point is where we save the document after adding a new PI in the "writePI" method. The fragment below shows how to create and insert the PI. Then, before you can save you need to switch into Grid view. We found that Text view did not save the document and update the view reliably. After saving we switched the view back, if needed, to avoid disorienting the user.

var doc = Application.ActiveDocument;
var xml = doc.RootElement;
// 9 is the XMLData kind for a processing instruction
var newchild = doc.CreateChild( 9 );
newchild.Name = name;
newchild.TextValue = value;
xml.AppendChild( newchild );
// remember the current view; 0 is Grid view
var mode = doc.CurrentViewMode;
if ( mode != 0 ) doc.SwitchViewMode( 0 );
doc.Save();
doc.UpdateViews();
// restore the view the user was working in
if ( mode != 0 ) doc.SwitchViewMode( mode );

Now you come to the Windows Scripting Host objects. Our framework uses WSH to get the XMLSPY install directory, read and write files, and execute validation commands. There isn't anything very difficult about using WSH in XMLSPY's Scripting Environment. The most important part is just knowing what objects are available and how to get a reference to an instance of one.

We created a two-line "getSpyPath" function to return the XMLSPY install path. The files your command and schema mappings are stored in are directly under this directory.

var shell = new ActiveXObject("WScript.Shell");
var value = shell.RegRead(
  "HKEY_CURRENT_USER\\Software\\Altova\\XML Spy\\Setup\\InstallationDirectory");

Once you have a way to address the files, you need a way to read and write them. You will do this with WSH's "Scripting.FileSystemObject". We ran into a snag at this point. It's easy to get a handle to a file using WSH, but we were unable to use the "exists" method we found in online documents. Instead of testing for existence we worked around this with a conservative file create call that fails when the file exists, as shown in Listing 3.

Reading files involves an additional step. Once you have a reference to the file, use the "OpenAsTextStream" method and iterate over the lines, as shown in the fragment below.

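var fso = new ActiveXObject("Scripting.FileSystemObject");
var f = fso.GetFile( doc.GetPathName() );
var s = f.OpenAsTextStream(1); // 1 opens the file for reading
var chars = "";
while ( ! s.AtEndOfStream ) {
  chars += s.ReadLine(); // ReadLine drops the line terminator
}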

Finally we get to the heart of the matter - executing validation commands. Again, the WSH call is straightforward. The "WScript.Shell" object's "Exec" method returns a process object with a standard output stream. Every validation command must report through this stream so we can collect the output and present it to the user.

Additionally, it made sense to allow validation commands to include two variables delimited by tildes: "~schema~" and "~doc~" for the path to the schema and document files, respectively. You already created the simple form that presents the validation report. That form breaks the "validationMessage" string at every ";" character for readability. All this comes together in the lines below.

var shell = new ActiveXObject("WScript.Shell");
var doc = Application.ActiveDocument;
var out = "";
// replace() is the framework's own helper; it swaps the delimited
// schema and doc variables for the real paths
actual_command = replace( actual_command, "schema", actual_schema );
actual_command = replace( actual_command, "doc", doc.GetPathName() );
var pipe = shell.Exec( actual_command );
while( ! pipe.StdOut.AtEndOfStream ) {
  out += ";" + pipe.StdOut.ReadLine();
}
validationMessage += out;
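
To make the collect-and-split behavior concrete, here is a minimal sketch that can be run outside WSH. The names "collectOutput", "reportLines" and "mockStream" are hypothetical, "mockStream" is only a stand-in for "pipe.StdOut", and dropping empty pieces after the split is an assumption about how the report form behaves.

```javascript
// Gather every line from a TextStream-like object (AtEndOfStream, ReadLine),
// prefixing each line with ";" as the fragment above does.
function collectOutput(stream) {
  var out = "";
  while (!stream.AtEndOfStream) {
    out += ";" + stream.ReadLine();
  }
  return out;
}

// Break a validation message at every ";" and skip empty pieces
// (assumed behavior of the report form).
function reportLines(validationMessage) {
  var parts = validationMessage.split(";");
  var lines = [];
  for (var i = 0; i < parts.length; i++) {
    if (parts[i] !== "") lines.push(parts[i]);
  }
  return lines;
}

// Hypothetical stand-in for pipe.StdOut, for running outside WSH only.
function mockStream(lines) {
  var pos = 0;
  return {
    AtEndOfStream: lines.length === 0,
    ReadLine: function () {
      var line = lines[pos++];
      this.AtEndOfStream = pos >= lines.length;
      return line;
    }
  };
}

var message = collectOutput(mockStream([
  "'root.xsd' is not included.",
  "Schema does not have a version."
]));
var lines = reportLines(message);
```

One consequence of this design is that a validator message that itself contains ";" will be shown on more than one line.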

Setting Up the Framework
There are a few quick configuration steps to take before you are done with the scripting side. First, since we had a number of small WSH issues interfacing with the file system we found it easier to create the validation command and schema text files by hand. The scripts you can download will create these files as written but they may also fail with a permissions error while in the process of writing the new file. (With more time we may be able to work around this issue, but it hardly seems worthwhile right now.) In the XMLSPY install directory, probably "c:\Program Files\Altova\xmlspy", create "validation-cmds.txt" and "validation-schema.txt".

Next, in XMLSPY, click on the Tools menu and select "Options". Flip to the Scripting tab and check "Activate scripting environment when XMLSPY starts", "Run auto-macros", and "Process events". Also, if you are using our scripts, check that "JavaScript" is selected. Finally, make sure that the "Global scripting project file" field points to the scripting project file containing your code. Clearly, you also need to set up utilities for your validation commands.

Now, if you're using our project file, the next time you start XMLSPY and open a document, your multipass validation commands will appear on the Tools menu thanks to a showMacros call in the On_OpenDocument event handler. Every other member of your work group just needs a copy of the project, the command and schema files, and to configure their XMLSPY scripting environment as you just did.

That ends the hard part. The rest of the scripting work will be self-explanatory as you look at what we did. Now for the fun part - using the framework for multipass validation.

Validation Command Code
Before you create validation rules you need a way to execute them. Our Schematron utility of choice doesn't take much setting up. A simple but complete zvonSchematron driver could be as trivial as this .bat file:

c:\zvon\AltovaXSLT.exe -xml %1 -xsl c:\zvon\zvon\bin\zvonSchematron.xsl -out c:\zvon\schema.xsl
c:\zvon\AltovaXSLT.exe -xml %2 -xsl c:\zvon\schema.xsl -out report.htm

in which the arguments are a Schematron schema and the path to the document to be validated. Although this worked fine with our framework, it output more command-line feedback than we liked. Since the first step generates an XSLT file from the Schematron schema, you probably don't want to perform that step each time you validate a file. Removing the step also eliminates some of the noise. Also, since the framework doesn't want HTML, and we didn't want to hack on zvonSchematron, run zvonSchematron on your schema once, and modify the resulting XSLT for simple line-by-line output. Your batch file then becomes even more basic:

c:\zvon\AltovaXSLT.exe -xml %1 -xsl %2 -out %3

At that point it makes more sense to just add that as the validation command in this form:

c:\zvon\AltovaXSLT.exe -xml ~doc~ -xsl ~schema~ -out temp.txt | more temp.txt

of course, using your own command-line XSLT processor. Your XSLT file generated by zvonSchematron from your schema needs to be added to your collection of schemas using the framework. Both the "~schema~" and "~doc~" will be swapped out at execute time for the full path to your schema and to your current working file, respectively (see Figure 1).
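
The swap of "~schema~" and "~doc~" at execute time might look something like the sketch below. It is not the framework's own "replace" helper; "fillPlaceholder" and "buildCommand" are hypothetical names, and the schema and document paths are made up for illustration.

```javascript
// Replace every occurrence of a ~name~ placeholder with a value.
// split/join is used so that backslashes and "$" in Windows paths
// are never interpreted as pattern or replacement syntax.
function fillPlaceholder(command, name, value) {
  return command.split("~" + name + "~").join(value);
}

// Fill in both placeholders of a validation command template.
function buildCommand(template, schemaPath, docPath) {
  var cmd = fillPlaceholder(template, "schema", schemaPath);
  return fillPlaceholder(cmd, "doc", docPath);
}

var template = "c:\\zvon\\AltovaXSLT.exe -xml ~doc~ -xsl ~schema~ -out temp.txt";
// Hypothetical paths, for illustration only.
var command = buildCommand(template, "c:\\schemas\\temp.xsl", "c:\\work\\test.xsd");
```

Paths that contain spaces would need quoting in the command template itself, since the substitution inserts them verbatim.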

Example XML Schemas
Now that we have everything ready and our macros showing on the Tools menu in XMLSPY, we need XML files to test against. As we said, the goal is to make it easy for schema developers to meet a set of workplace guidelines for XSD.

Our scenario is this: multiple developers are creating new document types from an existing framework of developed XSDs. We wish to ensure that these new schemas fit established patterns. To understand the value of the rules being enforced by the Schematron schema, we need to discuss the framework XSDs.

First, the two included schemas: the file "root.xsd" gives the base type definition to be used for defining all root-level elements (see Listing 4).

At Classwell we often create new document types. Among these types we need a certain amount of commonality. Instance documents will be processed by a single tool set and used in the same application framework (perhaps with some customization for new types, but with substantial baseline functionality). The team cannot afford rewrites for every document type.

We follow patterns that guarantee our system will continue to work with new content without modification. However, as our company grows, more content work is done simultaneously and more people create new schemas, so enforcement of patterns becomes more and more valuable. We need to avoid merely saying "you have to know this and remember to do that."

One approach is to leverage the XSD type derivation mechanism. However, there are limitations to what you can enforce this way, and XSD cannot actually enforce any particular usage - this is what our first rule is all about.

So the example requirements are simple: the root element must have an "id" attribute and start with a "meta" element containing all the central metadata we care about. (Here we leave the space empty for brevity.)

The second schema is "para.xsd". This document defines the low-level text to be used in instances, within the document type-specific structure. Again this is left blank for brevity, but it would be defined to allow the sort of semantic tagging used to support rendering different styles, links, and so on, within paragraphs (see Listing 5).

Finally, we have an example of an XSD defining a specific document type, whose root element derives from "rootType" in "root.xsd" (see Listing 6). This document will pass all five of the tests we set up below.

Now we need a Schematron schema to enforce our rules. Each "rule" has a context and one or more self-documenting tests. A "report" element emits its message when its test is true; an "assert" element emits its message when its test is false. Within a single "pattern" a node is handled only by the first rule whose context matches it, so two rules that share a context need separate patterns.

  • Rule 1: Schema must include para and root schemas:

    <pattern name="Schema must include para and root schemas">
    <rule context="xs:schema">
    <assert test="xs:include[@schemaLocation='../root.xsd']">'root.xsd' is not included.</assert>
    <assert test="xs:include[@schemaLocation='../para.xsd']">'para.xsd' is not included.</assert>
    </rule>
    </pattern>

    Schematron rules are basically just XPath expressions. Context paths define where tests apply. Each test attribute holds an XPath expression that is evaluated as a boolean. On an "assert", if the test fails, the element's text becomes the message that should be displayed to the user. In this case, if the "xs:schema" element does not have a child "xs:include" element whose "schemaLocation" attribute equals "../root.xsd", the assertion fails and the error message is reported. The two assertions here check for the two framework XSD schemas that must be included in any new XSD.

  • Rule 2: Schema must be versioned:

    <pattern name="versioned">
    <rule context="xs:schema">
    <assert test="@version">Schema does not have a version.</assert>
    </rule>
    </pattern>

    This rule is even simpler than the first one. For conceptual clarity we used a new "pattern" and the same context - the root element of the XSD. The test does nothing more than assert the presence of a "version" attribute.

  • Rule 3: The first global element defined by a schema must be an extension of "rootType":

    <pattern name="extension">
    <rule context="xs:schema/xs:element[1]">
    <assert test="xs:complexType/xs:complexContent/xs:extension/@base='clg:rootType'">First element not of 'rootType'.</assert>
    </rule>
    </pattern>

    The "context" points to the children of the root, the global definitions, and narrows the path to the first child. This test describes a structure based on an anonymous "complexType" that derives by extension from the preferred "clg:rootType".

  • Rule 4: Avoid the use of "xs:string" (prefer "xs:token"):

    <pattern name="token">
    <rule context="//xs:element|//xs:attribute">
    <assert test="not(@type='xs:string')">Definition uses 'xs:string'--please use 'xs:token' instead.</assert>
    </rule>
    </pattern>

    To keep things simple, we made the context of the fourth rule any element or attribute declaration in the XSD. The test asserts that the declaration's type does not equal "xs:string". In Classwell's actual practice the use of "xs:token" is not firmly required, but rather a suggestion of a strong group preference.

  • Rule 5: Do not redefine "locator" locally - use global definition only:

    <pattern name="locator">
    <rule context="xs:schema/*//xs:element">
    <report test="@name='locator'">Element
    'locator' should not be defined locally.
    Use global definition or choose a
    different name for this element.</report>
    </rule>
    </pattern>

    Our final rule uses the context to find all non-global elements based on their occurrence somewhere deeper than directly under the schema node. If such an "xs:element" node exists, test it to see if its name is "locator" and if so report an error.
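
    To make the difference between "assert" and "report" concrete, the logic of the five rules can be mirrored in plain JavaScript. This is only an illustrative sketch: it does not evaluate XPath, "evaluate" and "checkSchema" are hypothetical names, and the "schema" object is an assumed hand-written summary of an XSD (attribute declarations are left out for brevity).

```javascript
// An "assert" message is emitted when its test is false;
// a "report" message is emitted when its test is true.
function evaluate(kind, testResult, message, messages) {
  var fires = (kind === "assert") ? !testResult : testResult;
  if (fires) messages.push(message);
}

// `schema` is a hypothetical plain-object summary of an XSD:
//   version        - value of the version attribute, if present
//   includes       - schemaLocation values of the xs:include children
//   globalElements - [{ name, base, type }] declared directly under xs:schema
//   localElements  - [{ name, type }] declared anywhere deeper
function checkSchema(schema) {
  var messages = [];
  var all = schema.globalElements.concat(schema.localElements);
  var first = schema.globalElements[0];
  var i;

  // Rule 1: both framework schemas must be included.
  evaluate("assert", schema.includes.indexOf("../root.xsd") !== -1,
           "'root.xsd' is not included.", messages);
  evaluate("assert", schema.includes.indexOf("../para.xsd") !== -1,
           "'para.xsd' is not included.", messages);

  // Rule 2: the schema must carry a version attribute.
  evaluate("assert", schema.version !== undefined,
           "Schema does not have a version.", messages);

  // Rule 3: the first global element must extend clg:rootType.
  if (first) {
    evaluate("assert", first.base === "clg:rootType",
             "First element not of 'rootType'.", messages);
  }

  // Rule 4: no declaration should be typed xs:string.
  for (i = 0; i < all.length; i++) {
    evaluate("assert", all[i].type !== "xs:string",
             "Definition uses 'xs:string'--please use 'xs:token' instead.", messages);
  }

  // Rule 5: "locator" must not be declared locally.
  for (i = 0; i < schema.localElements.length; i++) {
    evaluate("report", schema.localElements[i].name === "locator",
             "Element 'locator' should not be defined locally.", messages);
  }
  return messages;
}

var good = {
  version: "1.0",
  includes: ["../root.xsd", "../para.xsd"],
  globalElements: [{ name: "lesson", base: "clg:rootType" }],
  localElements: [{ name: "title", type: "xs:token" }]
};
var bad = {
  includes: [],
  globalElements: [{ name: "lesson" }],
  localElements: [{ name: "locator", type: "xs:string" }]
};
var goodMessages = checkSchema(good);
var badMessages = checkSchema(bad);
```

    The "good" summary produces no messages, while the "bad" one trips every rule. Rule 1 contributes two messages because it holds two assertions.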

    Breaking the Rules
    To demonstrate these rules we made another schema that breaks all five of them (see Listing 7).

    With the rules in place we are ready to test them against an XSD. The steps to take are as follows:
    1.   Apply "zvonSchematron.xsl" to "rules.schema" (our Schematron file) using Altova's XSLT Engine (or another). We named our output file "temp.xsl".
    2.   Modify "temp.xsl" to remove HTML formatting. You can avoid this step by removing the code in zvonSchematron that generates presentation style, but as we said earlier, we chose to keep it simple for this short article.
    3.   Use the "Setup Schema" macro on XMLSPY's Tools menu to add "temp.xsl" as a named "schema" available for validation within the framework (since "temp.xsl" is an XSLT file this is a broad definition of schema).
    4.   Use the "Setup Command" macro on XMLSPY's Tools menu to add something like: "c:\zvon\AltovaXSLT.exe -xml ~doc~ -xsl ~schema~ -out temp.txt | more temp.txt"
    5.   Use the "Add Validation" macro on XMLSPY's Tools menu to add a validation step pairing the zvonSchematron command with "temp.xsl" to the test document.
    6.   Use the "Validate" macro on XMLSPY's Tools menu to run a test validation.

    In combination with another validation command applying XMLSPY's native XSD validator, this otherwise valid XSD is caught in five infractions of the work group rules and therefore fails to be valid in that context, despite its "technical" correctness.

    Conclusion
    The development example outlined here offers a practical alternative to a development team practicing ad hoc schema design guideline monitoring. Moreover, for those who need the extra descriptive power, this general approach to validation provides additional flexibility within a versatile and well-known development environment. Although XSD gives the majority of XML developers more than enough power, given the complexity and scope of the XML world, exploring options like those described here is a valuable exercise for serious developers.

    References

  • Schema language comparison: http://nwalsh.com/xml2001/schematownhall/slides/foilgrp03.html
  • Schematron: www.ascc.net/xml/resource/schematron/schematron.html
  • ZVON: www.zvon.org
  • Altova: www.altova.com
    Published Aug. 6, 2003— Reads 13,634
    Copyright © 2003 SYS-CON Media, Inc. — All Rights Reserved.
    Syndicated stories and blog feeds, all rights reserved by the author.
    About David Kershaw
    David Kershaw is the professional services manager at Altova, Inc., the XMLSPY company. David brings 11 years of software engineering, project, and product management to Altova. His previous positions include serving as the Director of Engineering at Classwell Learning Group and the Group Director of Engineering at Organic, Inc. David received his master's degree from Harvard University and his bachelor's degree from the University of Massachusetts.

    About Eric Schwarzenbach
    After working in software development in various industries over the past decade, Eric Schwarzenbach eventually came to specialize in electronic publishing. The issues of document-oriented XML are his focus: document and knowledge modeling, processing, and management frameworks, as well as XML databases and searching. He's written custom document management systems and worked with native XML databases such as Tamino. He currently serves as the unofficial Tamino XDBA and content engineering lead at Classwell Learning Group.
