What could be better for managing content than separating data from
presentation? How about separating data from data? Believe it or not,
XSLT can actually be used to allow for different levels of data
abstraction. In practical terms, this can reduce the complexity of
managing Web content by an order of magnitude and facilitate code
reuse. In essence, what I'm talking about here is object-oriented
XSLT, or to add to the alphabet soup: OOX (pronounced ooks).
Why Use OOX?
Isolating content from presentation was the original purpose
of stylesheet languages. In the conventional approach, there is just
one data layer (XML) and one presentation layer (HTML), with XSL
transformations (XSLT) in between. This two-layer architecture
simplifies Web site management by allowing content providers to edit
their data without concern for stylistic issues, and, conversely, by
permitting graphics designers to set the visual tone without regard
for specific content. While the two-layer model has been fruitful,
XSLT empowers us to go further and extend data abstraction
through the use of multiple data layers. Toward this end, I have
created a general-purpose XSLT that you can easily use to apply
multiple serial XSL transformations to an XML data document. But
before we delve into the details, let's consider why this might be
useful.
Multiple data layers
Suppose you're developing a corporate Web site in which there
are three types of documents that you wish to make available to your
site visitors: white papers, press releases, and external reviews.
Each document is stored as XML data in its own distinct format
according to its respective schema, which defines its structure. For
example, a press release usually contains a headline, release date,
location, and contact information, whereas an external review
document might contain a title, author, publication date, and
hyperlink.
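To make the contrast concrete, here is a minimal sketch of what two such source documents might look like. The element names and sample values are assumptions chosen for illustration; they are not taken from the article's downloadable code.

```xml
<?xml version="1.0"?>
<!-- press_releases.xml: hypothetical structure for illustration -->
<press_releases>
  <press_release>
    <headline>Example headline</headline>
    <release_date>2003-08-01</release_date>
    <location>Example City</location>
    <contact>press@example.com</contact>
  </press_release>
</press_releases>
```

```xml
<?xml version="1.0"?>
<!-- external_reviews.xml: hypothetical structure for illustration -->
<external_reviews>
  <review>
    <title>Example review title</title>
    <author>Example Author</author>
    <publication_date>2003-07-15</publication_date>
    <link>http://www.example.com/review.html</link>
  </review>
</external_reviews>
```

Both documents carry a title-like field and a date, but under different names and alongside different neighbors, which is exactly the mismatch the rest of the article addresses.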
Despite the structural differences in these documents, you
may see the utility in making their content available to site
visitors in similar ways. For example, on the home page you might
want to display headlines for the five most recent press releases as
well as titles for the five most recent white papers. So even though
press releases and white papers have different data structures, your
goal is to present their header content in the same way.
Furthermore, your site visitors may wish to read a summary
before embarking upon a full-length white paper or external review.
Therefore, you will probably want to give them the ability to view
either a white paper abstract or the synopsis of an external review.
Here again we see the need to use the same presentation format for
similar content elements that are embedded in structurally dissimilar
data documents.
The old way
Using the canonical, two-layer approach, you would need to
create one XSL transformation to convert each data document into each
presentation format. In our example, we have discussed three data
documents and two presentation formats. Therefore, you would need six
(three times two) distinct XSL transformations (see Figure 1).
The new way: OOX
Using data abstraction, however, you can add an intermediate
data layer to your architecture. The structure of data objects
occurring in this layer might be defined by a "generic" schema,
suitable for representing content from all text-based corporate
documents. With the addition of this layer, the characteristics that
normally distinguish different corporate documents are transparent to
the presentation transformations. Each presentation transformation
only needs to be aware of the data structure for the new, generic
documents. Thus, the schema for the new layer of data objects becomes
a single interface for the presentation transformations. With this
approach, you would need only five (three plus two) distinct XSL
transformations (see Figure 2).
The key idea here is that if you can first transform each
data document (e.g., press release, white paper, or external review)
into a general-purpose corporate document, then you only have to
build your presentation transformations for the more general data
structure. Furthermore, if you were to add a new type of document in
the future, such as "product descriptions," you wouldn't have to
create a new XSLT for each presentation format. Instead, you would
need only a single new transformation that converts product
descriptions into the general-purpose document. This potential reuse
of the two existing presentation transformations by leveraging the
middle layer schema as an interface illustrates a fundamental
advantage of employing an OOX methodology.
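A data-to-data transformation of this kind can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be a press_releases root containing press_release elements with headline and release_date children, and the generic middle-layer format is taken to be a documents root containing document elements with title and date children. None of these names come from the article's own code.

```xml
<?xml version="1.0"?>
<!-- press_to_generic.xsl: hypothetical data-to-data transformation -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Wrap all converted items in the assumed generic root element -->
  <xsl:template match="/press_releases">
    <documents>
      <xsl:apply-templates select="press_release"/>
    </documents>
  </xsl:template>

  <!-- Map press-release fields onto the assumed generic fields -->
  <xsl:template match="press_release">
    <document type="press_release">
      <title><xsl:value-of select="headline"/></title>
      <date><xsl:value-of select="release_date"/></date>
    </document>
  </xsl:template>
</xsl:stylesheet>
```

Supporting a new document type, such as product descriptions, would mean writing one more stylesheet of this shape that emits the same generic elements; the presentation transformations would not need to change.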
Efficiency
In this example, shifting from a two-layer to a three-layer
architecture reduced the number of necessary transformations by only
one. However, it is easy to imagine the combinatorial explosion that
might occur with a more complex architecture. By merely expanding the
original specification to include five document types and five
presentation formats, the two-layer architecture would call for 25
distinct transformations. In contrast, the three-layer model would
only require 10 transformations. Higher-order abstraction could
provide even greater economy.
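The arithmetic generalizes in a simple way. As a sketch, assume m document types, n presentation formats, and a single generic middle layer shared by all of them (the symbols m, n, and T are introduced here for illustration):

```latex
% m = number of document types, n = number of presentation formats
\begin{align*}
T_{\text{two-layer}}   &= m \times n \\
T_{\text{three-layer}} &= m + n \\
T_{\text{two-layer}} - T_{\text{three-layer}} &= mn - m - n = (m-1)(n-1) - 1
\end{align*}
% m = 3, n = 2:  6 versus 5  (saving of 1)
% m = 5, n = 5: 25 versus 10 (saving of 15)
```

The saving is positive whenever (m-1)(n-1) exceeds 1, so the middle layer breaks even at two document types and two formats and pays off for anything larger.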
Flexibility
This three-layer approach, however, is not suitable for all
situations. The reason XML data objects conform to different schemas
in the first place is that they represent different kinds of
information. Press releases adhere to a different structure than
external reviews because they consist of intrinsically different
content. Insofar as press releases and external reviews are similar,
the construction of a general-purpose middle-layer document is
useful. But there will certainly be times when it is necessary to
present content that is unique to a specific document type. In such
cases, the traditional approach of direct transformation applies.
Fortunately, in OOX both approaches can coexist peacefully.
Modularity
The advantages of OOX are not limited to just economy of
representation. Incorporating multiple layers of data abstraction
permits architectural modularity, which leads to ease of code
maintenance. One of the key advantages of object-oriented programming
(OOP), when implemented properly, is the reduction of dependencies
among elements that interact with each other. Due to its modularity,
this same advantage applies to OOX. In the context of our corporate
documents example, if you decided to change the presentation of the
header lists in the traditional model (see Figure 1), you would have
to modify each of the three presentation transformations that were
created to display those lists. In the OOX model (Figure 2), however,
there is only one transformation associated with presenting a list of
header content. Therefore, all of the necessary modifications could
be made in one place.
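A single list-presenting transformation of this kind might look like the sketch below. It assumes a generic document with a documents root and document children that each carry title and date elements, with dates in a sortable year-month-day form; those names and the date format are illustrative assumptions, and the output is plain HTML for brevity.

```xml
<?xml version="1.0"?>
<!-- generic_to_list.xsl: hypothetical presentation transformation -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/documents">
    <html>
      <body>
        <ul>
          <!-- Sort newest first, then keep only the first five items -->
          <xsl:for-each select="document">
            <xsl:sort select="date" order="descending"/>
            <xsl:if test="position() &lt;= 5">
              <li><xsl:value-of select="title"/></li>
            </xsl:if>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Because this stylesheet only knows about the generic format, a change to the list markup (say, switching from a bulleted list to a table) is made here once and applies to every document type that feeds the middle layer.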
How Does It Work?
"OK, I'm convinced that OOX is the greatest thing since
Nutella, but how does it work?"
The key to implementing a multilayer data architecture, and
hence OOX, is to be able to process an XML data document serially
through two or more XSL transformations. In our corporate document
example, to present the list of press release headlines we need to
perform two transformations. First, we must transform the press
releases document into a generic corporate document, which will
constitute the middle layer of data. Then, we need to perform a
second transformation to render the header content from the generic
document as XHTML in a browser. This sounds simple enough, but the
central issue becomes figuring out how to perform both
transformations in a single step.
As I became more interested in pursuing an OOX approach for
my own Web development, I was surprised to discover that I couldn't
find a direct method for performing multiple transformations on an
XML data object. Since I had no desire to become an XSL contortionist
for each new Web project, I set out to develop a usable tool for
performing multiple serial transformations in a single step. Ideally,
I wanted a pure XSL solution that would blend seamlessly into an XML
content management framework.
The end result of this effort was the creation of an original
XSL transformation that makes it possible to invoke multiple serial
transformations by passing a single XML document to the browser. In
essence, this new transformation is a controller, which operates
against an XML script. The script specifies the source content
document and the various transformations it should undergo. Thus, the
controller takes the source document specified by the script and
iteratively applies each transformation listed in the script (see
Figure 3). The controller XSLT and all of the files required for
reproducing the examples in this article are available at
www.sys-con.com/xml/sourcec.cfm.
Using the Controller
As you may recall from our example, in order to render press
release headlines in a browser, we need to process the press releases
source document through two transformations. This is easy with the
assistance of the controller. All we have to do is create an XML
script that specifies the source data and transformations. This is
accomplished using the tags <source> and <filter> respectively.
I decided to use the <filter> tag to refer to each individual
transformation, and the <transformation> tag to refer comprehensively
to the series of transformations that the source file undergoes.
Thus, the <transformation> tag is used only once, much like the
<body> tag in an HTML document. It is also worthwhile to note that
the controller is actually the script's stylesheet. If the script did
not include a reference to the controller, the browser would not know
how to process the script.
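A script along these lines might look like the following sketch. The <transformation>, <source>, and <filter> tags are the ones described above; everything else is an assumption made for illustration, including the file names, the choice to put file paths in element content rather than attributes, and the name controller.xsl. The exact format the controller expects is defined by the downloadable code.

```xml
<?xml version="1.0"?>
<!-- The stylesheet reference hands this script to the controller -->
<?xml-stylesheet type="text/xsl" href="controller.xsl"?>
<transformation>
  <!-- The content document to start from (hypothetical file name) -->
  <source>press_releases.xml</source>
  <!-- Filters are applied in the order listed (hypothetical file names) -->
  <filter>press_to_generic.xsl</filter>
  <filter>generic_to_list.xsl</filter>
</transformation>
```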
Step-by-step: the first transformation
When the browser's XSLT processor applies the controller to the
script, it performs each transformation consecutively in memory,
using the output from one as the input to the next. Ultimately, the
output from the final transformation is rendered in the browser. In
our example, the first transformation converts the source document of
press releases into the generic format. However, because this is not
the final transformation, we are not privy to its output. So just to
verify what is happening behind the scenes, I have invoked our first
transformation directly from the source document. The output from
this transformation is shown in Listing 1. Thanks to the controller,
we normally would not see this generic data document, but it does
occur in memory.
Step-by-step: the second transformation
Once the first transformation has taken place, the controller
uses the resultant document (i.e., Listing 1) as an input to the
second transformation. Since we are using a total of only two
consecutive transformations, the browser renders the output from the
second transformation, which is a formatted list of press release
headlines (see Figure 4).
We can write a comparable script for external reviews, which
will produce a similar result.
Note that the second transformation, generic_to_list, is the same for
both of our scripts. This is because after we have transformed the
original documents into a generic document, we can use the same
transformation in both cases to render a formatted list of header
content.
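As a sketch of that reuse, a script for external reviews might differ from the press release script only in its source and first filter. The file names, the controller.xsl name, and the use of element content for paths are assumptions for illustration; only the tag names and the generic_to_list transformation are taken from the text.

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="controller.xsl"?>
<transformation>
  <source>external_reviews.xml</source>
  <!-- Only this first filter is specific to the document type -->
  <filter>reviews_to_generic.xsl</filter>
  <!-- The presentation filter is shared with the press release script -->
  <filter>generic_to_list.xsl</filter>
</transformation>
```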
The important thing to remember in implementing OOX is that
the scripts, which are XML data documents themselves, are used to
initiate a process that ultimately produces an output in the browser
window. In other words, if you want to present content in the
browser, you need to reference the script that leads to the desired
output. This can be confusing at first because most of us are
accustomed to referencing content documents directly. Typically, if
we want to display a list of items stored in an XML data document, we
reference that document's URI directly. The data is then presented
according to the document's stylesheet. In OOX, however, we take the
indirect approach of referencing a script that details how the source
content will be massaged before it is finally rendered.
'Hello World'
Now that we've walked through an example at the conceptual
level, you are probably eager to get your hands dirty and do some
coding. As with HTML, you don't need any special development
environment to get started. A simple text editor like Notepad is
sufficient, although a more sophisticated tool, such as Macromedia's
HomeSite, makes it easy to go back and forth between editing and
testing. For a simple proof of concept, you will need a content file
(XML), two transformations (XSL), the controller (XSL), and a script
(XML). Once you've created these documents, you can render your
content by entering the path of the script file in your browser. I
encourage you to look at the example code, which has been tested in
IE 6.0. Sometimes the quickest way to get started using a new
technology is to take a working example and gradually modify it to
suit your own needs.
Tip: In developing data-to-data transformations, it is a good
idea to check the outputs at intermediate stages (as shown in Listing
1) to make sure that the abstract layers exhibit the intended data
structure. Microsoft makes available a tool that allows you to see
the result of an XSL transformation as XML formatted data. Once you
install this tool, you can right-click on the document in Internet
Explorer and select "View XSL Output". However, this will only work
if the XML file you are viewing in the browser specifies an XSL
stylesheet. That means that even though the controller doesn't
require it, you will need to include a stylesheet reference (an
xml-stylesheet processing instruction) in your source document.
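As a rough illustration, a stylesheet reference is an xml-stylesheet processing instruction placed before the root element of the source document. The file name and root element below are hypothetical; point href at whichever transformation you want to inspect.

```xml
<?xml version="1.0"?>
<!-- Hypothetical file name: the first (data-to-data) transformation -->
<?xml-stylesheet type="text/xsl" href="press_to_generic.xsl"?>
<press_releases>
  <!-- press release content goes here -->
</press_releases>
```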
Using OOX to Manage Content
Upgrading existing code
When it comes to Web development, I tend to learn things on a
need-to-know basis. There's simply too much innovation these days to
keep track of all the emerging technologies. So if you're like me,
before you invest a lot of time in using OOX you will want a clear
sense of how easily you can incorporate it into your work and how
useful it will be.
The easiest way to start applying OOX is to retrofit an
existing Web site. Here the goal is not to alter the functionality of
the site, but to make its content more manageable. The key is either
to identify content elements that are presented in similar ways (as
in the corporate documents example) or conversely, to recognize
presentation elements that draw upon related content. To illustrate
the latter case, consider the following. Many Web sites present a
main menu horizontally at the top of the page, along with a submenu
in the sidebar, which depends upon the main menu selection.
Furthermore, these Web sites sometimes make available a site map to
visitors as an alternate method of locating content. The site map
typically provides a hierarchical list of hyperlinks for accessing
pages directly. Since the menu elements and site map present related
content, this Web layout can benefit from multilayer data abstraction.
Unlike the corporate document example in which we transformed
specific documents into a generic format, in this example we begin
with a generic data document (see Listing 2), which contains
hierarchical information, and use transformations to achieve greater
specificity. Since the site map is based upon the complete hierarchy
of Web pages, we can directly transform the generic document into a
site map.
The menu system, on the other hand, corresponds to only a
subset of the Web site's page hierarchy. Specifically, the main menu
corresponds to the first level of Web pages, and the submenu
corresponds to the second hierarchical level. Therefore, rather than
extracting those elements directly from the generic document and
presenting the menus using single transformations, we can separate
the process into two stages. The first transformation is used to
create a new data document containing only the hierarchical
information relevant to the menu system, that is, the first two
levels.
This menu data document can then undergo one or more second-tier
transformations to present the appropriate menu and submenu as needed.
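The first-stage transformation could be sketched as follows. It assumes the hierarchy document has a site root with nested page elements that carry title and href attributes, and that the menu document uses menu, item, and subitem elements; all of these names are hypothetical and are not taken from Listing 2.

```xml
<?xml version="1.0"?>
<!-- hierarchy_to_menu.xsl: hypothetical data-to-data transformation -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/site">
    <menu>
      <!-- First level of pages becomes the main menu -->
      <xsl:for-each select="page">
        <item title="{@title}" href="{@href}">
          <!-- Second level becomes the submenu; deeper levels are dropped -->
          <xsl:for-each select="page">
            <subitem title="{@title}" href="{@href}"/>
          </xsl:for-each>
        </item>
      </xsl:for-each>
    </menu>
  </xsl:template>
</xsl:stylesheet>
```

The second-tier menu and submenu transformations then only need to understand the small menu format, not the full page hierarchy.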
The OOX architecture for this site map and menu
implementation is depicted in Figure 5. In examining this
architecture, you may notice the modularity inherent to this
approach. The use of a middle layer for representing the menu data
permits the reuse of the menu and submenu transformations in the
context of other Web projects. To reuse those transformations for a
Web site that doesn't have a site map, you could revert to a two-layer
design and create the menu data document manually.
In addition to gaining reusable transformation objects,
another advantage of this implementation is that the hierarchy
document becomes a one-stop shop for modifying the site map and menu
systems. Adding a new submenu item is accomplished with a single
entry in the hierarchy document. Once this new item has been
inserted, the site map and the menu system will automatically reflect
the new content.
As you have seen, retrofitting an existing Web site with an
OOX architecture is based upon identifying related content groups
(e.g., corporate documents) or presentation groups (e.g., the menus
and site map). Each group defines a potential opportunity for
improving content management. Therefore, with a little planning, the
task of overhauling a Web site with OOX can be an incremental and,
hence, manageable process.
This gradual process of implementing architectural upgrades,
such as those involving OOX, is called refactoring. Specifically,
refactoring refers to making internal design changes that facilitate
understanding and working with code without materially affecting its
functionality. Martin Fowler popularized this term in his 1999 book,
Refactoring: Improving the Design of Existing Code. It is sometimes
difficult to justify refactoring when there is a competing need for
new content or functionality. However, the piecemeal nature of
refactoring with OOX makes it possible to enhance a Web site's
architecture without disrupting ongoing development. In fact, there
are compelling reasons for allocating resources to refactoring. For
example, if you do want to add new functionality to a Web site,
refactoring first can save time later by making the site more
extensible. Furthermore, a timesaving benefit specific to Web sites
(as opposed to software) is streamlined content management.
Starting a new project
If you are embarking upon a new Web project, the OOX
framework can be instrumental in the design phase. This is because
OOX unifies the goals of developing an architecture and preparing for
content management. By taking this approach, outlined in Figure 6,
you will ensure that your Web site is modular, extensible, and easily
maintained. Begin by identifying the content and presentation domains
for your Web site. In other words, make a list of the content you
will be providing, and then make another list of the various ways in
which you will present that content to your site visitors. Ideally,
the presentation list would be based upon a usability analysis. The
next step, which may sound familiar by now, is to identify
relationships within each domain so you can categorize items into
groups. These groups form the basis for defining intermediate data
layers. Finally, connect the content domain to the presentation
domain using the intermediate data layers and appropriate XSL
transformations.
There will be many plausible architectural configurations to
choose from in most OOX-based endeavors. The key to a good design is
parsimony. In other words, less is more. Be selective about adding
data layers by making sure that all of your design decisions are
based upon achieving a specific benefit, such as content management,
reuse, or extensibility.
Once you've built your Web site using this approach, managing
content is simply a matter of editing the data documents in the top
layer. The controller transformation and script files will ensure
that any content changes made to the top layer will trickle down to
the presentation layer.
OOX: More Than Just a Bag of Tricks
OOX, like OOP, isn't just about stringing together multiple
transformations, using extra data layers, or treating schemas like
interfaces. It's an approach to content management and Web
architecture that involves the judicious application of data
abstraction and the reuse of transformation objects. When applied
strategically, OOX can result in a low-maintenance Web site that is
quickly built, logically organized, and robust to structural content
changes.
As you are likely aware, there are feature-rich software
tools on the market to facilitate Web development and content
management. Many of these tools function by storing proprietary
metadata, which describe both structural and thematic aspects of the
Web site. For example, metadata might be used to programmatically
maintain navigation links on all pages of a Web site. These metadata
are not directly accessible to the Web developer, so even though the
software uses them internally for content management, they may impede
fine-level control. Furthermore, migrating from one of these content
management tools to another can prove vexing because the tools often
do not recognize each other's metadata.
In contrast to most content management tools, OOX relies
exclusively upon W3C-based technologies. Therefore, in adopting OOX
as a Web development paradigm, it is possible to exercise complete
control over your Web site without getting locked into proprietary
technology. Furthermore, flexible tools can work in concert with OOX
development.
OOX may not be suitable for all developers. But if you have
dabbled in XML and aren't afraid to explore the power afforded by
XSLT, you might be surprised at what the latest addition to alphabet
soup has to offer for content management.
Your idea of "Visual XSL++" is compelling if not ambitious, although I suspect the devil will be in the details.
Writing XSL transformations can certainly be challenging. That is why any approach that facilitates code reuse and extension would be beneficial. The OOX article describes how to facilitate reuse and minimize the number of unique transformations that are necessary.
But I think you are speculating about actual XSLT code inheritance, perhaps using a class structure like OOP languages. This would be interesting. Good luck with "the next step"!
Sincerely,
Pietro Michelucci
#7
Pietro Michelucci commented on 21 Aug 2003
Dear Mr. Kuehne:
I'm pleased to hear that you have had success in using an approach that is similar to the one presented here.
One thing that sets this approach apart from packages like Cocoon is that it is based entirely upon native W3C technologies. It doesn't require the use of Apache modules, servlets, or any other add-on software. It's as simple as writing HTML, yet it embodies powerful capabilities. Thus, the object-oriented characteristics of this approach are completely manifested within the context of native XSLT and XPATH technologies.
Sincerely,
Pietro Michelucci
#6
Pietro Michelucci commented on 21 Aug 2003
Dear Pierre:
I'm sorry that you didn't find the article useful or the title appropriate. The aim here was to put forth a usable native XSLT tool to facilitate CM in the context of an object-oriented approach.
Insofar as "Object-Oriented" refers to modularity, code reuse, and inheritance, the label "Object-Oriented XSLT" seems to capture succinctly the paradigm that is being advanced here.
OOX is not presented as a panacea. However, the hope is that this article will encourage readers to pursue the use of XSLT in the context of a structured approach to CM that is easy to implement and extend.
Sincerely,
Pietro Michelucci
#5
John Coe commented on 21 Aug 2003
The last page is listed as 15 of 17; did I read it all?
The thing I would like to see (or write if I could) is a set of xsl transforms that convert an xsl file into a graphical representation (say in SVG) that can be changed (by drag actions) then transformed back to xsl.
It would be worth the effort to make these transforms in xsl, to make some real web based development that produces xslt from a flow chart. Writing complex transforms directly in xsl can make your head hurt! What is the next step, xsl++?
Ja, everything's true. But this approach has been implemented and successfully used in production for years!
#3
Pierre commented on 21 Aug 2003
I don't really understand the purpose of this article. What is so revolutionary about this way of doing things? Every flexible XML-based CM system with an XSLT presentation layer works like that.
If the purpose of the article is to present a good way of architecting presentation transformation pipelines, then why entitle it "Object Oriented XSLT"? You and I both know this title is a pure eye-catcher and really has nothing to do with the content.
I must say this is more and more the case with SYS-CON's articles, and it's really a shame.
#2
Michael Mayne commented on 7 Aug 2003
Hi,
Having read the thing properly, I've now found listing 1!
I'm going to download the sample code and have a play - then I'm sure it will all become clear.
Thanks
Michael
#1
Michael Mayne commented on 7 Aug 2003
Hi,
The ideas in the article are very interesting, even though I'm really only just starting out with XSLT. I have a strong OO architecture/design/development (in Delphi and others) background so I can understand the concepts, but the nuts and bolts are a bit unclear. This isn't helped by the typographical errors:
- I can't see a listing 1
- Figure 5 is actually figure 2.
I'm also missing the understanding of why the following is significant:
But that's just my lack of knowledge at the moment.
Overall, I think the idea is great, but I'm struggling to understand the detail and how you can abstract different content into generic content for generic processing.
Regards
Michael