At the heart of any software program lies data, and in the
case of Web services and service-oriented solutions this is presented
to the underlying software as XML documents. The representation and
handling of these documents within the software is a major challenge
with traditional development approaches, and often leads to
complicated collections of software programs interfering with the
clear flow of business information through an application. This
article will show that a new approach is essential in order to
simplify the complexity of today's software solutions while
significantly reducing the cost of inevitable future change.
Changing Environments
Presentation, exchange of data (both in terms of messaging
and logic), and the storage of that data are the core requirements
for most applications. For most people, XML is first used in the
presentation layer because this means that an application can produce
many different types of output from the same incoming data using
transformations. Web browsers, mobile devices, and PDF file readers
can all have output generated from one incoming document, and this
means that many applications now produce their output in XML format.
Over time, most people's use of XML spreads throughout the
application's layers. It may be used to pass messages around between
applications, technologies, or businesses. EAI projects use XML to
pass data as messages between systems and middleware technologies.
RSS feeds mean that XML data is used to syndicate content across
multiple systems. Open standards for business transactions such as
Origo, ebXML, RosettaNet, and so on mean that people are moving away
from fixed EDI-type data (e.g., the Accredited Standards Committee
[ASC] X12 formats) into this new, extensible format for data
exchange, which can reduce costs.
This has, for some, ultimately led to XML being used as a
storage medium for information. This may be as configuration files to
aid in control of the underlying business logic, or persistent
storage within native XML databases, or message tracking through
messaging middleware. In some cases, the business logic of the
application itself is contained within XML files (see Figure 1).
The inherent strengths of XML have ensured that the data is
easier to understand because it can now contain the semantic
information that describes it, and the extensibility also allows
applications to respond more effectively to the changing needs of the
business environment. XML has permeated through all of the
application's layers, but because the growth of XML usage was
evolutionary, many existing systems do not directly process XML data.
Challenges of Processing XML Documents
Traditionally, the main challenge when working with XML is
that, at some point, the document must be converted into structures
in code, but the business processes themselves are described
declaratively in terms of the business document. The business
analysts and architects work at the business document and business
rules level, but the programmers need to work with the data within
the software code in a different manner because the underlying code
base does not understand the documents. This separation between the
document and the underlying representation in the code means that
there is often a need to provide a mapping of some sort between the
two layers. This creates the classic disconnect between the business
needs and the programmer's coded implementation (see Figure 2).
To make use of an incoming XML document in software code, we
have to convert this document into code objects so that we can
interrogate it. In the case of an XML document, there are two options
available: represent the document as a document (using a document
object model such as DOM, for example), or represent the data
contained in the document as business objects in the code. This
article will discuss the differences between these two approaches and
show how a new document-oriented approach can help to deliver far
greater flexibility and agility.
Mapping Data into Code Structures
Many automated binding solutions take an XML Schema and
generate classes from this structure. This has the unfortunate side
effect of tying the code tightly to the incoming XML structure -
should the structure of this document or the organization of data
within this document change in any way, the code would no longer be
able to manipulate the new version without some potentially
considerable alterations. If the incoming data were invalid (in terms
of structure or organization) the code may not be able to deal with
this at all.
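As a minimal sketch of this coupling, the Python fragment below
hand-writes the kind of class a binding tool might generate. The
Order class, the element names, and both document versions are
invented for illustration and do not come from any particular binding
product. Moving one element in the incoming structure is enough to
stop the bound class from loading the document.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical purchase order formats, invented for this illustration.
ORDER_V1 = "<order><customer>ACME</customer><total>120.50</total></order>"
# In version 2 the same data has moved under a <header> element.
ORDER_V2 = (
    "<order><header><customer>ACME</customer></header>"
    "<total>120.50</total></order>"
)


@dataclass
class Order:
    """Stands in for a class generated from the version 1 schema."""
    customer: str
    total: str

    @classmethod
    def from_xml(cls, text: str) -> "Order":
        root = ET.fromstring(text)
        # The element paths are fixed when the class is generated.
        return cls(customer=root.find("customer").text,
                   total=root.find("total").text)


order_v1 = Order.from_xml(ORDER_V1)

try:
    Order.from_xml(ORDER_V2)
    v2_bound = True
except AttributeError:
    # root.find("customer") returned None because the element moved.
    v2_bound = False
```

Version 1 binds as expected, while version 2 fails even though it
carries exactly the same business data.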
The common solution to this problem lies in the form of an
intermediate binding layer. These bindings sit between the incoming
data and the underlying representation of this data in the code and
act as a translation layer between the two. This abstracts the
incoming data away from the underlying code, gaining an obvious
advantage in that the underlying processing is now no longer directly
tied to the incoming data.
While this approach is a useful one, it does mean that we're
now tied to the underlying data representation in the code. Should
the data change (perhaps because the application needs to accept a
different XML standard or structure), the underlying software model
would need to be changed to deal with this. This might mean that
processing is now duplicated across several classes because the data
they're processing is different. Even if the data can be mapped,
however, we still need to code a new binding layer for this new
incoming data structure. As more partners start to use this service,
the underlying code will become more and more difficult to maintain.
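The maintenance cost of a binding layer can be sketched in a few
lines of Python. Everything named here (the Order model, the two
partner formats, and the BINDINGS table) is an assumption made for
the example rather than part of any real product. The point is that
each new incoming structure needs another hand-coded binding function.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Order:
    """Internal model that the business logic is written against."""
    customer: str
    total: str


def bind_partner_a(root: ET.Element) -> Order:
    # Partner A (hypothetical) sends <order><customer/><total/></order>.
    return Order(root.findtext("customer"), root.findtext("total"))


def bind_partner_b(root: ET.Element) -> Order:
    # Partner B (hypothetical) sends <po buyer="..."><amount/></po>.
    return Order(root.get("buyer"), root.findtext("amount"))


# One binding per incoming structure; this table grows with every partner.
BINDINGS = {"order": bind_partner_a, "po": bind_partner_b}


def bind(text: str) -> Order:
    root = ET.fromstring(text)
    return BINDINGS[root.tag](root)  # raises KeyError for an unknown structure


order_a = bind("<order><customer>ACME</customer><total>120.50</total></order>")
order_b = bind('<po buyer="ACME"><amount>120.50</amount></po>')
```

Both documents end up as the same Order object, but only because a
programmer wrote and registered a binding for each format in advance.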
Reuse of code also becomes an issue in such cases. When the
underlying logic is spread and duplicated across the application, it
becomes very difficult to see areas where the code can be refactored,
or where logic can be reused. This leads to problems in future
development and maintenance of the application. This is particularly
apparent in highly complex, time-critical applications, because a
considerable amount of knowledge is required to understand where the
business logic is located, what is actually happening within a given
business process, and how that impacts the business documents flowing
around.
The binding approach raises other problems because of the
syntax of XML. Dealing with namespaces and complex, recursive XML
structures can be difficult, and some binding solutions have
difficulties in dealing with these. Binding solutions may also lose
information contained within the structure of the incoming XML, but
even if they don't there may still be problems when marshalling data
types into the underlying code, or unmarshalling them back out again
afterwards. The sizes and formats of primitives (such as integers,
decimals, arrays, and so on) differ across languages and platforms,
so mapping these to one another via XML structures may not be
possible at all, or might require compromises with far-reaching
consequences.
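A small Python sketch shows how marshalling can change data on the
way through. The document and element names are invented for the
example. Converting the text content into a binary floating-point
type and back again alters both values, while a decimal type
preserves the amount as written.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical document: an amount with a significant trailing zero and
# an identifier too large for a 64-bit float to represent exactly.
doc = ET.fromstring(
    "<line><amount>10.10</amount><id>9007199254740993</id></line>"
)

amount_text = doc.findtext("amount")
id_text = doc.findtext("id")

# Marshalling into binary floating point and back out changes the text.
amount_via_float = str(float(amount_text))
id_via_float = str(int(float(id_text)))

# A decimal type keeps the lexical form of the amount in this case.
amount_via_decimal = str(Decimal(amount_text))
```

After the round trip the amount has lost its trailing zero and the
identifier no longer matches the one that was received.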
Separation of Logic from Data
Another major disadvantage of converting the document into
code structures is that it also separates the logic from the incoming
business document context. The program will not act on the data
directly - instead, it will be converted into another representation,
and at that point it becomes harder to see what has happened to it
because it is no longer a human-readable document, but is instead
represented as binary data within code. It is now also much harder to
debug, because the information must be extracted from the code - it
is no longer clearly visible.
As we can see, binding data into code can present a number of
challenges. What is needed is a new approach in which evolution and
interoperability are simple and easy. This new approach must combine
visibility with agility, and must be able to deal with multiple
versions of incoming data without the need to reimplement much of the
existing business logic (see Figure 3).
Operating Directly on XML Documents
In the real world, documents are used to pass data around -
forms, notes, memoranda, and e-mails are all used to convey
information and data to the recipients. As these documents are passed
around, they are often altered, manipulated, copied, or changed into
another format. This "paper trail" of processing is easily visible,
and the process can be adapted to adjust to new data and new
information quickly and easily.
Bearing that in mind, it appears that keeping our data in the
form of documents (or a DOM document object in the code itself) will
solve many of our problems. First of all, abstracting the document
into a different code structure isn't actually necessary - in fact,
it may well be an unnecessary conversion stage. Instead, we can work
directly on the document itself, using the open W3C XPath syntax to
identify content within it, and then add, remove, or change elements
of the document based on the results of these queries using our
software language of choice. The main advantage of this approach is
that we
always have a document available for inspection, making debugging the
Web service or application much easier. It also allows for a clear
audit trail, which may well be necessary in certain situations.
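The operations described above can be sketched with Python's standard
xml.etree.ElementTree module, which implements only a limited subset
of XPath; a full XPath engine would allow richer queries. The order
document, its element names, and the "checked" status value are
assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order used only for illustration.
order = ET.fromstring(
    "<order>"
    "<item sku='A1'><qty>2</qty></item>"
    "<item sku='B7'><qty>0</qty></item>"
    "</order>"
)

# Query: select an item by attribute value.
a1 = order.find("./item[@sku='A1']")

# Change: update content in place.
a1.find("qty").text = "3"

# Remove: drop items whose quantity is zero.
for item in order.findall("./item[qty='0']"):
    order.remove(item)

# Add: record what was done on the document itself.
ET.SubElement(order, "status").text = "checked"

# The document can be serialized for inspection at any point.
result = ET.tostring(order, encoding="unicode")
```

At no point is the data copied into a separate object model, so the
serialized result always reflects exactly what the logic has done.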
Another advantage is that it parallels the real-world model
of the process in such a way that the business processes are more
visible to business stakeholders. More readily understood
representations of business processes ensure that future maintenance
and development are simplified because the underlying processes are
easily visible. The logic acts directly upon the document data, and
because there is no mapping (or, at worst, a one-to-one mapping in
the case of a DOM object) of the data into collections of code
structures, the underlying logic is much easier to see.
This approach also supports standards compliance. XML and
XPath are open standards, freely available and widely used, and
operating directly on XML documents keeps the application close to
those standards at every stage of development. It also aids
portability because the application's logic is less dependent on the
software code language. Passing XML documents around also reduces
interoperability problems - if the services we use can accept the XML
documents we produce, all we need to do is to send the document to a
different location.
Working on the document directly also ensures that, at every
stage, we have access to the information. This mirrors the real-life
situation in which a document is passed from person to person. This
ensures that debugging the application will be far easier than trying
to extract information contained in data models in the underlying
code. Each process does something to the document, so by looking at
the documents before and after each step we can see exactly what has
or has not been done to that document.
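One way to picture this is a small helper that serializes the
document after every step. The sketch below is written in Python; the
run_with_trail helper and the two processing steps are hypothetical
names chosen for the example.

```python
import xml.etree.ElementTree as ET


def run_with_trail(doc, steps):
    """Apply each step to the document, keeping a serialized copy after each."""
    trail = [("received", ET.tostring(doc, encoding="unicode"))]
    for step in steps:
        step(doc)
        trail.append((step.__name__, ET.tostring(doc, encoding="unicode")))
    return trail


# Two hypothetical processing steps.
def stamp_checked(doc):
    doc.set("checked", "true")


def add_reference(doc):
    ET.SubElement(doc, "reference").text = "PO-001"


order = ET.fromstring("<order><total>120.50</total></order>")
trail = run_with_trail(order, [stamp_checked, add_reference])
```

Comparing neighbouring entries in the trail shows what each step
changed, without having to inspect any intermediate objects in a
debugger.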
Document-Driven Processing
In terms of code, this approach suggests that at each point
we need to think in terms of the function of the code as it relates
to the manipulation of a document: In other words, what will happen
to the document as it passes through this code? The easiest way to
demonstrate this is by thinking in terms of a real-life purchase
order scenario for a retailer organization that uses external
suppliers for product fulfillment. First, the incoming purchase order
document is checked over to make sure it has the correct
authorization and that the expenditure is within given ranges -
validation. This may be countersigned or stamped to say that it's
been checked before it's passed to someone else to handle the next
stage of the process (i.e., the ordering process). This new person
will fill out an order form for the company selling the product using
the content of the first document to populate the relevant fields in
the new document. This document will then be given to the company
that supplies the goods, and they will in turn return a new document
in the form of a receipt after it's been processed.
When modeling out a document-based application, it makes
business sense to think of these processes being performed by someone
or something because this mirrors the real-world process in a way
that's clear and easy to understand for both business and development
users. Each individual step can be broken down into a series of
entities performing business tasks on the document, and the term most
often used to describe this entity is an "agent."
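The purchase order scenario can be sketched as three agents, each a
Python function that takes a document and returns a document. The
element names, the approval limit, and the agent functions are all
assumptions made for illustration; a real supplier would of course be
an external service rather than a local function.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

APPROVAL_LIMIT = Decimal("1000")  # hypothetical spending limit


def validation_agent(po):
    """Check authorization and spending range, then stamp the order."""
    authorized = po.findtext("authorizedBy") is not None
    total = po.findtext("total")
    within_range = total is not None and Decimal(total) <= APPROVAL_LIMIT
    po.set("checked", "true" if authorized and within_range else "false")
    return po


def ordering_agent(po):
    """Fill out a supplier order form from the checked purchase order."""
    form = ET.Element("supplierOrder")
    ET.SubElement(form, "product").text = po.findtext("product")
    ET.SubElement(form, "quantity").text = po.findtext("quantity")
    return form


def supplier_agent(form):
    """Stand-in for the external supplier, which returns a receipt."""
    receipt = ET.Element("receipt")
    ET.SubElement(receipt, "product").text = form.findtext("product")
    ET.SubElement(receipt, "status").text = "accepted"
    return receipt


purchase_order = ET.fromstring(
    "<purchaseOrder>"
    "<authorizedBy>j.smith</authorizedBy>"
    "<product>widget</product><quantity>10</quantity><total>250.00</total>"
    "</purchaseOrder>"
)

checked = validation_agent(purchase_order)
receipt = None
if checked.get("checked") == "true":
    receipt = supplier_agent(ordering_agent(checked))
```

Each function corresponds to one person in the manual process, which
keeps the code recognizable to someone who knows only the business
procedure.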
Checking the Content
Once the information has reached the agent, it must be
checked to ensure that it can be used appropriately by the agent for
the business task that it provides or delivers. With XML documents,
there are several ways that this content can be checked. The first
(and most obvious) of these is by schema validation. This can ensure
that the incoming document is structurally valid. This means that
incoming data is processed on a best-effort basis, and that the
incoming document can be rejected if it does not contain the expected
data.
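Python's standard library does not include an XML Schema validator,
so the sketch below substitutes a simple list of required paths for a
schema; in practice a schema-aware library would take its place. The
REQUIRED_PATHS list and the check_content function are hypothetical
names used only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: child elements that must be present and non-empty.
REQUIRED_PATHS = ["authorizedBy", "product", "quantity", "total"]


def check_content(text):
    """Return (accepted, problems) for an incoming document."""
    try:
        doc = ET.fromstring(text)
    except ET.ParseError as exc:
        # Not well-formed XML: reject before any further checks.
        return False, [f"not well-formed: {exc}"]
    problems = [f"missing or empty: {path}"
                for path in REQUIRED_PATHS
                if not (doc.findtext(path) or "").strip()]
    return not problems, problems


good = check_content(
    "<purchaseOrder><authorizedBy>j.smith</authorizedBy>"
    "<product>widget</product><quantity>10</quantity>"
    "<total>250.00</total></purchaseOrder>"
)
incomplete = check_content(
    "<purchaseOrder><product>widget</product>"
    "<quantity>10</quantity></purchaseOrder>"
)
malformed = check_content("<purchaseOrder><product>widget</purchaseOrder>")
```

The agent can then decide whether to continue, or to reject the
document and report the list of problems back to the sender.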
If the application reacts to the data in the incoming
document, then the application can be extended iteratively by adding
processing that reacts to the product of the previous results. The
application might initially begin by validating the content in some
way, and returning a message to let the user know if the data is
valid or not. To extend this processing we now only need to react to
this new piece of data. This ensures that each stage of processing is
clearly separated and can be added iteratively, which simplifies
development and maintenance and allows functionality to be extended
easily in the future.
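A minimal Python sketch of this iterative style follows. The first
stage writes its result into the document, and a stage added later
reacts only to that result. The stage functions, element names, and
message text are invented for the example.

```python
import xml.etree.ElementTree as ET


def validate(doc):
    """First iteration: add a <validation> element describing the outcome."""
    valid = doc.find("total") is not None
    ET.SubElement(doc, "validation").text = "valid" if valid else "invalid"


def notify(doc):
    """Added later: reacts only to the <validation> element written above."""
    if doc.findtext("validation") == "invalid":
        ET.SubElement(doc, "message").text = "Order rejected: no total supplied"


def process(text, stages):
    """Parse the incoming text and pass the document through each stage."""
    doc = ET.fromstring(text)
    for stage in stages:
        stage(doc)
    return doc


# Iteration 1 ships with validation only; iteration 2 appends a stage
# without touching the existing one.
first = process("<order/>", [validate])
second = process("<order/>", [validate, notify])
```

The second stage needs no knowledge of how validation was performed,
only of the element that validation leaves behind in the document.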
Document Manipulation
Working directly with documents means that operations must be
defined to make alterations to them. Ideally, these processing
actions will mirror the actions that a human operative would make
when processing a document manually - store it, read some information
from it, write to it, send it somewhere else, transform it into a new
document for another business service to use, and so on. This makes
it easier to see what is happening throughout the process, and
debugging is simplified by the ability to save the document out at
each processing stage to ensure that the processing is performed
correctly. This approach also allows each agent to be tested and
debugged individually, and for the resultant application to be
constructed in a modular and iterative manner promoting reuse and
improving maintainability.
Of course, these maintainability issues also apply to the
incoming data. What happens if the incoming data structures are
altered, perhaps by the need to incorporate a new standard for a
business document into the application? By making use of XML
documents we can ensure that the processing is flexible, easily
extensible, and above all else maintainable. By handling XML
documents directly and acting on their content, converting an
incoming document into the expected structure for processing is as
simple as adding a new rule to transform the incoming data when it
matches the new structure. Without this new rule, the new incoming
document will simply not be processed by the application, rather than
being processed incorrectly, which makes the application more robust
and resilient.
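The idea of adding a rule for a new incoming structure can be
sketched as follows in Python. The two document formats, the RULES
table, and the normalize function are assumptions made for this
example; a production system might express the same rules in XSLT or
a dedicated rules language instead.

```python
import xml.etree.ElementTree as ET


def from_partner_format(doc):
    """Rule for a hypothetical new format: <po buyer="..."><amount/></po>."""
    order = ET.Element("order")
    ET.SubElement(order, "customer").text = doc.get("buyer")
    ET.SubElement(order, "total").text = doc.findtext("amount")
    return order


# Each rule pairs a test on the incoming document with a transformation
# into the structure that the existing processing expects.
RULES = [
    (lambda doc: doc.tag == "order", lambda doc: doc),
    (lambda doc: doc.tag == "po", from_partner_format),  # newly added rule
]


def normalize(text):
    """Return the document in the expected structure, or None if no rule matches."""
    doc = ET.fromstring(text)
    for matches, transform in RULES:
        if matches(doc):
            return transform(doc)
    return None  # unmatched documents are left unprocessed


known = normalize(
    "<order><customer>ACME</customer><total>120.50</total></order>"
)
converted = normalize('<po buyer="ACME"><amount>120.50</amount></po>')
unknown = normalize("<invoice/>")
```

Supporting the new format required one extra entry in the rule table,
and the processing that follows normalize is left unchanged.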
Conclusion
We've looked at using XML documents directly when developing
business logic. This document-oriented development approach is
ideally suited to Web service and service-oriented solutions, and
also provides massive benefits in maintainability. Making use of
documents in software echoes the human, document-based world in which
most business processes exist, and by echoing this we can leverage
the understanding that we already have of these processes. This will
simplify both the construction of new applications and the
maintenance of existing applications constructed using this model.
Business stakeholders understand business documents but often
don't understand object models, UML, relational database models, and
so on. Business and IT development can now converse based on the same
business models, and this will deliver far greater collaborative
benefits and effective communication between them.
As the adoption of XML Web services and service-oriented
architecture solutions accelerates, any edge we can gain when
constructing new applications or altering existing applications can
translate rapidly into commercial gain. Any improvements we can make
to the speed of assembly and construction of new applications, or to
the alteration of existing applications in response to the changing
world around us, contribute to this, and making the most effective
use of data through documents becomes an essential evolutionary step.
By making the information available to both humans and
machines simultaneously, we enable the rapid evolution of business
processes in a world where Darwin's evolutionary principles apply.
Can we really afford to be without the benefits of reacting
dynamically to document content within applications?
Native XML applications will pay long-term dividends as the
use of XML becomes pervasive. If companies are eager to evolve and to
react quickly in response to the changing business world around them,
they must use new approaches that allow this evolution to occur in a
natural, progressive manner that echoes the evolution of the
business world.
Resources
To see an example of this approach visit our Web Services
Developer Centre at www.hyfinity.net and click through "Intelligent Web Services in Minutes."
About Steve Bailey
Steve Bailey is chief e-business architect at hyfinity, an intelligent Web services platform company (www.hyfinity.com). He consults with enterprise customers developing real world business solutions harnessing Web services.