News | December 29, 2000

Batch data exchange using XML

By David Emerson

The ISA SP88 committee has been working to develop methods for the exchange of batch data such as master recipes, batch schedules and batch histories. To date the S88.02 drafts, for lack of a better, widely accepted format, have focused on using relational database technology for the transfer of batch data. The wide acceptance of the eXtensible Markup Language (XML) since its release by the World Wide Web Consortium (W3C) in 1998 now provides an ideal candidate format that did not exist when the SP88 committee developed the S88.02 exchange table format.

This paper examines the use of XML for the exchange of batch related data. Examples are provided to demonstrate XML's ability to handle loosely coupled systems and complex, hierarchical data.

Contents
• Introduction
• S88.02
• XML
• DTDs & Schemas
• Batch control markup language
• Summary

Introduction
When work started on the S88.02 standard in 1995, the SP88 committee evaluated alternative formats and selected relational tables as the format for exchanging batch related data. Since 1995 a great deal has changed in the world of information technology, especially with the emergence of Internet technologies. One of the most talked about new Internet technologies is the eXtensible Markup Language (XML). While still in its infancy, XML is experiencing rapid and wide acceptance, at least as evidenced by press releases and a trickle of products that utilize XML.

As XML enters the mainstream of software tools it should be examined for use in the process control industry. Because XML is strong at handling complex, hierarchical data, its application in the batch control industry appears to be especially appropriate. At this time the S88.02 standard and its international equivalent IEC 61512-2 are in their final drafts, and products based on them have not been widely released. So there is a window of opportunity to use a new technology like XML to implement batch data exchange, building upon the work of S88.02 but without the complexity and difficulties of using storage based technology for exchanging data.

S88.02
The on-going work of the SP88 committee as documented in S88.02 Batch Control, Part 2: Data Structures and Guidelines for Languages, draft 14, dated May 1999 defines a method for exchanging batch control information between computer programs or systems. The method involves the use of relational tables or exchange tables as shown in Figure 1.

The S88.02 draft standard does not propose to define the internals of batch control, or other related systems. It states that only an interface specification is being defined, not the internal requirements for a system using the interface. Therefore the local data stores shown in Figure 1 may have different structures and contents in Tools A and B. The exchange tables represent a common format that can be used to exchange data.

The S88.02 exchange tables support four types of batch data:

    1. Master and control recipe information
    2. Process cell equipment information
    3. Schedule information, and
    4. Production information.

This list represents the most important and frequently handled data for batch control. However, no list can ever be 100% complete, especially over time as new needs develop. So the S88.02 draft also states that the exchange table definitions may be extended to support additional data. The additional data could be vendor specific data such as data addresses, end-user data that supports corporate business rules, or industry specific data such as might be useful in the pharmaceutical industry. The ability to extend the exchange tables is an acknowledgement that no one format can be all encompassing and fluid enough to handle all, or perhaps even most, applications over time. By permitting the expansion of the exchange tables the SP88 committee has enabled them to grow and be adapted as needed.

The selection of relational tables, as used in most common relational database management systems (RDBMS), was made in 1995 after careful examination of existing alternatives. Some of the alternatives considered were text files, ISO 10303 (STEP/Express), and the Standard Generalized Markup Language (SGML). Using text files would require the creation of syntax and processing rules, both of which would add to the effort but not add value to the end user. The STEP standard was not used due to the spin-up time, complexity and cost associated with it and the Express language. Likewise the SGML format would have required expensive tools and time to learn its complexity. In light of the alternatives, the use of relational tables permitted little spin-up time and inexpensive tools (or at least readily available tools in member companies). However, at the time it was recognized that relational tables:

  • while expedient, involve a certain level of complexity,
  • are actually better suited to storing data than exchanging it,
  • suffer from relational database management systems that, despite an ANSI SQL standard, actually differ in the syntax used, and
  • require the bridging of multiple operating systems using third party tools that can represent significant costs.

Despite the drawbacks, the use of relational tables was adopted since it represented the best and quickest path forward at the time. Also in S88.02 draft 14 is a high level data model that specifies the objects, their attributes and their basic relationships that cover the concepts of S88.01. While actually developed concurrently with the exchange tables, the data model can be seen as the top level abstraction of batch control related data, with the exchange tables being one implementation based on it. During development of Part 2 the SP88 committee realized that there may be other implementations based on the data model in the future as new technologies emerge. One example of this is the OPC Batch Custom Interface Specification from the OPC Foundation, which defines a COM-based interface for the exchange of batch and equipment data. Another possible implementation may be done using XML, which would complement the existing methods as shown in Figure 2. In the future there will be multiple transport protocols available for batch data exchange. The capabilities of each will probably differ, so the market will decide which ones become widely used and which fall into disuse.



XML
The Extensible Markup Language was developed by the World Wide Web Consortium (W3C), the same group that maintains HTML. XML is defined by the W3C recommendation REC-xml-19980210, Extensible Markup Language (XML) 1.0. XML is a language that describes data, not its presentation. Since it is an Internet technology and, like HTML, is text based, it is platform independent.

XML was designed for ease of implementation and for interoperability. The design goals included:

  • Straightforward use over the Internet
  • Support a wide variety of tools
  • Easy to write programs to process XML
  • Few, if any, options
  • XML documents should be human readable
  • XML documents should be easy to create

The XML syntax is similar to HTML in that it uses angle brackets (< >) to delimit text based tags. However, XML differs in many important details, the most noticeable of which is that the tags, which mark up what XML calls elements, are user created. An example of this is how a date may be expressed in HTML vs. XML. In HTML a paragraph tag may hold a date:

In this case a program must know how to parse the string in the paragraph tag to determine that it is a date. If this is not done, the program will only be able to treat this as a string. However with XML the user could create a Date tag like:
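The contrast between a generic paragraph tag and a purpose-made Date tag can be sketched in Python using the standard library's xml.etree.ElementTree module. The two fragments, the ISO date format, and the helper name read_date are assumptions made for this illustration and are not taken from any standard.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical HTML-style fragment: the date is just text inside a paragraph tag.
html_fragment = "<p>2000-12-29</p>"

# Hypothetical XML fragment: the element name itself says the text is a date.
xml_fragment = "<Date>2000-12-29</Date>"

html_elem = ET.fromstring(html_fragment)
xml_elem = ET.fromstring(xml_fragment)


def read_date(elem):
    """Return a datetime only when the element is explicitly tagged as a Date."""
    if elem.tag == "Date":
        # Assumed date format for this sketch: year-month-day.
        return datetime.strptime(elem.text, "%Y-%m-%d")
    # A generic tag such as <p> gives no hint about what its text means,
    # so the program can only treat the content as a plain string.
    return None


html_date = read_date(html_elem)
xml_date = read_date(xml_elem)
```

With the paragraph tag the program has no basis for treating the text as anything but a string, while the Date element can be converted and then searched or compared as a date.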

By doing this a program can know that the string is a date and can use the string more intelligently, including searching for dates and reporting date related information. Of course that's the good news. The bad news is that without a standard set of tags it will be difficult for programs to process XML. This is compounded by the fact that XML tags are case sensitive and must match exact spelling. For example the following are all legal tags, but they are also all different tags:
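A short Python sketch can illustrate the case sensitivity described above. The three spellings of a date tag are hypothetical, chosen only to show that a parser treats them as unrelated element names.

```python
import xml.etree.ElementTree as ET

# Three hypothetical spellings of a date tag; each is well-formed XML by itself.
documents = [
    "<Date>2000-12-29</Date>",
    "<date>2000-12-29</date>",
    "<DATE>2000-12-29</DATE>",
]

# The parser reports each tag name exactly as written, so the names differ.
tags = [ET.fromstring(doc).tag for doc in documents]
distinct_tags = set(tags)

# Mixing capitalization between a start tag and its end tag is not allowed:
# the parser rejects the document as not well-formed.
try:
    ET.fromstring("<Date>2000-12-29</date>")
    mismatched_ok = True
except ET.ParseError:
    mismatched_ok = False
```

A program searching for Date elements would therefore miss the other two spellings unless it was told about them.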

So unless everyone uses the same spelling and capitalization it will be difficult to process each other's XML documents. However, the use of Document Type Definitions (DTDs) and schemas, which are described later on, helps overcome this. XML also provides the ability to organize data elements in a hierarchy. This is done by nesting elements as shown here:
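A minimal sketch of such nesting, held in a Python string and read with xml.etree.ElementTree, might look like the following. The element names MasterRecipe, Header and ProductID, the value PRODUCT-001, and the helper depth are assumptions for illustration only and are not defined by S88.02.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a master recipe contains a header,
# which in turn contains a product ID.
recipe_xml = """
<MasterRecipe>
    <Header>
        <ProductID>PRODUCT-001</ProductID>
    </Header>
</MasterRecipe>
"""

root = ET.fromstring(recipe_xml)

# Walk down the hierarchy with a simple path expression.
product_id = root.find("Header/ProductID").text


def depth(elem):
    """Count nesting levels, treating an element with no children as depth 1."""
    return 1 + max((depth(child) for child in elem), default=0)


levels = depth(root)
```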

In this example a master recipe contains a header which in turn contains product ID. XML permits unlimited nesting levels and any number of elements nested directly underneath the same element.

These samples provide a glimpse of how XML can be used. While there are many subtleties and advanced features of XML, the concept of customized tags that describe data arranged in a hierarchy is one of the language's core strengths.

The W3C has issued other XML related recommendations and has more under development. These recommendations will add functionality to XML implementations. Of these, the XSL Transformations (XSLT) recommendation and the Extensible Stylesheet Language (XSL) working draft facilitate the mapping and conversion, or transformation, of one XML document into another form. This should prove useful for displaying XML on web pages as well as transforming XML between different companies' formats.

DTDs & Schemas
The existing XML 1.0 recommendation and the December 1999 working drafts of the XML Schema recommendation define DTDs (Document Type Definitions) and schemas respectively. DTDs and schemas are tools that address the issue of precise tag spelling and capitalization raised earlier in this paper, as well as many issues not directly addressed in this paper. While the XML 1.0 recommendation defines XML DTDs and these are widely used, it has also been recognized that DTDs are lacking in some areas. Therefore the working draft for XML schemas has been developed and after some use and refinement is expected to become a W3C recommendation. For the purposes of this paper the term schemas has been used since it is assumed that they will gradually supplant the current use of DTDs and enable more powerful XML applications.

A schema can be thought of as a definition of an XML based vocabulary. Using a schema a set of tags, including their precise spelling, can be defined, as well as their data types and the hierarchy, grouping, order and other organizational features of XML documents in which the tags are used. Together this can represent a customized language for an industry, company, or specific application. In fact there are currently efforts underway to create vocabularies for a number of industries and applications, including the financial industry, health and manufacturing industries.

Figure 3 shows how a schema can be used with an XML document. The schema and XML document are separate files, with the XML document referencing the schema. A software application that can read an XML document is referred to as an XML processor. This application could use the XML for any number of reasons, including displaying it in a web browser or as input to a stand-alone program. The XML processor uses another piece of software called an XML parser, which reads the XML document, detects the reference to a schema, accesses the schema and verifies that the XML document complies with the schema's requirements. If the XML document complies, it is said to be valid. Once the parser validates the XML document, the information in the document is passed to another part of the XML processor application where it is used.
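The parse, validate and hand-off sequence described above can be sketched in Python. The XML modules in Python's standard library do not validate against XML Schema, so this sketch substitutes a hand-written vocabulary table for a real schema and checks only tag names and nesting. The names VOCABULARY, is_valid and process, and the element names, are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary standing in for a schema: each allowed element
# name maps to the set of child element names it may contain.
VOCABULARY = {
    "MasterRecipe": {"Header"},
    "Header": {"ProductID"},
    "ProductID": set(),
}


def is_valid(elem, vocabulary):
    """Check tag spelling and hierarchy against the vocabulary (simplified)."""
    if elem.tag not in vocabulary:
        return False
    allowed_children = vocabulary[elem.tag]
    return all(
        child.tag in allowed_children and is_valid(child, vocabulary)
        for child in elem
    )


def process(xml_text, vocabulary):
    """Parse, validate, then pass leaf data on to the application.

    Returns a dict of leaf element names to text, or None when the
    document does not comply with the vocabulary.
    """
    root = ET.fromstring(xml_text)
    if not is_valid(root, vocabulary):
        return None
    return {e.tag: (e.text or "").strip() for e in root.iter() if len(e) == 0}


good = "<MasterRecipe><Header><ProductID>P1</ProductID></Header></MasterRecipe>"
bad = "<MasterRecipe><Header><productid>P1</productid></Header></MasterRecipe>"

good_result = process(good, VOCABULARY)
bad_result = process(bad, VOCABULARY)
```

The second document is well-formed but uses a different capitalization for the product ID tag, so it is rejected before any data reaches the application. A real schema would also constrain data types, order and occurrence counts, which this sketch leaves out.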

The use of schemas permits XML to be extended by enabling the creation of other markup languages, each of which can be called a vocabulary.

Batch control markup language
A set of XML schemas based on the S88.02 data model and exchange tables could be created for using XML to exchange batch control related data. A logical set of schemas would be:

  • master and control recipes,
  • batch schedules,
  • equipment definitions, and
  • production information.

This grouping would match the data models and exchange table organization. Collectively this group of schemas could be called a Batch Control Markup Language (BatchML or BCML for short) and used as the basis for exchanging batch control related data in a variety of environments and applications.

Figure 4 shows how an industry standard schema could serve as the basis for many related schemas, each of which will differ, hopefully only in minor aspects.

In this example a batch processing company may develop a corporate standard XML schema for master recipes. This corporate standard may use some of the company's in-house terminology, perhaps in the master recipe, schedule or production information corporate schemas. By doing this a batch processing company would give suppliers and vendors that do business with it the ability to view its data elements and hierarchies in order to provide a mapping to the supplier's or vendor's XML schema. Once a batch processing company's XML schemas are established, products from different vendors could be interfaced with the corporate standard, enabling the movement of batch processing data between systems and to external suppliers and customers as needed. Each vendor's XML schema may be customized for its product's special features.

While the use of many similar, but different XML schemas may appear confusing, the use of XSL stylesheets can make conversion between similar schemas relatively straightforward. It is possible that a future set of mapping and transformation tools, based on XSLT and XSL technology, may permit the automation of the conversion process. This would surely make the integration of different applications within the plant floor and between the plant floor and business systems less costly.
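The kind of conversion between similar schemas described above can be sketched in Python as a simple tag-renaming pass. This is a stand-in for what an XSLT stylesheet would express, not XSLT itself, since Python's standard library has no XSLT engine. The vendor and corporate tag names, TAG_MAP and transform are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one vendor's tag names to a corporate schema's names.
TAG_MAP = {
    "Recipe": "MasterRecipe",
    "Hdr": "Header",
    "ProdID": "ProductID",
}


def transform(elem, tag_map):
    """Copy an element tree, renaming tags found in tag_map and keeping the rest."""
    new_elem = ET.Element(tag_map.get(elem.tag, elem.tag), dict(elem.attrib))
    new_elem.text = elem.text
    new_elem.tail = elem.tail
    for child in elem:
        new_elem.append(transform(child, tag_map))
    return new_elem


vendor_xml = "<Recipe><Hdr><ProdID>PRODUCT-001</ProdID></Hdr></Recipe>"

corporate_root = transform(ET.fromstring(vendor_xml), TAG_MAP)
corporate_xml = ET.tostring(corporate_root, encoding="unicode")
```

Real schemas will usually differ in more than names, for example in nesting or units, which is where a full transformation language becomes necessary.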

Summary
The work being done by the ISA SP88 committee provides a solid basis for the development of different interfaces for the exchange of batch control data. The emergence and wide acceptance of XML in the information technology industry represents a valuable and well suited technology that can be used with the work of the SP88 committee. As the OPC Foundation has done in the development of its Batch Custom Interface Specification, a new group could also use XML to create a Batch Markup Language. A Batch Markup Language coupled with new and existing tools and technology could be used to lower the cost of integrating batch control related, operational and business applications in the batch processing industry.

References
ANSI/ISA-S88.01-1995, Batch control- Part 1: Models and Terminology
IEC 61512-1:1997, Batch control- Part 1: Models and Terminology
ISA-dS88.02-1999, Batch control- Part 2: Data Structures and Guidelines for Languages, draft 15, May 1999
REC-xml-19980210, Extensible Markup Language, published by the World Wide Web Consortium, http://www.w3.org/TR/1998/REC-xml-19980210
WD-xmlschema-1-19991217, XML Schema Part 1: Structures Working Draft, published by the World Wide Web Consortium, http://www.w3c.org/TR/1999/WD-xmlschema-1-19991217
WD-xmlschema-2-19991217, XML Schema Part 2: Datatypes Working Draft, published by the World Wide Web Consortium, http://www.w3c.org/TR/1999/WD-xmlschema-2-19991217
REC-xslt-19991116, XSL Transformations (XSLT) Version 1.0, published by the World Wide Web Consortium, http://www.w3.org/TR/xslt
WD-xsl-20000327, Extensible Stylesheet Language (XSL) Version 1.0, Working Draft, 27 March 2000, published by the World Wide Web Consortium, http://www.w3.org/TR/xsl/
XML, A New Web Technology for Structured Data Exchange, by David Emerson, presented at IMS Expo 1999

About the author
David Emerson is a Senior System Architect with Yokogawa Corporation of America, 2155 Chenault Drive, Suite 401, Carrollton, TX 75006. Tel: 972-417-2753; Fax: 972-416-3966; e-mail: emerson@dallas.yca.com.



This article is provided courtesy of the ISA - the Instrumentation, Systems, and Automation Society.