The Next Step Beyond XML

XML, the Extensible Markup Language, has been one of the most quickly adopted technologies of the past 20 years. A cornerstone of web services and service-oriented architecture (SOA), it has quickly replaced structured ASCII files, such as comma-delimited and fixed-length field files, for data transfer. In fact, its successful use in SOA has many companies rethinking the roles played by CORBA (the Common Object Request Broker Architecture) and EDI (electronic data interchange) as preferred methods for enterprise-level remote procedure calls.

XML's appeal derives from its simplicity. It is a flexible, standards-based, descriptive, text-based language for communicating data.

It is flexible in that it can communicate nearly any type of data. Even binary data can be MIME encoded and included in an XML text element, permitting nearly any type of data to be contained within an XML document.

It is standards-based in that the World Wide Web Consortium (W3C, w3.org) has defined exactly what is considered a well-formed XML document, as well as those additional characteristics that constitute a valid XML document. It is because of these standards that it is relatively easy to write a parser that can extract the information from XML.

The descriptive nature of XML results from two characteristics. First, an XML document can include, or be supplemented by, a DTD (document type definition) or an XML schema. While DTDs were common several years ago, they look more like a deprecated technology today compared with the several accepted forms of XML schema, which, unlike DTDs, are themselves valid XML.

The second aspect of XML's descriptive character is apparent to anyone who has viewed a formatted XML document. With the exception of MIME content, most data in an XML document is in a readily human-readable form. Even a relatively new XML developer can understand, and change, the contents of an XML document using a tool as simple as Windows Notepad.

Beyond XML: In-Memory Datasets

There is, however, a technology that goes beyond XML, even while embracing it. And it is this technology that I feel gives developers greater capability. I am talking about in-memory datasets, such as Delphi's ClientDataSet and the .NET framework's DataSet.

Both of these technologies can express their in-memory data in an XML format. And although the XML produced by ClientDataSets and .NET DataSets is not directly compatible, it shares many features that take XML to the next level.

There are, in fact, four characteristics of these two in-memory dataset classes that make them more useful than simple XML alone. These are:

  • Support for complex meta-data
  • A manageable change log
  • Support for state persistence
  • Ability to bind directly to controls and components

I discuss each of these features in more detail in the following sections.

Meta-data Support

Meta-data is, technically speaking, data about data. In the terminology of database developers, meta-data describes data type, data size and precision, data names, and in some cases constraints, default values, expressions and calculated fields, aggregate fields, and so forth.

In-memory datasets support more than twenty specific data types. These derive from the need to support data that may be loaded from an underlying database, such as SQL Server, Oracle, InterBase, Firebird, MySQL, Advantage Database Server, or nearly any other type of database. In the .NET framework, DataSets support .NET data types, but are influenced by the type of IDbDataAdapter used to fill the dataset. ClientDataSets support nearly all of the TField types supported in the VCL (Visual Component Library).

As a result, in-memory datasets specifically distinguish between short integers and long integers, and between floating point numbers and BCD (binary-coded decimal) values. In addition, in-memory datasets provide support for fixed-length strings, as well as memo fields and binary large object (blob) fields. Date, timestamp, image, and Boolean fields are just a few of the additional data types you find supported by in-memory datasets.

By comparison, XML does very little to distinguish between different data types. For example, text fields are often identified as PCDATA (parsed character data). However, these fields have no native size limitations. In other words, a PCDATA field may be any length, as opposed to an in-memory dataset's support for a character field that can contain up to, but not more than, say 30 characters.
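To make the contrast concrete, here is a minimal sketch of how a ClientDataSet's meta-data might be declared in code. The field names are hypothetical, the BCD precision is an arbitrary choice for illustration, and the uses clause assumes a Delphi version that ships the MidasLib unit (recent versions may need scoped unit names such as Data.DB and Datasnap.DBClient).

```pascal
program MetaDataSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, DB, DBClient, MidasLib; // MidasLib links MIDAS statically (assumed available)

var
  CDS: TClientDataSet;
  Def: TFieldDef;
  I: Integer;
begin
  CDS := TClientDataSet.Create(nil);
  try
    // Hypothetical fields: each definition carries a name, a type and,
    // where it applies, a size.
    CDS.FieldDefs.Add('Quantity', ftSmallint);   // short integer
    CDS.FieldDefs.Add('CustomerID', ftInteger);  // long integer
    CDS.FieldDefs.Add('Discount', ftFloat);      // floating point
    CDS.FieldDefs.Add('Name', ftString, 30);     // string declared with size 30
    CDS.FieldDefs.Add('Joined', ftDate);
    CDS.FieldDefs.Add('Active', ftBoolean);
    CDS.FieldDefs.Add('Notes', ftMemo);

    // BCD field: Precision is the total number of digits,
    // Size is the number of digits after the decimal point.
    Def := CDS.FieldDefs.AddFieldDef;
    Def.Name := 'Balance';
    Def.DataType := ftBCD;
    Def.Precision := 12;
    Def.Size := 2;

    // Build the in-memory table from the definitions above.
    CDS.CreateDataSet;

    // List the field objects the dataset created from the meta-data.
    for I := 0 to CDS.FieldCount - 1 do
      Writeln(CDS.Fields[I].FieldName, ': ', CDS.Fields[I].ClassName,
        ' (size ', CDS.Fields[I].Size, ')');
  finally
    CDS.Free;
  end;
end.
```

Each definition yields a distinct TField descendant, which is the kind of type and size information that a bare PCDATA element does not carry.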

A Managed Change Log

While XML itself merely describes data, an in-memory dataset can keep track of the changes made to its data. These change logs are crucial to an in-memory dataset's ability to resolve its changes back to an underlying database server. In short, an in-memory dataset knows which records have been inserted, which have been deleted, and which have been modified. Furthermore, with respect to the modified records, an in-memory dataset knows both the original and current value of every field in a changed record.

XML, on the other hand, has no provision for this type of information. Of course, it is possible for you to programmatically track changes made to data, and write this information to your XML file, but a change log is not one of the standardized aspects of XML documents.

But in-memory datasets do not merely track changes to data. They let you manage those changes. This management includes the ability to determine precisely what changes have occurred (which records were inserted, which were deleted, and which fields were modified), revert individual changes to their prior state, cancel all changes, or commit those changes permanently, thereby erasing the change log.

With ClientDataSets, this change log is held in the Delta property. For .NET DataTables, you use the RowStateFilter of a DataView to access the change log. (DataSets are containers for one or more DataTables, though you can use DataTables without DataSets.)

To manage the change log for a ClientDataSet, you use its methods, such as RevertRecord, UndoLastChange, CancelChanges, and ApplyUpdates. In addition, you can use the UpdateStatus method and the StatusFilter and Fields properties to examine the change log contents.

With .NET DataTables, you use the methods of the DataTable and DataView classes to control the change log, including AcceptChanges and RejectChanges. To examine the change log, you use the RowStateFilter and Rows properties.
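The ClientDataSet side of this can be sketched roughly as follows. This is only an illustration under stated assumptions: the two-field table and its values are hypothetical, and MergeChangeLog is used simply to start from an empty change log.

```pascal
program ChangeLogSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Variants, DB, DBClient, MidasLib; // MidasLib assumed available

var
  CDS: TClientDataSet;
begin
  CDS := TClientDataSet.Create(nil);
  try
    // Hypothetical two-column table.
    CDS.FieldDefs.Add('ID', ftInteger);
    CDS.FieldDefs.Add('Name', ftString, 30);
    CDS.CreateDataSet;

    // Baseline rows, merged into the data so the change log starts empty.
    CDS.AppendRecord([1, 'Alpha']);
    CDS.AppendRecord([2, 'Beta']);
    CDS.MergeChangeLog;

    // One modification and one insertion; both go into the change log.
    CDS.First;
    CDS.Edit;
    CDS.FieldByName('Name').AsString := 'Alpha (edited)';
    CDS.Post;
    CDS.AppendRecord([3, 'Gamma']);
    Writeln('Pending changes: ', CDS.ChangeCount);

    // Inspect the first record: its status plus original and current values.
    CDS.First;
    if CDS.UpdateStatus = usModified then
      Writeln('Modified: "', VarToStr(CDS.FieldByName('Name').OldValue),
        '" -> "', CDS.FieldByName('Name').AsString, '"');

    // Undo only the most recent change (the inserted record).
    CDS.UndoLastChange(True);
    Writeln('After UndoLastChange: ', CDS.ChangeCount);

    // Discard everything that is still in the change log.
    CDS.CancelChanges;
    Writeln('After CancelChanges: ', CDS.ChangeCount);
  finally
    CDS.Free;
  end;
end.
```

ApplyUpdates is left out here because it needs a provider and an underlying database to resolve the changes to.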

Persisting In-memory Data

Of all the features supported by in-memory datasets, the ability to persist their state is arguably the most powerful. Specifically, it is possible to save the current state of an in-memory dataset to a file, web service, or memo field of a database, and then to restore that dataset at a later time to its exact prior state.

Some developers might argue that saving an XML document to a file, web service, or memo field of a database saves its state just as well as an in-memory dataset does. And that would be true if it were not for the in-memory dataset's support for a change log.

When you save an in-memory dataset, you can also save its change log. Restoring an in-memory dataset whose change log has been saved restores both the data and the change log. There is simply no difference between an in-memory dataset before it was persisted and after it has been restored.

With a ClientDataSet, you save its state by calling its SaveToFile or SaveToStream methods. Similarly, you can save its state by reading the ClientDataSet's XMLData property and storing that text somewhere (such as a text file, a memo field of a database table, or a web service that stores the information). The following line of code shows how to persist a ClientDataSet to a memo field of a database.

ClientDataSet2.FieldByName('Hold Data').AsString :=
  ClientDataSet1.XMLData;

You restore a previously saved ClientDataSet by calling its LoadFromFile or LoadFromStream methods, loading a file or stream that you previously saved. Alternatively, you can assign to a ClientDataSet's XMLData property a string that was previously read from that property. For example, the following line of code demonstrates how you could restore a previously persisted ClientDataSet from a memo field of a database:

ClientDataSet1.XMLData :=
  ClientDataSet2.FieldByName('Hold Data').AsString;
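The file-based route can be sketched in the same way. The example below is a minimal illustration, assuming a writable working directory; the file name and the sample data are hypothetical. It saves a ClientDataSet that has one pending change, loads it into a second ClientDataSet, and then cancels the restored change to show that the log travelled with the data.

```pascal
program PersistSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, DB, DBClient, MidasLib; // MidasLib assumed available

const
  StateFile = 'cds_state.xml'; // hypothetical file name

var
  Source, Restored: TClientDataSet;
begin
  Source := TClientDataSet.Create(nil);
  Restored := TClientDataSet.Create(nil);
  try
    // Hypothetical table with one baseline row and an empty change log.
    Source.FieldDefs.Add('ID', ftInteger);
    Source.FieldDefs.Add('Name', ftString, 30);
    Source.CreateDataSet;
    Source.AppendRecord([1, 'Alpha']);
    Source.MergeChangeLog;

    // Make one change that stays pending in the change log.
    Source.Edit;
    Source.FieldByName('Name').AsString := 'Alpha (edited)';
    Source.Post;

    // Persist data, meta-data and change log as XML, then reload elsewhere.
    Source.SaveToFile(StateFile, dfXML);
    Restored.LoadFromFile(StateFile);

    Writeln('Source change count:   ', Source.ChangeCount);
    Writeln('Restored change count: ', Restored.ChangeCount);

    // The restored log is still manageable: cancel the pending edit.
    Restored.CancelChanges;
    Restored.First;
    Writeln('Name after CancelChanges: ', Restored.FieldByName('Name').AsString);
  finally
    Restored.Free;
    Source.Free;
  end;
end.
```

Passing dfXML asks for the XML packet format; omitting the second argument saves in the ClientDataSet's binary format instead.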

You save the contents of a .NET DataSet using the WriteXml method. If you are writing to a file, you can pass the filename as the first parameter of the WriteXml method, and the type of XML write mode in the second parameter.

There are two write mode enumeration values that are typically used by developers writing out the contents of a DataSet. These are XmlWriteMode.WriteSchema and XmlWriteMode.DiffGram.

When you call WriteXml with the WriteSchema enumeration, DataSet metadata is written to the XML file in the form of a schema definition. This information is required in order for a DataSet loading the saved XML to accurately reconstruct the metadata of the DataSet.

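Unlike WriteSchema, XmlWriteMode.DiffGram does not write schema information into the XML file. Instead, it writes the DataSet as a DiffGram, which holds both the current and the original values of each row, in other words, the change log. This makes the enumeration essential if you want to persist the DataSet's state, but the DataSet that later reads the DiffGram must already have a matching schema (for example, one saved separately with WriteXmlSchema and loaded with ReadXmlSchema). Recall that the change log is crucial if you want to be able to continue managing it, or write the changes back to an underlying database.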

The following example shows a DataSet and its change log being written to an XML file.

DataSet1.WriteXml(
  'c:\savedat.xml', XmlWriteMode.DiffGram);

Data Binding

While the three previously described features of in-memory datasets are what set them apart from simple XML documents, there is one more feature that deserves mention. You can bind controls and components directly to in-memory datasets, making their contents immediately accessible in a client application, or visible in a web application. While you can certainly use XML data in client and web applications, you typically must provide some form of parsing or transformation, and must manually track any changes made to this data, if you wish to resolve those changes to some persistent store (such as a database).

There is another, related issue here. Specifically, in-memory datasets hold data in a relational format, unlike the hierarchical format supported by raw XML. While you might occasionally want to work with your data in hierarchical form, most user interfaces are based on a relational view of data.

Summary

XML is a standardized language for describing data. But XML lacks much of the richness that developers need to build sophisticated applications. Fortunately, both Delphi and .NET developers have access to in-memory datasets, which combine the power of XML's descriptive nature with a level of data awareness that outperforms XML alone.

There is another way to look at this. In-memory datasets are the containers into which XML can be placed to provide advanced data-related operations. In addition, they are the containers from which XML can be retrieved in order to persist the state of the dataset. While the XML that these in-memory datasets produce is well-formed, and is in some cases valid XML, it also contains the richness added by in-memory datasets to support their superior features, such as a change log.

In service-oriented architecture (SOA) applications, where XML is often used as the means of communicating messages between processes, the use of in-memory dataset-generated XML provides you with a ready source of additional computing power. When in-memory datasets are passed between the processes of an SOA application as XML, the message received is richer in content than it would be if plain XML were used.
