CXS Specification 1.2 (Compact XML Serialization)

CXS stands for Compact XML Serialization.

CXS is an XML application for data serialization, developed by zehnet, that makes it possible to transfer complex data structures between programming environments.

The process of creating an XML representation of application data is called serialization.
The process of instantiating data from an XML representation is called unserialization.
CXS adapts PHP’s serialization structure to XML and extends this structure with some data types known from WDDX.

CXS aims to offer a representation of data structures that is compact (i.e. few bytes, little overhead) and efficient to parse in different programming environments.
The original idea for CXS was a serialization scheme for Flash (client) – PHP (server) communication that can be handled easily in both environments and keeps traffic as low as possible.

CXS is offered under the GNU Lesser General Public License and can be used freely.

Syntax

General characteristics of the syntax:

  • Every XML element name or XML attribute name has to be a single ASCII character (one byte).
  • Allowed ASCII characters are the letters a-z (ASCII: 97-122). That makes 26 possible names.
  • XML element names and XML attribute names are NOT case-sensitive.
    A name therefore has to satisfy the regular expression /^[a-z]$/i
  • An XML element is allowed to contain either children or character data, not both.
  • Child elements of the serialized data types hash and object have to be parsed two at a time, as key-value pairs.
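
The rules above lend themselves to a mechanical check. The following is a minimal Python sketch, not part of CXS itself; the function names `is_valid_cxs_name` and `check_element` are hypothetical. It is meant for CXS packets only, since the `cxs` envelope element described later is longer than one character.

```python
import re
import xml.etree.ElementTree as ET

# One ASCII letter, matched case-insensitively. re.ASCII keeps IGNORECASE from
# also matching non-ASCII look-alikes of ASCII letters.
NAME_RE = re.compile(r"[a-z]", re.IGNORECASE | re.ASCII)


def is_valid_cxs_name(name):
    """Return True if name is exactly one letter a-z, in either case."""
    return NAME_RE.fullmatch(name) is not None


def check_element(elem):
    """Recursively check names and the 'children or character data' rule."""
    names = [elem.tag, *elem.attrib]
    if not all(is_valid_cxs_name(name) for name in names):
        return False
    # Character data of an element is its own text plus the tails of its children.
    texts = [elem.text] + [child.tail for child in elem]
    has_text = any(text and text.strip() for text in texts)
    if len(elem) > 0 and has_text:
        return False
    return all(check_element(child) for child in elem)
```

Whitespace between child elements is ignored here, so indented packets like the ones in the examples pass the check.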

Data Types

CXS supports the following data types:

  • scalar types: string, integer, double (aka. float), boolean, date-time
  • compound types: array (numerical), hash (associative array), object
  • special types: null
  • resource types: binarystream

Due to the syntax rules described above, every data type is associated with a single ASCII character.
The following is a list of each data type, its associated character and the resulting XML representation in CXS.

type          CXS abbreviation   XML representation
scalar types:
(string)      s                  <s>[value]?</s>
(boolean)     b                  <b>[value]</b>
(integer)     i                  <i>[value]</i>
(double)      d                  <d>[value]</d>
(date-time)   t                  <t>[value]</t>
compound types:
(array)       a                  <a>[value]*</a>
(hash)        h                  <h>[key-value-pair]*</h>
(object)      o                  <o n="[name]"?>[key-value-pair]*</o>
special type:
(null)        n                  <n/>
resource types:
(binary)      c                  <c>[value]</c>

Data Types and Serialization (CXS Packets)

boolean

A boolean expresses a truth value. It can be either true or false and has to be serialized as 1 or 0.

<b>1</b> // for true
<b>0</b> // for false

string

Strings can be of arbitrary length and have to be UTF-8 encoded.

<s>hello world</s>

null

Null specifies that a variable has no value or is undefined. Languages that do not have the concept of a null value should deserialize nulls as empty strings.

<n/>

date-time

Date-time values have to be serialized according to ISO 8601 in the following format:
YYYY-MM-DDTHH:MM:SS+HH:MM
( YYYY = year, MM = month, DD = day, T = separator of date and time, HH:MM:SS = hour, minute, second )
The trailing HH:MM is the timezone offset from UTC, which has to be prefixed by + (positive) or - (negative) to indicate the direction of the deviation.
A leading 0 is not obligatory for month, day, hour, minute or second values.

<t>2000-02-15T09:30:25+01:00</t>
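
Because leading zeros are optional, a strict ISO 8601 parser may reject some valid CXS date-times. The sketch below shows one way to read the format in Python with a regular expression; `parse_cxs_datetime` and `format_cxs_datetime` are hypothetical names, and the code assumes the offset minutes are always two digits.

```python
import re
from datetime import datetime, timedelta, timezone

# One or two digits are accepted wherever the leading 0 is optional.
CXS_DATETIME_RE = re.compile(
    r"(\d{4})-(\d{1,2})-(\d{1,2})T(\d{1,2}):(\d{1,2}):(\d{1,2})"
    r"([+-])(\d{1,2}):(\d{2})",
    re.ASCII,
)


def parse_cxs_datetime(text):
    """Parse the content of a <t> element into a timezone-aware datetime."""
    match = CXS_DATETIME_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a CXS date-time: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    sign = 1 if match.group(7) == "+" else -1
    offset = sign * timedelta(hours=int(match.group(8)), minutes=int(match.group(9)))
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone(offset))


def format_cxs_datetime(value):
    """Serialize a timezone-aware datetime, always writing leading zeros."""
    if value.utcoffset() is None:
        raise ValueError("a timezone offset is required")
    return value.replace(microsecond=0).isoformat()
```

Writing leading zeros on output while tolerating their absence on input keeps the serialized form readable by stricter ISO 8601 parsers.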

array

An array is a numerically (integer) indexed list of data elements, usually with a starting index value of 0.
Arrays will be serialized as one parent array XML element containing the list of data elements as CXS serialized children.

Example 1: The following array

$test = array( 'value1', 'value2', 3 );

will be serialized to the CXS packet

<a>
   <s>value1</s>
   <s>value2</s>
   <i>3</i>
</a>

Some programming languages require every list element to be of the same data type.
If an array satisfies this restriction, the flag attribute t="[type]" can optionally be added to the CXS representation of the array. The value of the attribute t has to be the CXS type indicator (s for strings, i for integers, …).

Example 2: As the following array consists of elements which are all of the same type (integer), the attribute t="i" can be added.

$test = array( 1, 2, 3 );

can be serialized to the CXS packet

<a t="i">
   <i>1</i>
   <i>2</i>
   <i>3</i>
</a>
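
As an illustration of Examples 1 and 2, the following Python sketch serializes a flat list and adds the t attribute only when every element has the same CXS type. The names `scalar_to_element`, `serialize_array` and `flag_type` are hypothetical, only scalar elements and null are handled, and using `repr` for doubles is an assumption, since the number format is not defined here.

```python
import xml.etree.ElementTree as ET


def scalar_to_element(value):
    """Map a Python scalar to its CXS element."""
    if value is None:
        return ET.Element("n")
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        tag, text = "b", "1" if value else "0"
    elif isinstance(value, int):
        tag, text = "i", str(value)
    elif isinstance(value, float):
        tag, text = "d", repr(value)
    elif isinstance(value, str):
        tag, text = "s", value
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    elem = ET.Element(tag)
    elem.text = text
    return elem


def serialize_array(values, flag_type=True):
    """Serialize a list to an <a> packet, flagging homogeneous arrays with t."""
    array = ET.Element("a")
    children = [scalar_to_element(value) for value in values]
    tags = {child.tag for child in children}
    if flag_type and len(tags) == 1:
        array.set("t", tags.pop())
    array.extend(children)
    return ET.tostring(array, encoding="unicode")
```

The output is written without indentation, which fits the goal of keeping packets compact.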

hash

A hash, also known as an associative array, is a string-indexed list of data elements.
Hashes are serialized as key-value pairs. Each hash element results in one key element and one value element.

Example 3: The following hash (associative array)

$test = array( 'var1' => 'value1', 'var2' => 'value2' );

will be serialized to the CXS packet

<h>
   <s>var1</s>
   <s>value1</s>
   <s>var2</s>
   <s>value2</s>
</h>

As some programming languages (e.g. PHP) do not strictly distinguish between array and hash, arrays can also be serialized using the CXS hash serialization scheme. CXS parsers could offer an option to define whether a strict distinction should be made or not.

Example 4: The array from Example 1

$test = array( 'value1', 'value2', 3 );

can also be serialized, using the hash scheme, to the CXS packet

<h>
   <i>0</i>
   <s>value1</s>
   <i>1</i>
   <s>value2</s>
   <i>2</i>
   <i>3</i>
</h>
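
Reading a hash means walking its children two at a time, with the first element of each pair as the key and the second as the value. The Python sketch below does this for the packets of Examples 3 and 4; `parse_scalar` and `parse_hash` are hypothetical names, and nested compound values are left out to keep it short.

```python
import xml.etree.ElementTree as ET


def parse_scalar(elem):
    """Convert a scalar CXS element to a Python value (tag is case-insensitive)."""
    tag = elem.tag.lower()
    text = elem.text or ""
    if tag == "s":
        return text
    if tag == "i":
        return int(text)
    if tag == "d":
        return float(text)
    if tag == "b":
        return text.strip() == "1"
    if tag == "n":
        return None
    raise ValueError(f"unsupported element <{elem.tag}>")


def parse_hash(elem):
    """Read the children of an <h> element as consecutive key-value pairs."""
    children = list(elem)
    if len(children) % 2 != 0:
        raise ValueError("a hash needs an even number of child elements")
    result = {}
    # Even positions hold keys, odd positions hold the matching values.
    for key_elem, value_elem in zip(children[0::2], children[1::2]):
        result[parse_scalar(key_elem)] = parse_scalar(value_elem)
    return result
```

For the packet of Example 4 this yields integer keys 0, 1 and 2, so a parser that does not distinguish strictly between hash and array could turn the result back into a plain list.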

object

Objects are serialized like hashes (associative arrays), except that the class name is stored in the attribute n (name).
If a CXS parser deserializes an object that provides the name attribute, it should look for a class with this name and, if found, instantiate the named object with the deserialized values. Anonymous / standard objects are serialized without this attribute.

Example 5: The following PHP object $myClass

class test {
   public $var1 = 'value1';
   public $var2;
   public function __construct() {
      $this->var2 = 'value2';
   }
}
$myClass = new test();

will be serialized to the CXS packet

<o>
   <s>var1</s>
   <s>value1</s>
   <s>var2</s>
   <s>value2</s>
</o>

If the name attribute is provided, deserializing results in a new instance of test.

<o n="test">
   <s>var1</s>
   <s>value1</s>
   <s>var2</s>
   <s>value2</s>
</o>
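
The class lookup described above could be sketched in Python as follows. The registry dictionary, the `Test` class and the function name `parse_object` are hypothetical, and only string values are handled. Objects without a name, or with a name that is not registered, fall back to a generic container.

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace


class Test:
    """Stand-in for the class named in the n attribute."""


# Hypothetical registry mapping CXS class names to local classes.
CLASS_REGISTRY = {"test": Test}


def parse_object(elem, registry):
    """Deserialize an <o> element whose keys and values are all strings."""
    children = list(elem)
    if len(children) % 2 != 0:
        raise ValueError("an object needs an even number of child elements")
    pairs = zip(children[0::2], children[1::2])
    fields = {key.text or "": value.text or "" for key, value in pairs}
    # Attribute names are case-insensitive, so accept n and N.
    class_name = elem.get("n") or elem.get("N")
    cls = registry.get(class_name)
    if cls is None:
        return SimpleNamespace(**fields)  # anonymous / standard object
    obj = cls.__new__(cls)  # create the instance without running __init__
    obj.__dict__.update(fields)
    return obj
```

Skipping the constructor is a design choice made here so that the deserialized values are not overwritten by defaults; a parser could equally call the constructor first and then assign the fields.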

binary

The binary data type represents strings of binary data. Binary data has to be base64 encoded, which ensures that the content can be embedded in UTF-8 encoded XML.

Example 6: The PHP file resource

$file_resource = fopen('myFile', "r");
$CXS->serialize($file_resource);

will be serialized, for example, to the CXS packet

<c>Tm9ydG9uIEFudGlWaXJ1cyBoYXQgZm9sZ2VuZGV</c>
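
A minimal Python sketch of the binary type, using the standard base64 module; the function names `serialize_binary` and `deserialize_binary` are hypothetical. The base64 alphabet contains none of the characters <, > or &, so the encoded text needs no XML escaping.

```python
import base64


def serialize_binary(data):
    """Wrap raw bytes in a <c> element as base64 text."""
    return "<c>" + base64.b64encode(data).decode("ascii") + "</c>"


def deserialize_binary(text):
    """Decode the character data of a <c> element back to bytes."""
    return base64.b64decode(text.strip(), validate=True)
```

For instance, `serialize_binary(b"hello")` gives `<c>aGVsbG8=</c>`, and `deserialize_binary("aGVsbG8=")` returns the original bytes.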

CXS Envelope

The CXS envelope surrounds CXS serialized data and indicates that all child nodes are CXS serialized.
If CXS serialized data is part of an XML document containing other data besides CXS, a parser should search for a CXS envelope and start parsing its children.
The CXS envelope is an XML element with the name cxs and one attribute, v, indicating the CXS version used for serialization. This attribute is obligatory.

<cxs v="1.1">
  <!-- CXS serialized data as child -->
</cxs>

If the whole XML document is CXS, you can omit the envelope and simply use a single CXS packet instead.
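
The search for an envelope could look like the following Python sketch. The function name `find_cxs_payload` is hypothetical; it returns the version string and the list of packet elements, and treats a document without an envelope as a single CXS packet.

```python
import xml.etree.ElementTree as ET


def find_cxs_payload(xml_text):
    """Return (version, packets) for the first CXS envelope in a document."""
    root = ET.fromstring(xml_text)
    # The envelope may be the root itself or sit anywhere below it.
    envelope = root if root.tag == "cxs" else root.find(".//cxs")
    if envelope is None:
        # No envelope: the whole document is assumed to be one CXS packet.
        return None, [root]
    version = envelope.get("v")
    if version is None:
        raise ValueError("CXS envelope is missing the obligatory v attribute")
    return version, list(envelope)
```

XML comments inside the envelope are dropped by the default ElementTree parser, so only the serialized data elements are returned.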

Document Type Declaration (DTD)

The following DTD can be used to validate CXS serialized data:

  <!DOCTYPE cxs [
     <!ELEMENT cxs (a|h|o|s|b|i|d|t|n|c)>
     <!ATTLIST cxs
               v CDATA #FIXED "1.1">
     <!ELEMENT s (#PCDATA)>
     <!ELEMENT b (#PCDATA)>
     <!ELEMENT i (#PCDATA)>
     <!ELEMENT d (#PCDATA)>
     <!ELEMENT t (#PCDATA)>
     <!ELEMENT n EMPTY>
     <!ELEMENT a (a|h|o|s|b|i|d|t|n|c)*>
     <!ATTLIST a
               t CDATA #IMPLIED
               l CDATA #IMPLIED
               f CDATA #IMPLIED>
     <!ELEMENT h (a|h|o|s|b|i|d|t|n|c)*>
     <!ELEMENT o (a|h|o|s|b|i|d|t|n|c)*>
     <!ATTLIST o
               n CDATA #IMPLIED>
     <!ELEMENT c (#PCDATA)>
  ]>

This entry was posted on Monday, March 7th, 2005 at 12:37. It is filed under flash, flex, php.