XML Schema Peter Komisar latest version .5.1   ©

references: The XML Schema Primer, http://www.w3.org/TR/xmlschema-0; The XML
Schema Structures (Part 1), http://www.w3.org/TR/xmlschema-1; 'XML Schema Part 2:
Datatypes', www.w3.org/TR/2001/REC-xmlschema-2-20010502/; examples from the
W3Schools Web site, www.w3Schools.com; 'XML and Web Services Unleashed', R.
Schmelzer et al.; 'XML in a Nutshell', E.R. Harold & W.S. Means; 'Professional XML
Schemas', J. Duckett et al.


Overview



In May 2001 the W3C published its recommendation for the XML Schema Definition
Language. The specification allows the creation of different sorts of simple and complex
XML elements that govern typing in an associated XML 'instance' document. The
recommendation also introduces a large set of data types that allow data in an XML
document to be strongly typed.

Official Documents

In addition to general information that is available at http://www.w3c.org/TR/Schema,
there are three documents frequently cited as sources and official documents for the
XML schema recommendation: the Primer (Part 0), Structures (Part 1) and Datatypes
(Part 2), all of which are cited in the references at the top of this note.


What is a Schema?

Definition: A schema is a set of rules that is used to govern data structure and content.



The schema Element



The <schema> Element

The root element of an XML schema document is the 'schema' element.
The schema element houses the set of schema elements that are used
to create an XML schema definition.

The following example shows a 'skeleton' schema. Notice that in this example
the official namespace for the World Wide Web Consortium's XML schema
language is specified and assigned to the namespace prefix 'xs'. This is a
hallmark of a schema document. Also shown is the 'targetNamespace'
attribute, which holds a URI that names the namespace that will be
associated with this particular schema.


A Skeleton Form of the 'schema' Element

<!-- xmlns:xs declares the schema language namespace -->
<!-- targetNamespace names the namespace that this schema will be associated with -->
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.com/example">
  . . .
</xs:schema>

A complete schema may be represented in XML by one or more 'schema documents',
that is, by one or more <schema> element information items.

// in other words, a schema can be composed of a { set } of schema documents




Complete Schema Form   // reference

Form of the schema Element

// the 'official' form of the 'schema' element, quoted from the W3C XML Schema
// Part 1 document, http://www.w3.org/TR/xmlschema-1/#Schemas

<schema
  attributeFormDefault = (qualified | unqualified) : unqualified
  elementFormDefault = (qualified | unqualified) : unqualified
  blockDefault = (#all | List of (extension | restriction | substitution))  : ''
  finalDefault = (#all | List of (extension | restriction))  : ''

  // the single quotes form an empty string and provide the default value
  // the first four attributes are global to a document and control various defaults

  id = ID
  version = token

  // id & version attributes are optional and "for user convenience"                                               

  targetNamespace = anyURI

  // associates the schema's element definitions with a namespace

  xml:lang = language
                                                 
  {any attributes with non-schema namespace . . .}>

  // Content describes the different kinds of components the document may hold
  Content: ((include | import | redefine | annotation)*,
  (((simpleType | complexType | group | attributeGroup) | element | attribute | notation), annotation*)*)
</schema>




The schema element is a skeleton container for its components which, as the content
model above shows, may be sets of type definitions (simpleType, complexType), group and
attributeGroup definitions, element, attribute and notation declarations, and annotations.

// may include imported definitions and nested elements

// <include> and <import> elements are not schema constituent components but
// rather are directives that serve the same purpose as in programming languages:
// they bring in definitions that are defined externally to the local document



XML Schema Namespaces

XML Schema has three namespaces that are 'hallmarks' of the application. The first two
will typically be present in schemas and schema instances.

The XMLSchema namespace is used to identify elements that are part of the schema language
itself. These are the widgets the XML Schema language uses, such as <schema>, <element>,
<attribute> and <complexType>. This namespace declaration appears in the schema document.


The XMLSchema-instance namespace is found in XML instance documents. It is most
often used to qualify the 'xsi:schemaLocation' attribute or the
'xsi:noNamespaceSchemaLocation' attribute. It is also used to specify a 'nil' value, in
the form xsi:nil. This form explicitly states that there is no entry for a field. (This
differs from simply leaving the element empty: an empty string may be an incorrect
value for the type of a given element.) The Schema instance namespace is also used
with the 'xsi:type' attribute to specify a derived type that a schema defines.

Following is an example of how the xsi:nil attribute is used.

W3C Example     <xs:element name="shipDate" type="xs:date" nillable="true"/>
                         // boolean attribute 'nillable' set to true

A subsequent 'shipDate' tag in a document instance can then have its 'nil' value set
to true.

W3C Example       <shipDate xsi:nil="true"></shipDate>
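
As a small illustration of how an application might notice xsi:nil, here is a minimal Python sketch using only the standard library. It does not validate anything against a schema; it simply reads the namespaced attribute. The wrapping 'order' element and the 'comment' element are made up for the example.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def is_nil(element):
    """Return True when the element carries xsi:nil="true" (or "1")."""
    # ElementTree stores namespaced attributes under the key "{namespace}local".
    return element.get(f"{{{XSI}}}nil", "false").strip() in ("true", "1")

# 'order' and 'comment' are hypothetical names used only for this sketch.
doc = ET.fromstring(
    f'<order xmlns:xsi="{XSI}">'
    '<shipDate xsi:nil="true"></shipDate>'
    '<comment></comment>'
    '</order>'
)

nil_flags = {child.tag: is_nil(child) for child in doc}
print(nil_flags)
```

The empty 'comment' element is reported as not nil, which is the distinction the text draws between an explicit nil and an element that merely has no content.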


// A third namespace, xmlns:sxdatatypes="http://www.w3.org/2001/XMLSchema-datatypes",
// is defined that may be used in place of the general XMLSchema namespace for qualifying
// built-in data types. It is a specialized namespace that could be used to represent Schema
// data types in applications other than XML schema.

Schema Samples

The following W3Schools example is a skeleton of the more typical form that the
schema tag might take.
 

Example 1 From the W3Schools Website


<?xml version="1.0"?>
<!-- xmlns:xs specifies the official XML Schema namespace and associates it with the 'xs' prefix -->
<!-- targetNamespace specifies the namespace to which elements of this particular schema belong -->
<!-- xmlns creates a default namespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">

<!-- where different elements go -->

</xs:schema>

Following is an example of a simple schema and a corresponding document
instance that has been verified against the latest standards. ( This time we
have a simple element included. )


Example 2 
     // save to SomeName.xsd

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="www.example.com"
           elementFormDefault="qualified">
    <xs:element name="elem1" type="xs:string"/>
</xs:schema>
 

Next we show the corresponding 'instance' document.
 

Example 2 XML Instance for the above Schema      // was saved as SomeName.xml
 

<?xml version="1.0"?>

<elem1 xmlns="www.example.com"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="www.example.com  SomeName.xsd">
It's Schema Time!
</elem1>

// again, note that the default namespace brings the instance into
// 'qualification' with the target namespace that the schema specifies
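
To see what 'qualified' means in practice, the following Python sketch (standard library only) parses the instance document above and prints the namespace that the parser attaches to the unprefixed 'elem1' element. It is an illustration of namespace handling, not a validation step.

```python
import xml.etree.ElementTree as ET

instance = """<elem1 xmlns="www.example.com"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="www.example.com  SomeName.xsd">
It's Schema Time!
</elem1>"""

root = ET.fromstring(instance)

# "{namespace}local" is ElementTree's notation for a qualified name.
namespace, local_name = root.tag[1:].split("}")
print(namespace, local_name)
```

Although the element carries no prefix, the default namespace declaration places it in the same namespace that the schema names as its target.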

While namespace names serve only as identifiers, the W3C would prefer that you use
real URIs. For examples and documentation, the domains listed below have been
reserved (by RFC 2606), so they can be used safely in namespace names.

Reserved Pages  http://www.example.com
                          http://www.example.net
                          http://www.example.org



Following is a complete W3Schools example of a short XML Schema.
By convention, this file would be saved with an .xsd extension.

// Like DTDs, schemas have their own file extension by convention. Unlike
// DTDs, schemas are 'real' XML documents, written in XML.

 

A Short Schema Example From W3Schools

<?xml version="1.0"?>   <!-- schemas share the hallmark of an xml file -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">

<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>

<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>

</xs:complexType>
</xs:element>       
<!-- this whole section is one element definition! More on this later! -->

</xs:schema>

 
( If you are going to test run this code, make sure you use the same
name as is referenced in the instance document. Below it is called
note.xsd )

The XML Instance Document

Following is the W3Schools document that references the above XML Schema.

A W3Schools XML Example Implementing the Schema

This XML document has a reference to an XML Schema:

<?xml version="1.0"?>
<!-- xmlns: the default namespace declaration -->
<!-- xmlns:xsi: the XMLSchema-instance namespace -->
<!-- xsi:schemaLocation: the schema's location; notice the two parts -->
<note
xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com     note.xsd">

<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Notes on the Instance Document

Inside the instance document's root element, <note>,  there are two namespace
assignments and a 'schemaLocation' attribute.

1)  The instance document has created the 'xsi' namespace prefix that represents
     the special W3C namespace, "http://www.w3.org/2001/XMLSchema-instance",
     used in conjunction with the 'schemaLocation' attribute.
 
     (Alternatively, there is a 'noNamespaceSchemaLocation' attribute that can be
      used when no namespace has been declared in the schema document. )

Example     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"


2)  Both the schema and the instance have created a default namespace
     for their documents.

Example     xmlns="http://www.w3schools.com"

Effectively, this makes all elements which are not prefixed, 'qualified' and
belonging to the default namespace. (This keeps it in sync with the assignment,
elementFormDefault="qualified" , found in the corresponding schema.)


3)    Once created, we use the 'xsi' namespace prefix in conjunction with the
      'schemaLocation' attribute to specify the actual location of the schema.
      This is shown in the example below, which extracts the two lines of the
      instance document that show this aspect.

Example

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com     note.xsd">


4)  The 'schemaLocation' attribute is specified in two parts. This can be seen
      in the second line of the above example. This points to what might seem
      at first glance to be a slightly ambiguous part of the Schema Language
      specification. Although it would appear that this is a path / file format, the
      XML Schema specification does not read it that way: the first value is a
      namespace name, not a directory that contains the second. The following
      example, which assumes a schema whose target namespace is
      http://www.example.net, shows that the two values need not resemble each
      other. It is really up to the parser to interpret these identifiers.


Example
     xsi:schemaLocation="http://www.example.net   note.xsd">


      In the XML Schema specification the first value names a namespace and the
      second value is a "hint" as to where the parser can find a schema for that
      namespace. It is reasonable to expect that the hint in some way suggests
      where the 'note.xsd' file will be found, but a processor is under no
      obligation to use it!
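
The two-part value can be taken apart mechanically. The Python sketch below (standard library only) splits an xsi:schemaLocation value into (namespace, location hint) pairs; the helper name 'schema_location_pairs' is made up for the example, and nothing here fetches or checks the schema itself.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def schema_location_pairs(root):
    """Split xsi:schemaLocation into (namespace, location hint) pairs."""
    # The attribute value is a whitespace-separated list; any run of
    # spaces between the two parts is treated as a single separator.
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    if len(tokens) % 2:
        raise ValueError("schemaLocation must hold an even number of values")
    return list(zip(tokens[0::2], tokens[1::2]))

note = ET.fromstring(
    '<note xmlns="http://www.w3schools.com" '
    f'xmlns:xsi="{XSI}" '
    'xsi:schemaLocation="http://www.w3schools.com     note.xsd"/>'
)

pairs = schema_location_pairs(note)
print(pairs)
```

An attribute value may hold several such pairs, one per namespace, which is why the sketch returns a list.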



Global Elements & Attributes  // global is the scope of the schema element

Global elements are children of the 'schema' element. Because attribute
declarations are themselves written as elements, the same can be said about
attributes. Attributes are global when they are children of the schema element.

A global element or attribute can be referenced using the 'ref' attribute.
The following is an incomplete example that emphasizes the use of the
'ref' attribute.

Example        . . . .
                    <xs:element ref="History"/>
                    <!-- this element references the global 'History' element -->

                    . . . .
                    <xs:element name="History" >
                         <xs:complexType>
                               <xs:sequence >
                       <!-- etc. -->
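
One way to picture the relationship between 'ref' and a global declaration is the Python sketch below (standard library only). It lists the global element declarations of a small schema and checks that every 'ref' names one of them. The 'Book' element and the string type given to 'History' are assumptions made to complete the example, and the sketch ignores namespace prefixes in 'ref' values.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# 'Book' is a hypothetical parent; 'History' is simplified to a string.
schema = ET.fromstring("""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="History"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="History" type="xs:string"/>
</xs:schema>
""")

# Global declarations are the direct children of <xs:schema>.
global_elements = {e.get("name"): e for e in schema.findall(XS + "element")}

# Every ref found anywhere in the tree should name one of those globals.
refs = [e.get("ref") for e in schema.iter(XS + "element") if e.get("ref")]
unresolved = [r for r in refs if r not in global_elements]
print(sorted(global_elements), refs, unresolved)
```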



Where to go to validate

Java Validating Parsers

1) The Java world now supports a unifying architecture for parsing and transforming
XML documents, called JAXP or the Java API for XML Processing. Within this
context several parsers can be made available that are able to validate schema.
Because JAXP is our next topic we will leave looking at validating with the JAXP
API to the next note.

DecisionSoft's Online Validation Tool  

2) If you don't have the Java JDK or Xerces on your machine you can access
the same functionality by using DecisionSoft's online tool, which runs your documents
against the Xerces parser for you. Go to the following link and use the browsing
functions to find your files.

DecisionSoft's Schema Validation Page

http://tools.decisionsoft.com/schemaValidate.html


Microsoft's XML Validation Page

3) Microsoft has an XML validation page (which most of you have found; the link
    follows). It is important that you are using IE6 or have downloaded the
    MSXML 4.x package, which supports the latest version of the W3C's standards.
    The MSN XMLValidator is available at the Microsoft Developer Network.

http://msdn.microsoft.com/downloads/samples/internet/xml/xml_validator/default.asp?frame=true


XMLWriter

4) XMLWriter, as was pointed out in class, is based on Microsoft's XML engine, so
    for it to do the most recent forms of validation it too must have the MSXML 4.x
    package present.


XMLSpy

5)  XMLSpy is an all-in-one dedicated XML application that is a leader in the field. IBM also
    supplies validation parsers which can be found at their web site.




Complex & Simple Types



Simple & Complex Types


An XML document will consist of a main element and sub-elements.
Sub-elements may in turn contain other sub-elements. In XML schema
descriptions, elements that contain sub-elements or carry attributes are
generic complex types. Elements that contain primitive data (numbers,
strings, dates etc.) and that do not contain any sort of sub-element are
considered generic simple types. Attributes themselves are represented
as simple types. Since a simple type cannot carry attributes, only complex
types can have attributes.

// If an element will just hold primitive data --> it is a simple type
// If an element has attributes or other elements --> it is a complex type

A  <complexType> element is involved in creating a generic complex type
element. There is a parallel situation that exists for generic simple types
where a <simpleType> element is used.

We begin by describing the simple element type that is used to create
basic definitions in our schema specifications.

Simple Elements     //  contain text data and not other elements or attributes

A simple element is an XML element that can only contain text. It cannot contain
any other elements or attributes. "Only text" in this context means a value of one
of a number of different types, such as the 'built-in' types 'token', 'string' or 'boolean'.
Custom types or types derived from built-in basic types can also be used to
specify the content type. Following is the base syntax for a simple element.

Simple Element Form  <xs:element name="xxx" type="yyy"/>


Following are three simple elements as they might appear nested inside
an XML Schema document.


Simple Element Examples
         . . . .
        <xs:element name="treeType" type="xs:string" />
        <xs:element name="leafPoints" type="xs:integer"/>
        <xs:element name="idDate" type="xs:date" />
         . . . .

The above schema definitions would map to corresponding instance
elements such as the following.

Corresponding Elements in Instance Document

         . . . .
        <treeType> Basswood </treeType>
        <leafPoints>   1       </leafPoints>
        <idDate> 2005-06-01  </idDate>
         . . . .


// We will see later that simple elements can be further 'extended' or 'restricted'
// to create custom sets of simple types, using the 'simpleType' element.
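
To make the idea of typed text concrete, the Python sketch below (standard library only) reads the three instance elements and converts each one according to its declared type. The Python converters merely stand in for the schema types, the wrapping 'tree' element is made up, and the type table is written by hand rather than read from a schema.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Python converters standing in for three built-in schema types.
CONVERTERS = {"xs:string": str, "xs:integer": int, "xs:date": date.fromisoformat}

# Hand-copied from the three element declarations in the schema fragment.
DECLARED = {"treeType": "xs:string", "leafPoints": "xs:integer", "idDate": "xs:date"}

# 'tree' is a hypothetical wrapper so the fragment parses as one document.
record = ET.fromstring(
    "<tree>"
    "<treeType> Basswood </treeType>"
    "<leafPoints>   1       </leafPoints>"
    "<idDate> 2005-06-01  </idDate>"
    "</tree>"
)

typed = {}
for child in record:
    type_name = DECLARED[child.tag]
    text = child.text
    if type_name != "xs:string":
        # Non-string types ignore surrounding whitespace; strings keep it.
        text = text.strip()
    typed[child.tag] = CONVERTERS[type_name](text)

print(typed)
```

A value that does not fit its type, such as the text 'one' in 'leafPoints', would raise a ValueError here, which roughly corresponds to a validation failure.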


Complex Elements

Several XML schema elements work together to create complex types. Complex
types may contain nested elements and attributes. They may also be formulated
to create a 'mixed' form that contains elements mixed with text. The complex type
is hallmarked by the presence of the <complexType> tag.

The complexType element - The <complexType> element acts as a container
for a set of elements which may include attribute declarations. The following
skeleton example is simplified by leaving out nested elements and shows a
named version of the element called 'skeleton'. This name can be assigned
to the type attribute of an element. ( Note that the 'xsd' prefix used here is an
alternative to the 'xs' prefix for the XML schema namespace. Both are
conventional; the shorter 'xs' prefix is increasingly popular and appears in
many examples.)

The following example is also referred to as a complex type declaration. 

 

Example of a Named Form of a Complex Type Declaration 

<xsd:complexType name="skeleton" >
    <!-- needs a top-level nested compositor
          either <sequence>, <all> or <choice> -->

  <!-- elements nested in the compositor -->

 </xsd:complexType>

The above example is the named form of the complexType element; the name
here is "skeleton". This allows the type to be used as a template that can be
referenced at some other point in the schema document, as is shown in the
example below for a named type called 'USZip'. Notice the 'type' attribute is
used to specify what the type of the element named 'PostalExtension' will be.


Complex Element Example

// elsewhere in the schema

<element name="PostalExtension" type="USZip"/>

 
  
The Anonymous or Unnamed Form

Elements are frequently composed with anonymous or 'inlined' 'complexType'
elements nested inside them. The following example from 'XML in a Nutshell'
shows this form. Notice, the <complexType> element in the example has no
name.
 

Abbreviated Example of the Anonymous Form of a Complex Element

// from 'XML in a Nutshell', Harold & Means

<xs:element name="fullName">
<xs:complexType>   <!-- here the complexType is not named and is 'inlined' into another element -->
<xs:sequence>
<xs:element name="first" type="addr:nameComponent"/>
<xs:element name="last" type="addr:nameComponent"/>
</xs:sequence>
</xs:complexType>
</xs:element>

The next example shows an inlined complexType element that is made
up of simple elements that use built-in schema types.


Example 2

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="treeType" type="xs:string"/>
      <xs:element name="leafPoints" type="xs:integer"/>
      <xs:element name="idDate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>



The 'element' element - In many examples so far we have seen the <element>
element. It is used both as a nesting unit and as a base to house different
complex types. Consider the inlined example above where the <complexType>
element is wrapped inside a named element while at the same time there are
nested simple type elements.
 

Example of an Element Element

<xsd:element name="name"   type="xsd:string"/>
 

The 'attribute' element - The 'attribute' element is used to declare attributes
for the 'complexType'. In the following attribute example, the attribute value is fixed
and must be "US". Notice that the attribute type is a predefined XML schema
simple type called NMTOKEN. All attribute declarations must reference simple
XML data types. (We will see later how to custom define simple
types.) Unlike element declarations, attributes must be simple and cannot
contain other elements or other attributes.
 

Example of an Attribute Element

<xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
 

By looking at the following example of a complex type you can see all three
element types used in conjunction with each other. The example includes
the 'sequence' element, which contains an ordered grouping of elements.
The elements grouped in a sequence must appear in the exact order in which
they are specified within the 'sequence' element of the schema. This has
an effect equivalent to an 'Element Only' definition in a DTD.


W3 Example of a ComplexType Declaration  // from the W3C example

<xsd:complexType name="USAddress" >
  <xsd:sequence>
   <xsd:element name="name"   type="xsd:string"/>
   <xsd:element name="street" type="xsd:string"/>
   <xsd:element name="city"   type="xsd:string"/>
   <xsd:element name="state"  type="xsd:string"/>
   <xsd:element name="zip"    type="xsd:decimal"/>
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
 </xsd:complexType> <!--  just above is the attribute element added to this example -->


This definition can then be referenced in another tag from inside the schema
document. For instance, in the W3C Schema example at the end of the note,
the above defined type is referenced as follows.


Complex Type Declaration Referenced Later in Schema

<xsd:element name="shipTo" type="USAddress"/>


The specification that is dictated by this schema element is adhered to by
the corresponding element in the instance document. ( This code can be found
in the second part of the W3C example at the end of the note.)

Valid Form of Element Referencing Complex Type in Subsequent Instance Document

<shipTo country="US">
<name>Alice Smith</name>
<street>123 Maple Street</street>
<city>Mill Valley</city>
<state>CA</state>
<zip>90952</zip>
</shipTo> 
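
The ordering rule that 'sequence' imposes can be sketched in a few lines of Python (standard library only). This is a hand-rolled check, not a schema validator: the expected order is copied by hand from the USAddress type, and it assumes every child occurs exactly once, which is the default.

```python
import xml.etree.ElementTree as ET

# Child order required by the USAddress sequence in the schema.
USADDRESS_SEQUENCE = ["name", "street", "city", "state", "zip"]

def follows_sequence(element, expected):
    """True when the children appear exactly in the declared order."""
    return [child.tag for child in element] == expected

valid = ET.fromstring(
    '<shipTo country="US"><name>Alice Smith</name>'
    "<street>123 Maple Street</street><city>Mill Valley</city>"
    "<state>CA</state><zip>90952</zip></shipTo>"
)

# Same content, but 'street' and 'name' have traded places.
swapped = ET.fromstring(
    '<shipTo country="US"><street>123 Maple Street</street>'
    "<name>Alice Smith</name><city>Mill Valley</city>"
    "<state>CA</state><zip>90952</zip></shipTo>"
)

print(follows_sequence(valid, USADDRESS_SEQUENCE))
print(follows_sequence(swapped, USADDRESS_SEQUENCE))
```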


Occurrence Constraints For Elements

Occurrence constraints decide exactly how many times an element can occur.
This is an improvement over DTDs, which can only express 'once', 'zero or one',
'zero or more' and 'one or more'.

The following table captures a good comparison of DTD cardinality controls
compared to those used in XML Schema.

Table Comparing DTD Cardinality Controls with Values of minOccurs & maxOccurs

// facsimile of a similar table found in 'Professional XML Schemas', J.Duckett et.al, Wrox Press

 DTD Cardinality Symbol   minOccurs value   maxOccurs value   Element Occurrences
 ----------------------   ---------------   ---------------   -------------------
 none (default)           1                 1                 Once & only once
 ?                        0                 1                 Zero or one
 *                        0                 unbounded         Zero or more
 +                        1                 unbounded         One or more

// unbounded is assigned as a literal value inside double quotes i.e. maxOccurs="unbounded"


Schema Cardinality Controls

'minOccurs' &  'maxOccurs' -  The attributes that control element occurrences are
'minOccurs' and 'maxOccurs'. The default value for both the 'minOccurs' and
the 'maxOccurs' attributes is 1. With 'minOccurs' at the default value, the element
is required to appear at least once. With 'maxOccurs' at the default value of 1,
the element is allowed to appear at most once. 'maxOccurs' can be set
to whatever value is appropriate. By setting 'minOccurs' to 0, the element becomes
optional. The next example explicitly states that this element is optional.


Example
  <xsd:element ref="comment" minOccurs="0"/> 


'unbounded' - The 'unbounded' value can be used with 'maxOccurs' to emulate
the behaviour found in DTDs when * and + are used.

The following is a modification of the W3Schools example allowing the note to
specify up to 5 receivers. We have explicitly stated the default value for the
'minOccurs' attribute.

W3Schools Schema Example Modified

<?xml version="1.0"?>              
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">

<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"  minOccurs="1" maxOccurs="5"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>       

</xs:schema>


Matching Schema Instance

<?xml version="1.0"?>
<note
xmlns="http://www.w3schools.com"                          
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com Note.xsd">

<to>Jack</to>
<to>Jill</to>
<to>Elvis</to>
<to>Nephilum</to>
<to>Santa</to>
<!-- <to>Hercules</to> adding  this 'to' element is one too many & won't validate -->

<from>Elmo</from>
<heading>Clarification</heading>
<body>Concerning Fairy Tales, Myths and Legends</body>
</note>
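
The counting rule behind minOccurs and maxOccurs can be sketched in Python (standard library only). The helper names 'occurrence_ok' and 'build_note' are made up for the example, None stands in for "unbounded", and the limits are passed in by hand rather than read from the schema.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3schools.com}"

def occurrence_ok(parent, tag, min_occurs=1, max_occurs=1):
    """Check a child count against minOccurs/maxOccurs (None = unbounded)."""
    count = len(parent.findall(tag))
    return count >= min_occurs and (max_occurs is None or count <= max_occurs)

def build_note(receivers):
    """Build a small 'note' element with one 'to' child per receiver."""
    note = ET.Element(NS + "note")
    for name in receivers:
        ET.SubElement(note, NS + "to").text = name
    ET.SubElement(note, NS + "from").text = "Elmo"
    return note

five = build_note(["Jack", "Jill", "Elvis", "Nephilum", "Santa"])
six = build_note(["Jack", "Jill", "Elvis", "Nephilum", "Santa", "Hercules"])

print(occurrence_ok(five, NS + "to", 1, 5))   # within the declared limits
print(occurrence_ok(six, NS + "to", 1, 5))    # one 'to' too many
```

Calling the helper with its defaults, as for the 'from' element, mirrors a declaration that states neither attribute.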
 


Attributes Behave Differently

Attributes, on the other hand, can appear once or not at all. They are controlled
with a different syntax. The 'use' attribute can be used with them to make their
appearance 'required', 'optional' or 'prohibited', as the examples below show.

 


Attributes 



Attribute Simple Form

Attributes have the same form as simple elements.  A simple representation
of the form of attributes is as follows.

Simple Form of the XML Schema Attribute

<xs:attribute name="xxx" type="yyy"/>
 

The following, more imposing W3 representation indicates that attribute definitions
have many exotic features. They may be set to defaults or have fixed values.
Attributes may be qualified or unqualified. They may have a unique identifier and
a name, and may reference a type definition or another attribute declaration. Their
use may be specified as optional, prohibited or required. They may be annotated
(documented) and may be custom composed using a <simpleType> element.

The W3 Representation of an attribute element  // for reference 

<attribute
  default = string
  fixed = string
  form = (qualified | unqualified )
  id = ID
  name = NCName   // An NCName as defined by XML-Namespaces
  ref = QName
  type = QName
  use = (optional | prohibited | required) : optional
  {any attributes with non-schema namespace . . .}>
  Content: (annotation?, (simpleType?))
</attribute>
 

Example

<xs:attribute name="age" type="xs:positiveInteger" use="required"/>
 


Default Values for Elements and Attributes



// Attributes can have a default value OR a fixed value specified

Attribute Default Values

In XML Schema Language, there is an actual 'default' attribute which is used
to assign a default value to an attribute. Default values only make sense if
attributes are optional. In fact, in XML Schema Language it is an error to
specify a default for anything but an optional attribute.
 
Both 'attribute' elements and 'element' elements have the 'default' attribute and
may use it to provide a default value. With attributes, the value is whatever is
provided in the instance document.
If the attribute does not appear in the instance document, the schema processor
provides the default attribute value that was supplied in the schema.

In other words, a default value is automatically assigned to the attribute when
no other value is specified. In the following example the default value is "EN":

<xs:attribute name="lang" type="xs:string" default="EN"/>

// in the instance document, if the attribute wasn't provided it would default to "EN".
 

Element Default Values - When an element is declared with a default
value, it takes whatever value appears in the element's content in the
instance document. If the element appears without content (is empty),
the schema processor provides the value that is given to the default attribute.
If the element doesn't appear at all in the document, however, the schema
processor doesn't provide the element at all.

The W3C paper summarizes,

  " Default attribute values apply when attributes are missing,
    and default element values apply when elements are empty."
 
They might have added, but felt it went without saying, 'nothing is applied
if elements are missing'.
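
The three cases (attribute missing, element empty, element missing) can be imitated with a short Python sketch (standard library only). It mimics what a schema processor is described as doing; it is not one. The 'page', 'title' and 'subtitle' names are made up, and the defaults are passed in as dictionaries instead of being read from a schema.

```python
import xml.etree.ElementTree as ET

def apply_defaults(parent, attr_defaults, element_defaults):
    """Mimic default handling: attributes when missing, elements when empty."""
    for name, value in attr_defaults.items():
        if name not in parent.attrib:
            parent.set(name, value)            # attribute missing: supply it
    for child in parent:
        default = element_defaults.get(child.tag)
        if default is not None and not (child.text or "") and len(child) == 0:
            child.text = default               # element present but empty
    return parent                              # missing elements stay missing

# 'page', 'title' and 'subtitle' are hypothetical names for this sketch.
doc = apply_defaults(
    ET.fromstring("<page><title/></page>"),
    attr_defaults={"lang": "EN"},
    element_defaults={"title": "Untitled", "subtitle": "None"},
)

print(doc.get("lang"), doc.find("title").text, doc.find("subtitle"))
```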


Creating Optional and Required Attributes

The 'use' attribute is used to control whether an attribute may, must or can't
appear in a schema governed document. All attributes are optional by default.
To explicitly specify that the attribute is optional, the 'use' attribute is used as
is shown in the following W3Schools example.


W3Schools Example
 

<xs:attribute name="lang" type="xs:string" use="optional"/>
 

To make the attribute required the use attribute is assigned the 'required'
value.

Example

<xs:attribute name="lang" type="xs:string" use="required"/>

There is a third value that can be assigned to the 'use' attribute. This is the
'prohibited' value, indicating the attribute cannot appear at all in the parent
element.

//  testing showed default didn't need to be specified in optional case, which may be
// interpreted as ' optional with no default specified'
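
The three values of 'use' amount to a small truth table, sketched here in Python. The function name 'use_satisfied' is made up for the example, and the attributes are represented as a plain dictionary rather than being read from an instance document.

```python
def use_satisfied(attributes, name, use="optional"):
    """Check one attribute against its declared 'use' value."""
    present = name in attributes
    if use == "required":
        return present          # must appear
    if use == "prohibited":
        return not present      # must not appear
    if use == "optional":
        return True             # may or may not appear
    raise ValueError(f"unknown use value: {use!r}")

with_lang = {"lang": "EN"}
without_lang = {}

print(use_satisfied(without_lang, "lang", "required"))
print(use_satisfied(with_lang, "lang", "prohibited"))
```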


The Fixed Attribute

Both attribute and element declarations use the 'fixed' attribute to 'fix' specific
values. We saw this with the country attribute which was declared with the fixed
value, "US". Consider the following example. The use attribute is not specified,
therefore, it has the default value which is 'optional'. Accordingly, if the country
appears it must have the value 'US'. If the country attribute doesn't appear the
schema processor will provide the value, 'US'.

// optional mixed with fixed means that the instance may or may
// not be specified however in both cases it will be the fixed value.


Example
 

<xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>


A fixed value is automatically assigned to the attribute. You cannot specify
another value. In the following W3Schools example the fixed value is "EN":


W3Schools Example

<xs:attribute name="lang" type="xs:string" fixed="EN"/>


In other words, providing a fixed value and providing a default value are
mutually exclusive. It is an error to declare both the 'fixed' and the 'default'
attribute in the same declaration.
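
The behaviour of a fixed, optional attribute can be imitated in Python (standard library only). The helper 'check_fixed' is a made-up name and only mimics the rule described above: supply the value when the attribute is absent, reject any other value when it is present.

```python
import xml.etree.ElementTree as ET

def check_fixed(element, name, fixed):
    """Return the effective value of a fixed, optional attribute."""
    value = element.get(name)
    if value is None:
        return fixed                  # absent: the fixed value is supplied
    if value != fixed:
        raise ValueError(f"{name} must be {fixed!r}, found {value!r}")
    return value

absent = ET.fromstring("<shipTo/>")
matching = ET.fromstring('<shipTo country="US"/>')
wrong = ET.fromstring('<shipTo country="CA"/>')

print(check_fixed(absent, "country", "US"))
print(check_fixed(matching, "country", "US"))
```

Passing the 'wrong' element to the helper raises a ValueError, the sketch's stand-in for a validation error.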

Schema Data Types Are Used by Attribute & Simple Elements

We saw in the schema we looked at earlier, examples of simple types being
used in both elements and attributes. Simple types are like the primitive types
of the Java programming language. They are the atomic and prime types of the
XML Schema Language. For instance in the following two examples, the first
being an element and the second an attribute, we see the type decimal and the
type NMTOKEN being used.
 

Examples of Simple Types used in both Element and Attribute

<xsd:element name="zip"    type="xsd:decimal"/>
 <!--    . . . .  -->
<xsd:attribute name="country" type="xsd:NMTOKEN" />



Built in Schema Data Types



The following table lists the imposing collection of type definitions that are described
in the XML Schema Primer hosted at the W3C site. Although the list is imposing,
you will note that to build an equivalent set of types in a standard programming
language would call for the creation of a library of classes to represent each of
the variations provided. XML Schema does this at the primitive type level.

Simple Types Built In to XML Schema // http://www.w3.org/TR/xmlschema-0
 

 Simple Type         Examples (comma delimited)                    Notes
 ------------------  --------------------------------------------  -----
 string              Confirm this is electric
 normalizedString    Confirm this is electric                      3
 token               Confirm this is electric
 byte                -1, 126
 unsignedByte        0, 126                                        2
 base64Binary        GpM7
 hexBinary           0FB7
 integer             -126789, -1, 0, 1, 126789                     2
 positiveInteger     1, 126789                                     2
 negativeInteger     -126789, -1
 nonNegativeInteger  0, 1, 126789                                  2
 nonPositiveInteger  -126789, -1, 0                                2
 int                 -1, 126789675                                 2
 unsignedInt         0, 1267896754                                 2
 long                -1, 12678967543233                            2
 unsignedLong        0, 12678967543233                             2
 short               -1, 12678
 unsignedShort       0, 12678                                      2
 decimal             -1.23, 0, 123.4, 1000.00                      2
 float               -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN     2
                     (equivalent to single-precision 32-bit
                     floating point)
 double              -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN     2
                     (equivalent to double-precision 64-bit
                     floating point)
 boolean             true, false, 1, 0
 time                13:20:00.000, 13:20:00.000-05:00              2
 dateTime            1999-05-31T13:20:00.000-05:00                 2
                     (May 31st 1999 at 1.20pm Eastern Standard
                     Time, which is 5 hours behind Co-Ordinated
                     Universal Time)
 duration            P1Y2M3DT10H30M12.3S                           2
                     (1 year, 2 months, 3 days, 10 hours,
                     30 minutes, and 12.3 seconds)
 date                1999-05-31                                    2
 gMonth              --05--, May                                   2, 5
 gYear               1999                                          2, 5
 gYearMonth          1999-02, the month of February 1999,          2, 5
                     regardless of the number of days
 gDay                ---31, the 31st day                           2, 5
 gMonthDay           --05-31, every May 31st                       2, 5
 Name                shipTo, XML 1.0 Name type
 QName               po:USAddress, XML Namespace QName
 NCName              USAddress, XML Namespace NCName,
                     i.e. a QName without prefix and colon
 anyURI              http://www.example.com/
 language            en-GB, en-US, fr, valid for xml:lang as
                     defined in XML 1.0
 ID                  XML 1.0 ID attribute type                     1
 IDREF               XML 1.0 IDREF attribute type                  1
 ENTITY              XML 1.0 ENTITY attribute type                 1
 ENTITIES            XML 1.0 ENTITIES attribute type               1
 NOTATION            XML 1.0 NOTATION attribute type               1
 NMTOKEN             XML 1.0 NMTOKEN attribute type                1
 NMTOKENS            XML 1.0 NMTOKENS attribute type,              1
                     i.e. a whitespace separated list of
                     NMTOKENs

 Notes From the Table

 (1) To retain compatibility between XML Schema and XML 1.0 DTDs, the simple types
ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS
should only be used in attributes. 

(2) A value of this type can be represented by more than one lexical format, e.g. 100 and
1.0E2 are both valid float formats representing "one hundred". However, rules have been
established for this type that define a canonical lexical format, see XML Schema Part 2.

(3) Newline, tab and carriage-return characters in a normalizedString type are converted to
space characters before schema processing.

(4) As normalizedString, and adjacent space characters are collapsed to a single space
character, and leading and trailing spaces are removed.

(5) The "g" prefix signals time periods in the Gregorian calendar.
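
As a rough illustration of how a few of the built-in types from the table might be put to
work, the following sketch declares three child elements and one attribute. The names
'shipment', 'note', 'weightKg', 'shipped' and 'region' are hypothetical and are not taken
from the W3C Primer. No target namespace is assumed.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- hypothetical element that exercises several built-in simple types -->
  <xsd:element name="shipment">
    <xsd:complexType>
      <xsd:sequence>
        <!-- token: whitespace is collapsed before validation, see note (4) -->
        <xsd:element name="note"     type="xsd:token"/>
        <xsd:element name="weightKg" type="xsd:decimal"/>
        <xsd:element name="shipped"  type="xsd:date"/>
      </xsd:sequence>
      <!-- NMTOKEN is kept to an attribute, following note (1) -->
      <xsd:attribute name="region" type="xsd:NMTOKEN"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>

<!-- an instance that should conform to the sketch above:
     <shipment region="US">
       <note>  handle   with care </note>
       <weightKg>12.50</weightKg>
       <shipped>1999-05-31</shipped>
     </shipment>
-->
```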


Custom Simple Types



XML Schema language also permits defining custom simple types which are
derived from the built-in simple types. In fact many of the 'built-in' types
are themselves derived from more primitive built-in types. A derivation takes
one of three forms: a restriction, a list or a union. The following W3Schools
description of the 'simpleType' element shows that 'restriction', 'list' or
'union' is at the heart of the element's form.

Following is a form description of the simpleType element which is the base
element used to build custom simple types.
 

W3Schools Form Description of the simpleType element.

<simpleType
        id=ID                    // ID is optional, takes a unique identifier
        name=NCName   // only used if the simpleType is a child of the schema element
        any attributes        // optional, any other attributes
        >
        ( annotation?,( restriction | list | union ) )   // exactly one of restriction, list or union
</simpleType>

// NCName - a 'non-colonized' name, a name without a prefix.


Restrictions

Custom simple types are enclosed in a <simpleType> element. In the case
of a restriction, a <restriction> element is used and its 'base' attribute names
the built-in type that the restriction narrows. In the following example,
the 'xs:integer' built-in type is used. We use elements called 'facets' to further
restrict the range of integers we want to select. In the next example the range
is limited to an integer between 10000 and 99999 inclusive, the range used for
'myInteger' in the W3C Primer.


Defining myInteger, Range 10000-99999

<xs:simpleType name="myInteger">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="10000"/>
    <xs:maxInclusive value="99999"/>
  </xs:restriction>
</xs:simpleType>

In an example adapted from the W3C site, the base of the restriction is the
'xs:string' type, restricted by the 'pattern' facet to a single uppercase character
between A and Z inclusive. The W3C recommendation describes such a type
as having been derived by restriction from the simple type 'string'.


Example

<xs:simpleType name="SKU">
  <xs:restriction base="xs:string">
    <!-- the hyphen inside the brackets denotes a range -->
    <xs:pattern value="[A-Z]"/>
  </xs:restriction>
</xs:simpleType>

The XML Schema 'pattern' element uses a regular expression language which
includes support for Unicode and is described in 'XML Schema Part 2',
http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#regexs
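
For comparison, the SKU type as it appears in the W3C Primer uses a longer pattern than
the single-letter one above: three digits, a hyphen, then two uppercase letters. The sketch
below assumes the 'xs' prefix is bound to the XML Schema namespace. Schema patterns are
matched against the whole value, so no anchors such as ^ or $ are needed.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="SKU">
    <xs:restriction base="xs:string">
      <!-- \d{3} is three digits, then a literal hyphen, then [A-Z]{2} is two uppercase letters -->
      <xs:pattern value="\d{3}-[A-Z]{2}"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>

<!-- "926-AA" matches the pattern; "926-aa" and "92-AA" do not -->
```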

One more example of a restriction based on the string built-in type is listed
below. Here the facet is 'enumeration' and the strings provided are
abbreviations for states. A series of 'enumeration' elements provides the
set of choices from which a single value can be selected.
 

Enumeration Facet Example

<xsd:simpleType name="USState">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="AK"/>
    <xsd:enumeration value="AL"/>
    <xsd:enumeration value="AR"/>
    <!-- and so on ... -->
  </xsd:restriction>
</xsd:simpleType>


Facets




Using Facets in Restrictions

The above examples use 'minInclusive', 'maxInclusive', 'pattern' and 'enumeration'
elements to apply the limits to the restriction being created. Each of these is a
member of a special class of elements called 'facets'. Facets are used to fine
tune the specification of a custom simple type. For those who like formal
statements, the following two statements are each a part of the official definition
of a facet and a value space.


Abbreviated W3C Formal Definition of a Facet  and a Value Space

Definition: A 'facet' is a single, defining aspect of a value space. A 'value
space' is in turn defined as the set of values for a given data type.

// paraphrase

A facet is a defining aspect of a value or set of values.


XML Schema Language defines 15 facets, 12 of which can be applied to simple
types. Facets are themselves built-in simple element types. Facets allow greater
control over the specificity of definitions for simple types. Following is a
bulleted list of the facets in alphabetical order.

 

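XML Schema Facets

// the twelve constraining facets defined in 'XML Schema Part 2'

 - enumeration
 - fractionDigits *
 - length *
 - maxExclusive
 - maxInclusive
 - maxLength *
 - minExclusive
 - minInclusive
 - minLength *
 - pattern
 - totalDigits +
 - whiteSpace

 // * value must be zero or greater, + value must be greater than zero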
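
To make the idea of fine tuning with facets a little more concrete, here is a small sketch
that combines several facets that have not appeared in the examples so far. The type names
'twoLetterCode', 'amount' and 'shortTitle' are hypothetical and chosen only for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- hypothetical: a code of exactly two characters -->
  <xsd:simpleType name="twoLetterCode">
    <xsd:restriction base="xsd:string">
      <xsd:length value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- hypothetical: a non-negative decimal with at most 7 digits in total
       and at most 2 digits after the decimal point -->
  <xsd:simpleType name="amount">
    <xsd:restriction base="xsd:decimal">
      <xsd:totalDigits value="7"/>
      <xsd:fractionDigits value="2"/>
      <xsd:minInclusive value="0"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- hypothetical: whitespace is collapsed first, then the result
       must be between 1 and 40 characters long -->
  <xsd:simpleType name="shortTitle">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="collapse"/>
      <xsd:minLength value="1"/>
      <xsd:maxLength value="40"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```

Several facets may sit side by side in one restriction, and a value must satisfy all of them.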



Element Content



So far, we have seen many combinations of elements nested inside other
elements. The XML Schema Primer goes on to describe three special cases:
the case where an element contains only character data and has attributes,
the case of the mixed type, where elements and character data represent
the combined content of an element, and the case where an element is defined
that has no content.

1) Declaring an element that has an attribute and contains a simple value

Declaring an element that has an attribute and contains a simple value sounds
simple. So what is the problem? The instance document might have an element
that looks like the following W3C example.
 

A tag in an instance document

<internationalPrice currency="EUR">423.46</internationalPrice>

As a starting point we attempt to create a simple type as follows.
 

A simple type example


<xsd:element name="internationalPrice" type="xsd:decimal"/>

// can't add attribute to definition of a simple type

Here is the catch: simple types can't have attributes. The solution
is to derive a complex type that is based on simple content, using the
<simpleContent> element. In the next example the xsd:decimal type is
specified as the 'base' attribute of an 'extension' element. The extension
element itself contains an attribute declaration, 'currency', typed as
'xsd:string'. The <complexType> element is used to house the structure.
In the following example, the complexType element is in the 'anonymous'
form, being an inlined sub-section of the 'internationalPrice' element.
( In "Developing Java Web Services" by R. Nagappan et al. the 'anonymous'
form is described as 'implicit' or 'nameless'. )

 

Deriving a Complex Type from a Simple Type

 <xsd:element name="internationalPrice">    <!-- no type supplied in tag-->
     <xsd:complexType>
        <xsd:simpleContent>
           <xsd:extension base="xsd:decimal">
              <xsd:attribute name="currency" type="xsd:string"/>
           </xsd:extension>
       </xsd:simpleContent>
     </xsd:complexType>
 </xsd:element>



Summary of Steps
// to create an element that holds a simple value and has an attribute


1) Inside an element nest a complexType tag.
2) Nest a simpleContent tag to describe the content.
3) Use an extension tag to specify the base type.
4) Specify in an attribute tag the attribute name and its type.

// A named complexType tag of this variety would also be possible
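
The named variety mentioned in the note above might look something like the following
sketch. The type name 'PriceType' and the second element 'domesticPrice' are hypothetical.
The point is only that a named type can be shared by more than one element declaration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- hypothetical named type: decimal content plus a 'currency' attribute -->
  <xsd:complexType name="PriceType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="currency" type="xsd:string"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

  <!-- elements now refer to the type by name instead of inlining it -->
  <xsd:element name="internationalPrice" type="PriceType"/>
  <xsd:element name="domesticPrice"      type="PriceType"/>

</xsd:schema>
```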


2) How to create elements that support mixed content

Notice in the following W3C example, the text appears between elements
and their child elements. The form 'inlines' text, elements and sub-elements.

Example

<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
Your order of <quantity>1</quantity> <productName>Baby
Monitor</productName> shipped from our warehouse on
<shipDate>1999-05-21</shipDate>. ....
</letterBody>


Following is the schema that makes this XML possible. The key to making
the mixture possible is the use of the 'mixed' attribute. Setting the 'mixed'
attribute to 'true' allows character data to appear between child elements.

Example

<xsd:element name="letterBody">
 <xsd:complexType mixed="true">
  <xsd:sequence>

   <xsd:element name="salutation">
    <xsd:complexType mixed="true">
     <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
     </xsd:sequence>
    </xsd:complexType>
   </xsd:element>

   <xsd:element name="quantity"    type="xsd:positiveInteger"/>
   <xsd:element name="productName" type="xsd:string"/>
   <xsd:element name="shipDate"    type="xsd:date" minOccurs="0"/>
   <!-- etc. -->

  </xsd:sequence>
 </xsd:complexType>
</xsd:element>


The XML Schema Primer notes that the mixed model in XML Schema is fundamentally
different from the DTD mixed model used in XML 1.0. Under the XML Schema mixed
model, the order and number of child elements appearing in an instance must agree
with the order and number of child elements specified in the model. In contrast,
under the XML 1.0 mixed model, the order and number of child elements appearing
in an instance cannot be constrained. This means that XML Schema provides full
validation of mixed models while XML 1.0 provided only partial validation.

The following example from the W3Schools web site shows an element from an
XML instance that adheres to the mixed content schema dictated in the element
declaration that follows.

Example from W3Schools Website

<letter>
Dear Mr.<name>John Smith</name>.
Your order <orderid>1032</orderid>
will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>

Corresponding Schema

<xs:element name="letter">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="orderid" type="xs:positiveInteger"/>
      <xs:element name="shipdate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

 

Empty Content

Now suppose that we want the internationalPrice element to convey both the
unit of currency and the price as attribute values as in the following example.

Example     <internationalPrice currency="EUR" value="423.46"/>

Such an element has no content at all; its content model is empty. To define
a type whose content is empty, we use a type that first disallows anything
but elements in its content and then goes on to prevent any elements from
being added. This way the type's content model is left empty.
 

An Empty Complex Type

<xsd:element name="internationalPrice">
 <xsd:complexType>
  <xsd:complexContent>           <!-- complexContent, yet no elements defined -->
   <xsd:restriction base="xsd:anyType">
    <xsd:attribute name="currency" type="xsd:string"/>
    <xsd:attribute name="value"    type="xsd:decimal"/>
   </xsd:restriction>
  </xsd:complexContent>
 </xsd:complexType>
</xsd:element>

But the complexContent element with a restriction of 'anyType'
is the default form, so these elements may be omitted to create
the more natural form that follows.

// 'anyType' is the primordial root type of the schema data types.

Shorthand for an Empty Complex Type

<xsd:element name="internationalPrice">
 <xsd:complexType>
  <xsd:attribute name="currency" type="xsd:string"/>
  <xsd:attribute name="value"    type="xsd:decimal"/>
 </xsd:complexType>
</xsd:element>



Lists & Unions



XML Schema's Concept of a List Type

XML Schema makes use of the concept of a 'list' type. List types are categorized
as simple types because they are composed of sequences of 'atomic' values. Atomic
types are the simple types whose values are considered indivisible. For instance,
the name token, or NMTOKEN, value "US" is atomic. There are no intended sub-units
of "US" such as "U" or "S".

List types are represented as a white-space separated sequence of atomic values.
Following we restate the form of the 'simpleType' element without the attribute
descriptions to emphasize that 'simpleType' may include a list element.


Simplified simpleType Form

<simpleType >   (annotation?,( restriction | list | union ))  </simpleType>
 

Creating New List Types

You can create new list types by derivation from existing atomic types.
(You cannot however create list types from existing list types, nor from
complex types.) Following is the W3C example of a list of integer values.

// implies you cannot build a list of lists
 

W3C Example of a List of 'myInteger', Custom Integer Types

<xsd:simpleType name="listOfMyIntType">
 <xsd:list itemType="myInteger"/>
</xsd:simpleType>

The subsequent example shows that a conforming XML instance element
can contain a space separated list of values of the item type.


W3C Example of an Element that Conforms to the Above list Type Definition

<listOfMyInt> 20003 15037 95977 95945 </listOfMyInt>
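
A list type can itself be used as the base of a restriction. The sketch below is modeled
loosely on the state list example in the W3C Primer and reuses a shortened 'USState'
enumeration. The names 'ThreeUSStates' and 'threeStates' are hypothetical. Note that the
'length' facet, when applied to a list, counts list items rather than characters.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- atomic item type, shortened to three states for the sketch -->
  <xsd:simpleType name="USState">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="AK"/>
      <xsd:enumeration value="AL"/>
      <xsd:enumeration value="AR"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- a list whose items are USState values -->
  <xsd:simpleType name="USStateList">
    <xsd:list itemType="USState"/>
  </xsd:simpleType>

  <!-- hypothetical: the list must hold exactly three items -->
  <xsd:simpleType name="ThreeUSStates">
    <xsd:restriction base="USStateList">
      <xsd:length value="3"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="threeStates" type="ThreeUSStates"/>

</xsd:schema>

<!-- a conforming instance: <threeStates>AK AR AL</threeStates> -->
```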
 

Union Types

While a list defines an aggregation of several values of one type, a union adds
a level of complexity, allowing the creation of types whose values may be drawn
from several atomic types or list types. A union type is always a derived type and
is normally made up of two or more 'member' types.

The following example of a union type is from W3Schools. Notice the union type
element is composed from the two simple type definitions that follow it, 'sizebyno'
and 'sizebystring'.
 

W3Schools Example

<xsd:element name="jeans_size">
  <xsd:simpleType>
    <!-- notice the space separated member types -->
    <xsd:union memberTypes="sizebyno sizebystring"/>
  </xsd:simpleType>
</xsd:element>

<xsd:simpleType name="sizebyno">
  <xsd:restriction base="xsd:positiveInteger">
    <xsd:maxInclusive value="42"/>
  </xsd:restriction>
</xsd:simpleType>

<xsd:simpleType name="sizebystring">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="small"/>
    <xsd:enumeration value="medium"/>
    <xsd:enumeration value="large"/>
  </xsd:restriction>
</xsd:simpleType>

In the above example, legal values for the union type are an integer value from
1 to 42, inclusive, or one of "small", "medium" or "large".
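
The member types of a union do not have to be named. As a sketch of an alternative
spelling, not taken from W3Schools, the same two member types can be nested directly
inside the 'union' element as anonymous simple types.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="jeans_size">
    <xsd:simpleType>
      <xsd:union>
        <!-- first anonymous member type: a positive integer up to 42 -->
        <xsd:simpleType>
          <xsd:restriction base="xsd:positiveInteger">
            <xsd:maxInclusive value="42"/>
          </xsd:restriction>
        </xsd:simpleType>
        <!-- second anonymous member type: one of three fixed strings -->
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:enumeration value="small"/>
            <xsd:enumeration value="medium"/>
            <xsd:enumeration value="large"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:union>
    </xsd:simpleType>
  </xsd:element>

</xsd:schema>

<!-- accepted: <jeans_size>34</jeans_size> and <jeans_size>medium</jeans_size>
     rejected: <jeans_size>44</jeans_size> and <jeans_size>XL</jeans_size> -->
```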



All, Choice & Sequence Groups



Recall the <complexType> element required a 'compositor' element, one of
'sequence', 'all' or 'choice'. Compositors get their name from the fact that
they create groups of elements.


Choice Groups

XML Schema language offers the ability to choose which element is shown
in an instance document using the 'choice' element. The choice group element
allows only one of its children to appear in an instance. (Note 'choice' allows
selecting between child elements while the earlier 'enumeration' element was
used in simpleTypes to provide a choice of values for the type.) Following is
an example that allows the choice of either an 'Air', 'Rail' or 'Sea' element.


Example

<xs:element name="carrier">
  <xs:complexType>
    <xs:choice>
      <xs:element ref="Air"/>
      <xs:element ref="Rail"/>
      <xs:element ref="Sea"/>
    </xs:choice>
  </xs:complexType>
</xs:element>
. . . .
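
The 'ref' attributes in a choice group point at element declarations made globally, that is,
as direct children of the schema element. A self-contained sketch might look as follows.
The string types given to 'Air', 'Rail' and 'Sea' are an assumption made only so that the
schema is complete.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- global declarations that the ref attributes point to; string types assumed -->
  <xs:element name="Air"  type="xs:string"/>
  <xs:element name="Rail" type="xs:string"/>
  <xs:element name="Sea"  type="xs:string"/>

  <xs:element name="carrier">
    <xs:complexType>
      <xs:choice>
        <xs:element ref="Air"/>
        <xs:element ref="Rail"/>
        <xs:element ref="Sea"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>

<!-- valid:   <carrier><Rail>overnight freight</Rail></carrier>
     invalid: <carrier><Air>express</Air><Sea>bulk</Sea></carrier>  (two children) -->
```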

      

The  'all'  Element

The <all> element  allows elements to occur in any order.  The elements in
an all group appear as dictated by minOccurs and maxOccurs. In the default
scenario, where both attributes are set to ' 1 ', all elements must appear though
in any order. If 'minOccurs' is set to zero in an element, this makes this element
optional. The 'maxOccurs' attribute cannot be greater than ' 1 '. In other words,
no element in the content model can appear more than once.

The following W3C example would permit child elements to appear in any order,
with the comment element appearing optionally.

An all group Example

<xsd:complexType name="PurchaseOrderType">
  <xsd:all>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items"  type="Items"/>
  </xsd:all>
  <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
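
To see the effect of an all group on instance documents, consider this smaller sketch.
The element names 'contact', 'phone', 'email' and 'fax' are hypothetical and unrelated to
the W3C purchase order.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="contact">
    <xsd:complexType>
      <xsd:all>
        <xsd:element name="phone" type="xsd:string"/>
        <xsd:element name="email" type="xsd:string"/>
        <!-- minOccurs="0" makes this child optional -->
        <xsd:element name="fax"   type="xsd:string" minOccurs="0"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>

<!-- both orderings are valid:
       <contact><phone>555 0100</phone><email>a@example.com</email></contact>
       <contact><email>a@example.com</email><phone>555 0100</phone></contact>
     invalid, because 'phone' appears twice:
       <contact><phone>1</phone><phone>2</phone><email>a@example.com</email></contact>
-->
```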

 

The attributeGroup Element

It is often convenient to group a set of attributes together and then
reference the group into an element. The following attribute group
shows three attributes, that together form a complicated data structure.
Bundling them into a group that can be used in different sorts of similar
elements may provide a convenient way to keep schema code readable.

<!-- attributeGroup replaces a set of individual declarations -->


W3C Example of an attributeGroup Element Definition

<xsd:attributeGroup name="ItemDelivery">

<!-- an  attribute based on a reference to an externally defined attribute type -->
  <xsd:attribute name="partNum"  type="SKU" use="required"/>

<!-- a regular simple attribute definition -->
  <xsd:attribute name="weightKg" type="xsd:decimal"/>

<!-- a custom attribute type based on a restriction to an enumeration -->
  <xsd:attribute name="shipBy"> 
    <xsd:simpleType>

     <xsd:restriction base="xsd:string">
      <xsd:enumeration value="air"/>
      <xsd:enumeration value="land"/>
      <xsd:enumeration value="any"/>
     </xsd:restriction>
    </xsd:simpleType>
  </xsd:attribute>

</xsd:attributeGroup>

 
The line that references the definition above into an element reuses
the attributeGroup element in conjunction with the 'ref' attribute.

W3C Excerpt Showing an Attribute Group Being Referenced
. . .
<xsd:attributeGroup ref="ItemDelivery"/>
. . .
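
To show where such a reference sits, here is a sketch of a complete, if simplified, schema.
The attribute group is cut down to two attributes, and the 'item' element with its
'productName' child is an assumption made for illustration. The group reference is placed
after the content model, in the same position an ordinary attribute declaration would take.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- simplified version of the attribute group, for the sketch only -->
  <xsd:attributeGroup name="ItemDelivery">
    <xsd:attribute name="weightKg" type="xsd:decimal"/>
    <xsd:attribute name="shipBy"   type="xsd:string"/>
  </xsd:attributeGroup>

  <xsd:element name="item">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="productName" type="xsd:string"/>
      </xsd:sequence>
      <!-- pulls in every attribute declared in the group -->
      <xsd:attributeGroup ref="ItemDelivery"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>

<!-- a conforming instance:
     <item weightKg="2.5" shipBy="air"><productName>Baby Monitor</productName></item> -->
```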

Annotations

XML Schema provides three elements for annotating schemas for the benefit of
human readers and applications.

The annotation element - The 'annotation' element is the parent element of the
'documentation' and 'appinfo' elements. The 'documentation' and 'appinfo'
elements are nested inside an annotation element.

The documentation element - The documentation element is recommended
for providing humanly readable material. It is also recommended that the xml:lang
attribute is used to indicate the language of this information. You may also indicate
the language of all information in a schema by placing an xml:lang attribute on the
schema element.

Example <schema xml:lang="en">   //  language for all schema information

The appinfo element - The 'appinfo' element (spelled entirely in lowercase in the
schema vocabulary) provides information for applications that may be associated
with processing the XML document, such as stylesheets or graphics programs.

An annotation will often appear at the beginning of a schema construction. Following
is a W3C example that shows annotation elements with enclosed documentation
elements used to comment the internationalPrice element and its anonymous
complex type.
 

Annotations in Element Declaration & Complex Type Definition

<xsd:element name="internationalPrice">

 <xsd:annotation>
  <xsd:documentation xml:lang="en">
      element declared with anonymous type
  </xsd:documentation>
 </xsd:annotation>

 <xsd:complexType>

  <xsd:annotation>
   <xsd:documentation xml:lang="en">
       empty anonymous type with 2 attributes
   </xsd:documentation>
  </xsd:annotation>

  <xsd:complexContent>
   <xsd:restriction base="xsd:anyType">
    <xsd:attribute name="currency" type="xsd:string"/>
    <xsd:attribute name="value"    type="xsd:decimal"/>
   </xsd:restriction>
  </xsd:complexContent>
 </xsd:complexType>
</xsd:element>
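
The example above uses only the documentation element. As a sketch of the other child of
an annotation, the following declaration adds application information as well. In the schema
vocabulary the element name is written in lowercase as 'appinfo'. The 'hint' element, its
'widget' attribute and the URI in the 'source' attribute are hypothetical. A schema processor
does not interpret the content and simply makes it available to applications.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="internationalPrice" type="xsd:decimal">
    <xsd:annotation>
      <!-- for human readers -->
      <xsd:documentation xml:lang="en">
          price expressed in a foreign currency
      </xsd:documentation>
      <!-- for applications; the content here is a made-up display hint -->
      <xsd:appinfo source="http://www.example.com/display-hints">
        <hint widget="currencyField"/>
      </xsd:appinfo>
    </xsd:annotation>
  </xsd:element>

</xsd:schema>
```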



  Self Test


1) True or False? The root element of a schema document is the schema
    element.   True \ False


2) True or False? A simple element can contain text and attributes.
    True \ False


3) True or False? The default value for minOccurs and maxOccurs is 1.
    True \ False


4) True or False? Attributes, like simple elements, use minOccurs and
    maxOccurs to control occurrences.   True \ False


5) True or False? Default values only make sense if attributes are optional.
    True \ False


6) True or False? The union element can be composed of different sorts of
    simple and complex types.   True \ False


7) True or False? Declaring an element that has an attribute and contains a
    simple value would require something like the following.   True \ False

<xsd:element name="message" type="xsd:string"/>


8) What is the attribute that allows an element to support mixed content?
    ________



Exercise



Create a schema to govern a client record for a company.

The record itself will be a complex element with attributes
that specify a date the record was created, and an attribute
that holds a unique identifier. ( This might be created as
an attribute element that is typed to the built in 'ID' type.)

Nested in the record element will be a complex element that
contains a sequence of elements representing a client's first
name, an initial and a last name. Make the initial an optional
element utilizing the 'minOccurs' attribute.

Create a second complexType element that holds address
information. This element should contain elements that
reference elements named 'street', 'city', 'country' and 'postalCode'. 

The 'street' and 'city' elements will be simple elements that use
the built-in schema string type. The 'country' element will be a
simple type derived by restriction that allows an enumeration of
abbreviated tokens limited to countries in North America.
The postalCode element will be a complexType element that
offers a choice of two simple elements that represent a ZIP or
POSTAL code.

// You can do the ZIP and POSTAL code as simple string types
// or if you wish use a derived simpleType and the pattern facet
// to limit characters to those appropriate for each format.

Create an element using an 'all' element to classify this client
as a cash customer, a private account holder, a corporate
representative or all three. If a corporate representative, the
name of the company should be specified.

// an all group with all elements marked optional including
// company name would allow combinations of elements
// to be specified.

Use schema shells described earlier in the note to complete your
schema document. Create an instance document that adheres to
your schema definition and validate the document using any of the
validation methods suggested.

If this is all mystifying and you are quite new to XML you may
use the following summary to help you organize this assignment.


Summary of Requirements

Client_Record   date_attribute     ID_attribute    // root element

    name element                    // a complex type
        sequence
            first name
            initial                 // optional
            last name
        // closing tags

    address element
        sequence
            element ref="street"
            element ref="city"
            element ref="country"
            element ref="postalCode"

    element street
    element city
    element country                 // simple derived type based on enumeration facet
    element postalCode              // choice of two simple or derived simple types
        // closing tags

    clientType
        all
            cash
            private account
            corporate account
            companyName
        // closing tags