WS_Schema

XML Schema Peter Komisar latest version .5.1 _©

references: 'The XML Schema Primer', http://www.w3.org/TR/xmlschema-0; 'XML Schema
Part 1: Structures', http://www.w3.org/TR/xmlschema-1; 'XML Schema Part 2:
Datatypes', www.w3.org/TR/2001/REC-xmlschema-2-20010502/; examples from the
W3Schools Web site, www.w3Schools.com; 'XML and Web Services Unleashed', R.
Schmelzer et al.; 'XML in a Nutshell', E.R. Harold & W.S. Means; 'Professional XML
Schemas', J. Duckett et al.

Overview

In May 2001 the W3C published its recommendation for the XML Schema Definition
Language. The specification allows the creation of different sorts of simple and complex
XML elements that govern typing in an associated XML 'instance' document. The
recommendation also introduces a large set of data types that allow data in an XML
document to be strongly typed.

Official Documents

In addition to general information that is available at http://www.w3c.org/TR/Schema,
there are three documents frequently cited as sources and official documents for the
XML schema recommendation. They are listed below.

The XML Schema Primer (Part 0) -http://www.w3.org/TR/xmlschema-0/
The XML Schema Structures (Part 1) -http://www.w3.org/TR/xmlschema-1/
The XML Schema Datatypes (Part 2) -http://www.w3.org/TR/xmlschema-2/

What is a Schema?

Definition: A schema is a set of rules that is used to govern data structure and content.

The schema Element

The <schema> Element

The root element of an XML schema document is the 'schema' element.
The schema element houses the set of schema elements that are used
to create an XML schema definition.

The following example shows a 'skeleton' schema. Notice in this example
the official namespace for the World Wide Web Consortium's XML schema
language is specified and assigned to the namespace prefix 'xs'. This is a
hallmark of a schema document. Also shown is the 'targetNamespace'
attribute which holds a URI that represents the namespace that will be
associated with this particular schema.

A Skeleton Form of the 'schema' Element

<!-- xmlns:xs is the schema language namespace; targetNamespace is the
     namespace that this schema will be associated with -->
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.com/example">
. . .
</xs:schema>

A complete schema may be represented in XML by one or more 'schema documents',
that is, one or more <schema> element information items.

// points out that a schema can be comprised of a { set } of schema documents

Complete Schema Form // reference

// the 'official' form of the 'schema' element published in the W3C Schema Structures document,
// http://www.w3.org/TR/xmlschema-1/#Schemas

Form of the schema Element

// quoted from the W3C Schema Part 1 document at http://www.w3.org/TR/xmlschema-1/#Schemas

<schema
attributeFormDefault = (qualified | unqualified) : unqualified
elementFormDefault = (qualified | unqualified) : unqualified
blockDefault = (#all | List of (extension | restriction | substitution)) : ''
finalDefault = (#all | List of (extension | restriction)) : ''
// the single quotes form an empty string and provide the default value
// the first four attributes are global to a document and control various defaults

id = ID
version = token
// id & version attributes are optional and "for user convenience"

targetNamespace = anyURI
// associates the schema's element definitions with a namespace

xml:lang = language
{any attributes with non-schema namespace . . .}>

Content: ((include | import | redefine | annotation)*, // describes different types the document may hold
(((simpleType | complexType | group | attributeGroup) | element | attribute | notation), annotation*)*)
</schema>

The schema element is a skeleton container for its components, which may be sets of:

// may include imported definitions and nested elements

simple & complex, type definitions // <complexType> and <simpleType> elements
'top-level' attribute declarations // <attribute> elements
'top-level' elements // <element> elements
attribute groups // <attributeGroup>
model group definitions // <group>
notation elements // <notation>
annotation information items // <annotation>

// <include> and <import> elements are not schema constituent components; rather,
// they are directives that serve the same purpose as in programming languages:
// to bring in definitions that are defined externally to the local document
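
As a rough sketch of how these directives might appear, assume two hypothetical
companion files: common-types.xsd, which shares this schema's target namespace,
and other.xsd, which defines a different namespace, http://www.example.org/other.
Neither file nor the second namespace comes from the W3C examples; they are
placeholders for illustration only.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.com/example">

  <!-- include: brings in definitions from a schema document that has the
       same target namespace (the file name is hypothetical) -->
  <xs:include schemaLocation="common-types.xsd"/>

  <!-- import: makes components from a different namespace available
       (the namespace and file name are hypothetical) -->
  <xs:import namespace="http://www.example.org/other"
             schemaLocation="other.xsd"/>

  <!-- local declarations follow the directives -->
  <xs:element name="elem1" type="xs:string"/>
</xs:schema>
```

Note that the directives come first, matching the content model shown in the
W3C form above.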

XML Schema Namespaces

Schema has three namespaces that are 'hallmarks' of the application and will
typically be present in schema and schema instances.

"http://www.w3.org/2001/XMLSchema" // xmlns:xs="http://www.w3.org/2001/XMLSchema" or with the xsd prefix

The XMLSchema namespace is used to identify elements that are part of the schema application
itself. These are the widgets XML Schema language uses such as the <schema>, <element>,
<attribute> and <complexType>. This namespace definition appears in the schema document.

"http://www.w3.org/2001/XMLSchema-instance" // xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

The XMLSchema-instance namespace is found in XML instance documents. It is
most often used to qualify the 'xsi:schemaLocation' attribute or the
'xsi:noNamespaceSchemaLocation' attribute. It is also used to specify a
'nil' value, in the form xsi:nil. This form explicitly states that there is no entry
for a field. (This differs from leaving the element empty, which supplies an empty
string, and an empty string may not be a valid value for the element's type.) The
Schema instance namespace is also used with the 'xsi:type' attribute to specify a
derived type that a schema defines.

Following is an example of how the xsi:nil attribute is used.

W3C Example <xs:element name="shipDate" type="xs:date" nillable="true"/>
// boolean attribute 'nillable' set to true

A subsequent 'shipDate' tag in a document instance can then have its 'nil' value set
to true.

W3C Example <shipDate xsi:nil="true"></shipDate>
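
The 'xsi:type' attribute mentioned earlier can be sketched in a similar way. The
fragment below is only an illustration: the type names 'Vehicle' and 'Truck' and
the element names are hypothetical, and the schema is assumed to declare no
target namespace so that the type names need no prefix.

```xml
<!-- schema fragment: 'Truck' is derived from 'Vehicle' by extension
     (both type names are hypothetical) -->
<xs:complexType name="Vehicle">
  <xs:sequence>
    <xs:element name="make" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="Truck">
  <xs:complexContent>
    <xs:extension base="Vehicle">
      <xs:sequence>
        <xs:element name="payload" type="xs:decimal"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:element name="vehicle" type="Vehicle"/>

<!-- instance fragment: xsi:type asks the processor to validate this
     element against the derived type 'Truck' -->
<vehicle xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:type="Truck">
  <make>Acme</make>
  <payload>2.5</payload>
</vehicle>
```

Without the xsi:type attribute the 'payload' child would be rejected, because the
declared type 'Vehicle' only allows 'make'.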

// A third namespace, xmlns:sxdatatypes="http://www.w3.org/2001/XMLSchema-datatypes",
// is defined that may be used in place of the general XMLSchema namespace for qualifying
// built-in data types. It is a specialized namespace that could be used to represent Schema
// data types in applications other than XML schema.

Schema Samples

The following W3Schools example is a skeleton of a more typical form that the
schema tag might take.

Example 1 From the W3Schools Website

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">

</xs:schema>

Following is an example of a simple schema and a corresponding document
instance that has been verified against the latest standards. (This time we
have a simple element included.)

Example 2 // save to SomeName.xsd

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="www.example.com"
           elementFormDefault="qualified"

           >
    <xs:element name="elem1" type="xs:string"/>
</xs:schema>

Next we show the corresponding 'instance' document.

Example 2 XML Instance for the above Schema // was saved as SomeName.xml

<?xml version="1.0"?>

<elem1 xmlns="www.example.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="www.example.com SomeName.xsd">
It's Schema Time!
</elem1>

// again note that the default namespace provided brings the instance into
// 'qualification' with the target namespace specified

Although namespace names serve only as identifiers, the W3C would prefer that
you use real URIs. To this end some domain names have been reserved for use in
examples and documentation. They are listed below.

Reserved Pages http://www.example.com
http://www.example.net
http://www.example.org

Following is a complete W3Schools Example of a Short XML Schema.
By convention, this file would be saved to a file with an .xsd extension.

// Like DTDs, schema has its own file extension by convention. Unlike
// DTDs, schemas are 'real' XML documents, written in XML.

A Short Schema Example From W3Schools

<?xml version="1.0"?> 
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">

<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>

</xs:schema>

( If you are going to test run this code, make sure you use the same
name as is referenced in the instance document. Below it is called
note.xsd )

The XML Instance Document

Following is the W3Schools document that references the above XML Schema.

A W3Schools XML Example Implementing the Schema

This XML document has a reference to an XML Schema:

<?xml version="1.0"?>
<note
xmlns="http://www.w3schools.com"                               
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.w3schools.com     note.xsd"> 

<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Notes on the Instance Document

Inside the instance document's root element, <note>, there are two namespace
assignments and a 'schemaLocation' attribute.

1) The instance document has created the 'xsi' namespace prefix that represents
     the special W3C namespace, "http://www.w3.org/2001/XMLSchema-instance"
     used in conjunction with the 'schemaLocation' attribute.

     (Alternatively, there is a 'noNamespaceSchemaLocation' attribute that can be
      used when no namespace has been declared in the schema document. )

Example xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

2) Both the schema and the instance have created a default namespace
for their documents.

Example xmlns="http://www.w3schools.com"

Effectively, this makes all elements which are not prefixed, 'qualified' and
belonging to the default namespace. (This keeps it in sync with the assignment,
elementFormDefault="qualified" , found in the corresponding schema.)

3)    Once created, we use the 'xsi' namespace prefix in conjunction with the
      'schemaLocation' attribute to specify the actual location of the schema.
      This is shown in the example below that extracts the two lines
      that show this aspect of the instance document.

Example

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com note.xsd">

4) The 'schemaLocation' attribute is specified in two parts. This can be seen
      in the second line of the above example. This points to what might seem
      at first glance to be a slightly ambiguous part of the Schema Language
      specification. Although the pair looks like a path / file format, the XML
      Schema specification does not require that the two identifiers be related
      as locations. The first value is a namespace name and the second is a hint
      as to where a schema document for that namespace can be found. For
      example, if the schema's targetNamespace were http://www.example.net, the
      following would pair that namespace with a local file that has nothing to
      do with the web address. It is really up to the parser to interpret these
      identifiers.

Example xsi:schemaLocation="http://www.example.net note.xsd">

      The actual XML Schema specification states that the second value is only a
      "hint" as to where the parser can find the schema. It is reasonable to
      expect that the namespace name in some way suggests where the 'note.xsd'
      file will be found but there is no obligation to be so accommodating!
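
The 'noNamespaceSchemaLocation' alternative mentioned in point 1 takes a single
value rather than a pair. The following is a minimal sketch, assuming a variant of
note.xsd that declares no targetNamespace; that variant is hypothetical and is not
the W3Schools file shown above.

```xml
<?xml version="1.0"?>
<!-- assumes note.xsd declares no targetNamespace, so the elements here
     are in no namespace and no default xmlns is declared -->
<note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```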

Global Elements & Attributes // global is the scope of the schema element

Global elements are children of the 'schema' element. Because attribute
declarations are themselves written as <attribute> elements, the same can be
said about attributes. Attributes are global when they are children of the
schema element.

Global elements or attributes can be referenced using the 'ref' attribute.
The following is an incomplete example that emphasizes the use of the
'ref' attribute.

Example        . . . .
                    <xs:element ref="History"/>
                    // element references global 'History' element
                    . . . .
                    <xs:element name="History" >
                         <xs:complexType>
                               <xs:sequence >
                       // etc.
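
A complete, if simplified, version of the same idea might look like the sketch
below. It assumes a schema with no target namespace, gives 'History' a plain
string type instead of the complex type started above, and introduces a
hypothetical 'Course' element as the place where the reference is made.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- global declaration: a direct child of xs:schema -->
  <xs:element name="History" type="xs:string"/>

  <!-- 'Course' is a hypothetical container element -->
  <xs:element name="Course">
    <xs:complexType>
      <xs:sequence>
        <!-- reuses the global 'History' declaration instead of redeclaring it -->
        <xs:element ref="History"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```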

Where to go to validate

Java Validating Parsers

1) The Java world now supports a unifying architecture for parsing and transforming
XML documents, called JAXP or the Java API for XML Processing. Within this
context several parsers can be made available that are able to validate schema.
Because JAXP is our next topic we will leave looking at validating with the JAXP
API to the next note.

DecisionSoft's Online Validation Tool

2) If you don't have the Java JDK or Xerces on your machine you can access
the same functionality by using DecisionSoft's online tool that runs your documents
against the Xerces parser for you. Go to the following link and use the browsing
functions to find your files.

DecisionSoft's Schema Validation Page

http://tools.decisionsoft.com/schemaValidate.html

Microsoft's XML Validation Page

3) Microsoft has an XML validation page (which most of you have already found; a link
    will be added later). It is important that you are using IE6 or have downloaded the
    MSXML 4.x package which supports the latest version of the W3C's standards.
    The MSN XMLValidator is available at the Microsoft Developer Network.

http://msdn.microsoft.com/downloads/samples/internet/xml/xml_validator/default.asp?frame=true

XMLWriter

4) XMLWriter, as was pointed out in class, is based on Microsoft's XML engine, so
for it to do the most recent forms of validation, it too must have the MSXML 4.x
package present.

XMLSpy

5) XMLSpy is an all-in-one dedicated XML application that is a leader in the field. IBM also
supplies validation parsers which can be found at their web site.

Complex & Simple Types

Simple & Complex Types

An XML document will consist of a main element and sub-elements.
Sub-elements may in turn contain other sub-elements. In XML schema
descriptions, elements that contain sub-elements or carry attributes are
generic complex types. Elements that contain primitive data (numbers,
strings, dates etc.) and that do not contain any sort of sub-element or
attribute are considered generic simple types. Attributes themselves are
represented as simple types. It follows then that only complex types can
have attributes.

// If the element will just hold primitive data --> they are simple type
// If an element has attributes or other elements --> then they are complex types
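
The distinction can be sketched with two declarations. The element and attribute
names ('price', 'pricedItem', 'currency') are hypothetical; the second declaration
shows that adding even one attribute to a text-only element forces it to become a
complex type.

```xml
<!-- simple type: text only, no attributes -->
<xs:element name="price" type="xs:decimal"/>

<!-- complex type with simple content: still holds only a decimal value,
     but the 'currency' attribute makes the type complex -->
<xs:element name="pricedItem">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
```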

A <complexType> element is involved in creating a generic complex type
element. There is a parallel situation that exists for generic simple types
where a <simpleType> element is used.

We begin by describing the simple element type that is used to create
basic definitions in our schema specifications.

Simple Elements // contains text data and not other elements or attributes

A simple element is an XML element that can only contain text. It cannot contain
any other elements or attributes. "Only text" in this context means a value of one of a
number of different types, such as the 'built-in' types 'token', 'string' or 'boolean'.
Custom types or types derived from built-in basic types can also be used to
specify content type. Following is the base syntax for a simple element.

Simple Element Form <xs:element name="xxx" type="yyy"/>

Following are three simple elements as they might appear nested inside
an XML Schema document.

Simple Element Examples
         . . . .
        <xs:element name="treeType" type="xs:string" />
        <xs:element name="leafPoints" type="xs:integer"/>
        <xs:element name="idDate" type="xs:date" />
         . . . .

The above schema definitions would map to corresponding instance
elements such as the following.

Corresponding Elements in Instance Document

         . . . .
        <treeType> Basswood </treeType>
        <leafPoints> 1       </leafPoints>
        <idDate> 2005-06-01 </idDate>
         . . . .

// We will see later that simple elements can be further 'extended' or 'restricted'
// to create custom sets of simple types, using the 'simpleType' element.

Complex Elements

Several XML schema elements work together to create complex types. Complex
types may contain nested elements and attributes. They may also be formulated
to create a 'mixed' form that contains elements mixed with text. The complex type
is hallmarked by the presence of the <complexType> tag.
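
The 'mixed' form can be sketched as follows. The element names ('letter', 'name',
'orderId') are hypothetical, and the fragment assumes a schema with no target
namespace.

```xml
<!-- schema fragment: mixed="true" allows text between the child elements -->
<xs:element name="letter">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="orderId" type="xs:positiveInteger"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<!-- a matching instance fragment: text interleaved with child elements -->
<letter>Dear <name>Alice Smith</name>, your order
  <orderId>1032</orderId> has shipped.</letter>
```

The child elements must still appear in the order the sequence dictates; only the
surrounding text is unconstrained.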

The complexType element - The <complexType> element acts as a container
for a set of elements which may include attribute declarations. The following
skeleton example is simplified by leaving out nested elements and shows a
named version of the element called 'skeleton'. This name can be assigned
to the type attribute of an element. (Note that the 'xsd' prefix is an alternative to the
'xs' prefix conventionally used in XML schema namespace declarations.
It is increasingly being replaced by the shorter 'xs' prefix, which you also see in
many examples.)

The following example is also referred to as a complex type declaration.

Example of a Named Form of a Complex Type Declaration

<xsd:complexType name="skeleton" >

</xsd:complexType>

The above example is the named form of the complexType element; the name
here is "skeleton". This allows the type to be used as a template that can be
referenced at some other point in the schema document, as is shown in the
second part of the example below. Notice the 'type' attribute is used to specify
what the type of the element named 'PostalExtension' will be.

Complex Element Example

// elsewhere in the schema
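
A minimal sketch of such a reference, built from the names given in the text
('skeleton' and 'PostalExtension'), might look like this. It assumes the schema
has no target namespace, so the type name needs no prefix.

```xml
<!-- the named type, defined once as a child of the schema element -->
<xsd:complexType name="skeleton">
  <!-- nested elements and attributes omitted -->
</xsd:complexType>

<!-- elsewhere in the schema: an element whose type is 'skeleton' -->
<xsd:element name="PostalExtension" type="skeleton"/>
```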

The Anonymous or Unnamed Form

Elements are frequently composed with anonymous or 'inlined' 'complexType'
elements nested inside them. The following example from 'XML in a Nutshell'
shows this form. Notice, the <complexType> element in the example has no
name.

Abbreviated Example of the Anonymous Form of a Complex Element

// from 'XML in a Nutshell', Harold & Means

<xs:element name="fullName">
<xs:complexType> // here the complexType is not named and is 'inlined' into another element
<xs:sequence>
<xs:element name="first" type="addr:nameComponent"/>
<xs:element name="last" type="addr:nameComponent"/>
</xs:sequence>
</xs:complexType>
</xs:element>

The next example shows an inlined complexType element that is made
up of simple elements that use built-in schema types.

Example 2

<xs:element name="root">
   <xs:complexType>
       <xs:sequence>
        <xs:element name="treeType" type="xs:string" />
        <xs:element name="leafPoints" type="xs:integer"/>
        <xs:element name="idDate" type="xs:date" />
       </xs:sequence>
   </xs:complexType>

    </xs:element>

The 'element' element - In many examples so far we have seen the <element>
element. It is used both as a nesting unit and as a base to house different
complex types. Consider the inlined example above where the <complexType>
element is wrapped inside a named element while at the same time there are
nested simple type elements.

Example of an Element Element

<xsd:element name="name" type="xsd:string"/>

The 'attribute' element - The 'attribute' element is used to create attribute values
for the 'complexType'. In the following attribute example, the attribute value is fixed
and must be "US". Notice that the attribute type is a predefined, XML schema
simple type called NMTOKEN. All attribute declarations must reference simple
XML data types. (As just mentioned we will see how we custom define simple
types later.) Unlike element declarations, attributes must be simple and cannot
contain other elements or other attributes.

Example of an Attribute Element

<xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>

By looking at the following example of a complex type you can see all three
element types used in conjunction with each other. The example includes
the 'sequence' element which contains an ordered grouping of elements.
The elements grouped in a sequence must appear in the exact order that
they are specified within the 'sequence' element of the schema. This has
an effect equivalent to an 'Element Only' definition in a DTD.

W3 Example of a ComplexType Declaration // from the W3C example

<xsd:complexType name="USAddress" >
<xsd:sequence>
   <xsd:element name="name"   type="xsd:string"/>
   <xsd:element name="street" type="xsd:string"/>
   <xsd:element name="city"   type="xsd:string"/>
   <xsd:element name="state" type="xsd:string"/>
   <xsd:element name="zip"    type="xsd:decimal"/>
</xsd:sequence>
<xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType> 

This definition can then be referenced in another tag from inside the schema
document. For instance, in the W3C Schema example at the end of the note,
the above defined type is referenced as follows.

Complex Type Declaration Referenced Later in Schema

<xsd:element name="shipTo" type="USAddress"/>

The specification that is dictated by this schema element is adhered to by
the corresponding element in the instance document. (This code can be found
in the second part of the W3C example at the end of the note.)

Valid Form of Element Referencing Complex Type in Subsequent Instance Document

<shipTo country="US">
<name>Alice Smith</name>
<street>123 Maple Street</street>
<city>Mill Valley</city>
<state>CA</state>
<zip>90952</zip>
</shipTo>

Occurrence Constraints For Elements

Occurrence constraints decide exactly how many times an element can occur.
This is an improvement over DTDs, which can only limit occurrences of elements
to zero, one or many.

The following table captures a good comparison of DTD cardinality controls
compared to those used in XML Schema.

Table Comparing DTD Cardinality Controls with Values of minOccurs & maxOccurs

// facsimile of a similar table found in 'Professional XML Schemas', J.Duckett et.al, Wrox Press

DTD Cardinality Symbols	minOccurs value	maxOccurs value	Element Occurrences
none // default	1	1	Once & only once
?	0	1	Zero or one
*	0	unbounded	Zero or more
+	1	unbounded	One or more

// unbounded is assigned as a literal value inside double quotes i.e. maxOccurs="unbounded"

Schema Cardinality Controls

'minOccurs' & 'maxOccurs' - The attributes that control element occurrences
are 'minOccurs' and 'maxOccurs'. The default value for both the 'minOccurs' and
the 'maxOccurs' attributes is 1. With 'minOccurs' at the default value, the element
is required to appear at least once. With 'maxOccurs' set to the default value of 1,
the element is allowed to appear at most once. 'maxOccurs' can be set to any
non-negative integer that is not less than 'minOccurs', or to 'unbounded'. By setting
'minOccurs' to 0, the element becomes optional. The next example explicitly states
that this element is optional.

Example <xsd:element ref="comment" minOccurs="0"/>

'unbounded' - The 'unbounded' value can be used with 'maxOccurs' to emulate
the behaviour found in DTDs when * and + are used.
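
The rows of the comparison table can be sketched as local element declarations,
for instance inside a 'sequence'. The element names used here ('comment', 'item',
'referee') are hypothetical.

```xml
<!-- DTD '*' : zero or more -->
<xs:element name="comment" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>

<!-- DTD '+' : one or more (minOccurs defaults to 1) -->
<xs:element name="item" type="xs:string" maxOccurs="unbounded"/>

<!-- no DTD equivalent: at least 2 and at most 4 occurrences -->
<xs:element name="referee" type="xs:string" minOccurs="2" maxOccurs="4"/>
```

The last declaration shows the kind of exact range that DTD cardinality symbols
cannot express.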

The following is a modification of the W3Schools example allowing the note to
specify up to 5 receivers. We have explicitly stated the default value for the
'minOccurs' attribute.

W3Schools Schema Example Modified

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">

<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string" minOccurs="1" maxOccurs="5"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>

</xs:schema>

Matching Schema Instance

<?xml version="1.0"?>
<note
xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com Note.xsd">

<to>Jack</to>
<to>Jill</to>
<to>Elvis</to>
<to>Nephilum</to>
<to>Santa</to>


<from>Elmo</from>
<heading>Clarification</heading>
<body>Concerning Fairy Tales, Myths and Legends</body>
</note>

Attributes Behave Differently

Attributes, on the other hand, can appear once or not at all. They are controlled
with a different syntax. The 'use' attribute can be used with them to make their
appearance 'required', 'optional' or 'prohibited', as shown below.

Attributes

Attribute Simple Form

Attributes have the same form as simple elements. A simple representation
of the form of attributes is as follows.

Simple Form of the XML Schema Attribute

<xs:attribute name="xxx" type="yyy"/>

The following, more imposing W3 representation indicates that attribute definitions
have many exotic features. They may be set to defaults or have fixed values.
Attributes may be qualified or unqualified. They may have a unique identifier,
a name and perhaps may reference another type definition. Their use may
be specified as optional, prohibited or required. They may be annotated
(documented) and may be custom composed using a <simpleType> element.

The W3 Representation of an attribute element // for reference

<attribute
default = string
fixed = string
form = (qualified | unqualified )
id = ID
name = NCName // An NCName as defined by XML-Namespaces
ref = QName
type = QName
use = (optional | prohibited | required) : optional
{any attributes with non-schema namespace . . .}>
Content: (annotation?, (simpleType?))
</attribute>

Example

<xs:attribute name="age" type="xs:positiveInteger" use="required"/>

Default Values for Elements and Attributes

// Attributes can have a default value OR a fixed value specified

Attribute Default Values

In XML Schema Language, there is an actual 'default' attribute which is used
to assign a default value to an attribute. Default values only make sense if
attributes are optional. In fact in XML Schema Language, it is an error to
specify a default for anything but an optional attribute.

Both 'attribute' elements and 'element' elements have the 'default' attribute and
may use it to provide a default value. With attributes, the value used is
whatever is provided in the instance document. If the attribute does not appear
in the instance document, the schema processor provides the default attribute
value that was supplied in the schema.

In other words, a default value is automatically assigned to the attribute when
no other value is specified. In the following example the default value is "EN":

<xs:attribute name="lang" type="xs:string" default="EN"/>

// in the instance document, if the attribute wasn't provided it would default to "EN".

Element Default Values - When an element is declared with a default
value and appears in the instance document with content, it keeps the value
found in its content area. If the element appears without content (is empty),
the schema processor provides the value that is given to the default attribute.
If the element doesn't appear at all in the document, however, the schema
processor doesn't provide the element at all.

The W3C paper summarizes,

" Default attribute values apply when attributes are missing,
and default element values apply when elements are empty."

They might have added, but felt it went without saying, 'nothing is applied
if elements are missing'.
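
The three cases for element defaults can be sketched as below. The declaration
is assumed to sit inside a 'sequence' (hence the minOccurs attribute), and the
'lang' element is a hypothetical counterpart of the attribute example above.

```xml
<!-- schema fragment: an optional element with a default value -->
<xs:element name="lang" type="xs:string" default="EN" minOccurs="0"/>

<!-- instance: element present with content, the value is "FR" -->
<lang>FR</lang>

<!-- instance: element present but empty, the processor supplies "EN" -->
<lang/>

<!-- instance: element absent, no 'lang' element is supplied at all -->
```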

Creating Optional and Required Attributes

The 'use' attribute is used to control whether an attribute may, must or can't
appear in a schema governed document. All attributes are optional by default.
To explicitly specify that the attribute is optional, the "use" attribute is used as
is shown in the following W3Schools example.

W3Schools Example

<xs:attribute name="lang" type="xs:string" use="optional"/>

To make the attribute required the use attribute is assigned the 'required'
value.

Example

<xs:attribute name="lang" type="xs:string" use="required"/>

There is a third value that can be assigned to the 'use' attribute. This is the
'prohibited' value, indicating the attribute cannot appear at all in the parent
element.
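
One place where 'prohibited' tends to be useful is when a complex type is derived
by restriction and an attribute of the base type should no longer be allowed. The
following is a sketch under that assumption; the type names 'Note' and 'PlainNote'
are hypothetical and the schema is assumed to have no target namespace.

```xml
<!-- base type: 'lang' is optional -->
<xs:complexType name="Note">
  <xs:sequence>
    <xs:element name="body" type="xs:string"/>
  </xs:sequence>
  <xs:attribute name="lang" type="xs:string" use="optional"/>
</xs:complexType>

<!-- derived by restriction: same content, but 'lang' may no longer appear -->
<xs:complexType name="PlainNote">
  <xs:complexContent>
    <xs:restriction base="Note">
      <xs:sequence>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="lang" type="xs:string" use="prohibited"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
```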

// testing showed default didn't need to be specified in optional case, which may be
// interpreted as ' optional with no default specified'

The Fixed Attribute

Both attribute and element declarations use the 'fixed' attribute to 'fix' specific
values. We saw this with the country attribute which was declared with the fixed
value, "US". Consider the following example. The use attribute is not specified,
therefore, it has the default value which is 'optional'. Accordingly, if the country
appears it must have the value 'US'. If the country attribute doesn't appear the
schema processor will provide the value, 'US'.

// optional combined with fixed means that the attribute may or may not be
// specified in the instance; in both cases it will have the fixed value.

Example

<xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>

A fixed value is automatically assigned to the attribute. You cannot specify
another value. In the following W3Schools example the fixed value is "EN":

W3Schools Example

<xs:attribute name="lang" type="xs:string" fixed="EN"/>

As noted earlier, providing a fixed value and providing a default value are
mutually exclusive. It is an error to declare both the fixed and the default
attributes in the same declaration.

Schema Data Types Are Used by Attribute & Simple Elements

We saw in the schema we looked at earlier, examples of simple types being
used in both elements and attributes. Simple types are like the primitive types
of the Java programming language. They are the atomic and prime types of the
XML Schema Language. For instance in the following two examples, the first
being an element and the second an attribute, we see the type decimal and the
type NMTOKEN being used.

Examples of Simple Types used in both Element and Attribute

<xsd:element name="zip" type="xsd:decimal"/>

<xsd:attribute name="country" type="xsd:NMTOKEN" />

Built in Schema Data Types

The following table lists the imposing collection of type definitions that are described
in the XML Schema Primer hosted at the W3C site. Although the list is imposing,
you will note that to build an equivalent set of types in a standard programming
language would call for the creation of a library of classes to represent each of
the variations provided. XML Schema does this at the primitive type level.

Simple Types Built In to XML Schema // http://www.w3.org/TR/xmlschema-0

Simple Type	Example (comma delimited)	Notes
string	Confirm this is electric
normalizedString	Confirm this is electric	3
token	Confirm this is electric	4
byte	-1, 126	2
unsignedByte	0, 126	2
base64Binary	GpM7
hexBinary	0FB7
integer	-126789, -1, 0, 1, 126789	2
positiveInteger	1, 126789	2
negativeInteger	-126789, -1	2
nonNegativeInteger	0, 1, 126789	2
nonPositiveInteger	-126789, -1, 0	2
int	-1, 126789675	2
unsignedInt	0, 1267896754	2
long	-1, 12678967543233	2
unsignedLong	0, 12678967543233	2
short	-1, 12678	2
unsignedShort	0, 12678	2
decimal	-1.23, 0, 123.4, 1000.00	2
float	-INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN equivalent to single-precision 32-bit floating point	2
double	-INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN equivalent to double-precision 64-bit floating point	2
boolean	true, false, 1, 0
time	13:20:00.000, 13:20:00.000-05:00	2
dateTime	1999-05-31T13:20:00.000-05:00 ( May 31st 1999 at 1.20pm Eastern Standard Time which is 5 hours behind Co-Ordinated Universal Time )	2
duration	P1Y2M3DT10H30M12.3S 1 year, 2 months, 3 days, 10 hours, 30 minutes, and 12.3 seconds	2
date	1999-05-31	2
gMonth	--05--, May	2, 5
gYear	1999	2, 5
gYearMonth	1999-02 the month of February 1999, regardless of the number of days	2, 5
gDay	---31, the 31st day	2, 5
gMonthDay	--05-31, every May 31st	2, 5
Name	shipTo , XML 1.0 Name type
QName	po:USAddress, XML Namespace QName
NCName	USAddress, XML Namespace NCName i.e. A QName without prefix and colon
anyURI	http://www.example.com/
language	en-GB, en-US, fr, valid for xml:lang as defined in XML 1.0
ID	XML 1.0 ID attribute type	1
IDREF	XML 1.0 IDREF attribute type	1
IDREFS	XML 1.0 IDREFS attribute type	1
ENTITY	XML 1.0 ENTITY attribute type	1
ENTITIES	XML 1.0 ENTITIES attribute type	1
NOTATION	XML 1.0 NOTATION attribute type	1
NMTOKEN	XML 1.0 NMTOKEN attribute type	1
NMTOKENS	XML 1.0 NMTOKENS attribute type i.e. a whitespace separated list of NMTOKENs	1

Notes From the Table

(1) To retain compatibility between XML Schema and XML 1.0 DTDs, the simple types
ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS
should only be used in attributes.

(2) A value of this type can be represented by more than one lexical format, e.g. 100 and
1.0E2 are both valid float formats representing "one hundred". However, rules have been
established for this type that define a canonical lexical format, see XML Schema Part 2.

(3) Newline, tab and carriage-return characters in a normalizedString type are converted to
space characters before schema processing.

(4) As normalizedString, and adjacent space characters are collapsed to a single space
character, and leading and trailing spaces are removed.

(5) The "g" prefix signals time periods in the Gregorian calendar.
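
To see a few of the built-in types at work, the following sketch declares three
elements and then shows one conforming value for each. The element names
'partCount', 'shipped' and 'orderedOn' are hypothetical, chosen only for
illustration, and the 'xsd' prefix is assumed to be bound to the XML Schema
namespace.

Sketch of Elements Typed to Built-In Simple Types

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- hypothetical element names; the types come from the table above -->
  <xsd:element name="partCount" type="xsd:unsignedByte"/>
  <xsd:element name="shipped"   type="xsd:boolean"/>
  <xsd:element name="orderedOn" type="xsd:date"/>
</xsd:schema>

Conforming Values // each line would be the root of its own instance document

<partCount>126</partCount>
<shipped>true</shipped>
<orderedOn>1999-05-31</orderedOn>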

Custom Simple Types

XML Schema language also permits defining custom simple types, which are
derived from the built-in simple types. In fact, many of the 'built-in' types
are themselves derived from more primitive built-in types. A derivation takes
one of three forms: a restriction, a list or a union. The following W3Schools
description of the 'simpleType' element shows that 'restriction', 'list' or
'union' is at the heart of the element's form.

Following is a form description of the simpleType element which is the base
element used to build custom simple types.

W3Schools Form Description of the simpleType element.

<simpleType
        id=ID                    // ID is optional, takes a unique identifier
        name=NCName   // only used if the simpleType is a child of the schema element
        any attributes        // optional, any other attributes
        >
        ( annotation?,( restriction | list | union ) )
</simpleType>

// NCName - a 'non-colonized' name, a name without a prefix.

Restrictions

Custom simple types are enclosed in a <simpleType> element. In the case
of a restriction, a <restriction> element is used, and its 'base' attribute
names the type on which the restriction is built. In the following example,
the 'xs:integer' built-in type is the base. Elements called 'facets' then
restrict the range of integers that are allowed. In the next example the range
is limited to an integer between 100 and 999 inclusive.

Defining myInteger, Range 100-999

<xs:simpleType name="myInteger">
<xs:restriction base="xs:integer">
<xs:minInclusive value="100"/>
<xs:maxInclusive value="999"/>
</xs:restriction>
</xs:simpleType>

In an example based on one from the W3C site, the base of the restriction is the
'xs:string' type, restricted by the 'pattern' facet to a single uppercase
character from A to Z inclusive. The W3C recommendation describes this type
as having been derived by restriction from the simple type 'string'.

Example

<xs:simpleType name="SKU">
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z]"/> // hyphen denotes a range
</xs:restriction>
</xs:simpleType>

The XML Schema 'pattern' element uses a regular expression language which
includes support for Unicode and is described in 'XML Schema Part 2',
http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#regexs
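
The Primer's own SKU definition uses a fuller pattern than the single character
class shown above. A sketch of that form follows, wrapped in a schema element so
that it stands alone. The regular expression reads as three digits, a hyphen,
then two uppercase letters, so a value such as 926-AA would be expected to
conform while 926-aa would not.

Sketch of a Fuller pattern Facet

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="SKU">
    <xsd:restriction base="xsd:string">
      <!-- \d{3} is three digits, then a literal hyphen, then [A-Z]{2} is two uppercase letters -->
      <xsd:pattern value="\d{3}-[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>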

One more example of a restriction based on the string built-in type is listed
below. Here the facet is 'enumeration', and the strings provided
are abbreviations for states. A set of 'enumeration' elements provides
the choices from which a single value can be selected.

Enumeration Facet Example

<xsd:simpleType name="USState">
<xsd:restriction base="xsd:string">
    <xsd:enumeration value="AK"/>
    <xsd:enumeration value="AL"/>
    <xsd:enumeration value="AR"/>
    
</xsd:restriction>
</xsd:simpleType>

Facets

Using Facets in Restrictions

The above examples use 'minInclusive', 'maxInclusive', 'pattern' and 'enumeration'
elements to apply limits to the restriction being created. Each of these is a
member of a special class of elements called 'facets'. Facets are used to fine
tune the specification of a custom simple type. For those who like formal
statements, the following two statements are each a part of the official definition
of a facet and a value space.

Abbreviated W3C Formal Definition of a Facet and a Value Space

Definition: A 'facet' is a single, defining aspect of a value space. A 'value
space' is in turn defined as the set of values for a given data type.

// paraphrase

A facet is a defining aspect of a value or set of values.

XML Schema Part 2 defines 12 constraining facets that can be applied to
simple types, along with five 'fundamental' facets that describe a value
space rather than restrict it. Each constraining facet is written as an element
nested inside a restriction. Facets allow greater control over the specificity
of definitions for simple types. Following is a list of the constraining facets
in alphabetical order.

XML Schema Facets

enumeration // a list of acceptable values
fractionDigits // the max. # of decimal places allowed *
length // exact # of characters or list items allowed *
maxExclusive // upper bound for numeric values ( must be less than this value )
maxInclusive // upper bound for numeric values ( must be less than or equal to this value )
maxLength // max. # of characters or list items allowed *
minExclusive // lower bound for numeric values ( must be greater than this value )
minInclusive // lower bound for numeric values ( must be greater than or equal to this value )
minLength // the min. # of characters or list items allowed *
pattern // a regular expression that acceptable values must match
totalDigits // the max. # of digits allowed +
whiteSpace // white space handling (line feeds, tabs, spaces, and carriage returns)

// * zero or greater, + greater than zero
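
Several facets can be combined in a single restriction. The following sketch is
hypothetical (the type names 'Price' and 'CountryCode' are invented for
illustration) and assumes the 'xsd' prefix is bound to the XML Schema namespace.

Sketch Combining Several Facets

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- hypothetical: a non-negative decimal with at most 6 digits overall
       and at most 2 digits after the decimal point -->
  <xsd:simpleType name="Price">
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="0"/>
      <xsd:totalDigits value="6"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- hypothetical: a string of exactly two characters -->
  <xsd:simpleType name="CountryCode">
    <xsd:restriction base="xsd:string">
      <xsd:length value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>

Under these definitions 1234.56 would be expected to conform to 'Price' while
1234.567 would not, and "US" would conform to 'CountryCode' while "USA" would not.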

Element Content

So far, we have seen many combinations of elements nested inside other
elements. The XML Schema Primer goes on to describe three special cases: the
case where an element contains only character data and has attributes,
the case of the mixed type, where child elements and character data together
make up the content of an element, and the case where an element is defined
that has no content.

1) Declaring an element that has an attribute and contains a simple value

Declaring an element that has an attribute and contains a simple value sounds
simple. So what is the problem? The instance document might have an element
that looks like the following W3C example.

A tag in an instance document

<internationalPrice currency="EUR">423.46</internationalPrice>

As a starting point we attempt to create a simple type as follows.

A simple type example

<xsd:element name="internationalPrice" type="xsd:decimal"/>

// can't add attribute to definition of a simple type

Here is the catch! Simple types can't have attributes. Solution? The solution
is to derive a complex type that is based on simple content, using the
<simpleContent> element. In the next example the xsd:decimal type is
specified as the 'base' attribute of an 'extension' element. The extension
element itself contains an attribute declaration typed to 'xsd:string'. The
<complexType> element is used to house the structure. In the following example,
the complexType element is in the 'anonymous' form, being an inlined
sub-section of the 'internationalPrice' element. ( In 'Developing Java
Web Services' by R. Nagappan et al. the 'anonymous' form is described
as 'implicit' or 'nameless'. )

Deriving a Complex Type from a Simple Type

<xsd:element name="internationalPrice">    
     <xsd:complexType>
        <xsd:simpleContent>
           <xsd:extension base="xsd:decimal">
              <xsd:attribute name="currency" type="xsd:string"/>
           </xsd:extension>
       </xsd:simpleContent>
     </xsd:complexType>
</xsd:element>

Summary of Steps
// to create an element that holds a simple value and has an attribute

1) Inside an element nest a complexType tag.
2) Nest a simpleContent tag to describe the content.
3) Use an extension tag to specify the base type.
4) Specify in an attribute tag the attribute name and its type.

// A named complexType tag of this variety would also be possible
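
The named variety mentioned in the note above can be sketched as follows. The
type name 'PriceType' and the second element 'domesticPrice' are hypothetical,
added only to show one named type being reused by two element declarations.

Sketch of a Named Complex Type with Simple Content

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- named type: a decimal value that also carries a 'currency' attribute -->
  <xsd:complexType name="PriceType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="currency" type="xsd:string"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

  <!-- both elements refer to the named type through the 'type' attribute -->
  <xsd:element name="internationalPrice" type="PriceType"/>
  <xsd:element name="domesticPrice"      type="PriceType"/>

</xsd:schema>

The trade-off is the usual one: the anonymous form keeps the definition next to
the one element that uses it, while the named form allows reuse.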

2) How to create elements that support mixed content

Notice in the following W3C example, the text appears between elements
and their child elements. The form 'inlines' text, elements and sub-elements.

Example

<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
Your order of <quantity>1</quantity> <productName>Baby
Monitor</productName> shipped from our warehouse on
<shipDate>1999-05-21</shipDate>. ....
</letterBody>

Following is the schema that makes this XML possible. The key to making
the mixture possible is the use of the 'mixed' attribute. Setting
the 'mixed' attribute to 'true' allows character data to appear between child
elements.

Example

<xsd:element name="letterBody">
<xsd:complexType mixed="true">
<xsd:sequence>

   <xsd:element name="salutation">
    <xsd:complexType mixed="true">
     <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
     </xsd:sequence>
    </xsd:complexType>
   </xsd:element>

   <xsd:element name="quantity"    type="xsd:positiveInteger"/>
   <xsd:element name="productName" type="xsd:string"/>
   <xsd:element name="shipDate"    type="xsd:date" minOccurs="0"/>
   

</xsd:sequence>
</xsd:complexType>
</xsd:element>

The XML Schema Primer notes that the mixed model in XML Schema is fundamentally
different from the DTD mixed model used in XML 1.0. Under the XML Schema mixed model,
the order and number of child elements appearing in an instance must agree with the
order and number of child elements specified in the model. In contrast, under the XML
1.0 mixed model, the order and number of child elements appearing in an instance
cannot be constrained. This means that XML Schema provides full validation of
mixed models while XML 1.0 provides only partial validation.
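
As a sketch of what this stricter validation means in practice, the following
instance fragment is well formed but would be expected to fail validation
against the letterBody declaration above, assuming the sequence is exactly
salutation, quantity, productName and then an optional shipDate. The wording
of the letter is invented for illustration.

Sketch of an Instance with Child Elements Out of Order

<!-- expected to be invalid: productName appears before quantity -->
<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
Your <productName>Baby Monitor</productName> order of
<quantity>1</quantity> has shipped.
</letterBody>

The surrounding character data is free to change, but the child elements still
have to follow the declared sequence.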

The following example from the W3Schools web site shows an element from an
XML instance that adheres to the mixed content schema dictated in the element
declaration that follows.

Example from W3Schools Website

<letter>
Dear Mr.<name>John Smith</name>.
Your order <orderid>1032</orderid>
will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>

Corresponding Schema

<xs:element name="letter">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="orderid" type="xs:positiveInteger"/>
      <xs:element name="shipdate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

Empty Content

Now suppose that we want the internationalPrice element to convey both the
unit of currency and the price as attribute values as in the following example.

Example <internationalPrice currency="EUR" value="423.46"/>

Such an element has no content at all; its content model is empty. To define
a type whose content is empty, we use a type that first disallows anything
but elements in its content and then goes on to prevent any elements from
being added. This way the type's content model is left empty.

An Empty Complex Type

<xsd:element name="internationalPrice">
<xsd:complexType>
<xsd:complexContent>           // complexContent yet no elements defined
   <xsd:restriction base="xsd:anyType">
    <xsd:attribute name="currency" type="xsd:string"/>
    <xsd:attribute name="value"    type="xsd:decimal"/>
   </xsd:restriction>
</xsd:complexContent>
</xsd:complexType>
</xsd:element>

But, the complexContent element with the restriction to 'anyType'
is the default form so these elements may be eliminated to create
the more natural form that follows.

// 'anyType' is the primordial root type of the schema data types.

Shorthand for an Empty Complex Type

<xsd:element name="internationalPrice">
<xsd:complexType>
<xsd:attribute name="currency" type="xsd:string"/>
<xsd:attribute name="value" type="xsd:decimal"/>
</xsd:complexType>
</xsd:element>
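
The shorthand form also works as a named type. The following sketch is
hypothetical: the type name 'PriceAttrsType' is invented, and marking both
attributes with use="required" is an added assumption rather than part of the
W3C example.

Sketch of a Named Empty Complex Type

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- no compositor and no content element, so the content model is empty -->
  <xsd:complexType name="PriceAttrsType">
    <xsd:attribute name="currency" type="xsd:string"  use="required"/>
    <xsd:attribute name="value"    type="xsd:decimal" use="required"/>
  </xsd:complexType>

  <xsd:element name="internationalPrice" type="PriceAttrsType"/>

</xsd:schema>

Instance Fragments // each would be the root of its own instance document

<!-- expected to be valid: attributes only, no content -->
<internationalPrice currency="EUR" value="423.46"/>

<!-- expected to be invalid: character content is not allowed in an empty type -->
<internationalPrice currency="EUR" value="423.46">sale price</internationalPrice>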

Lists & Unions

The XML Schema Concept of a List Type

XML Schema makes use of the concept of a 'list' type. List types are categorized
as simple types because they are made up of sequences of 'atomic' values. Atomic
types are simple types whose values are considered indivisible. For instance, the
name token, or NMTOKEN, value "US" is atomic. There are no intended sub-units of "US"
such as "U" or "S".

A list value is represented as a white-space separated sequence of atomic values.
Below, the form description for 'simpleType' is repeated without the attribute
descriptions to emphasize that 'simpleType' may include a list element.

Simplified simpleType Form

<simpleType > (annotation?,( restriction | list | union )) </simpleType>

Creating New List Types

You can create new list types by derivation from existing atomic types.
(You cannot however create list types from existing list types, nor from
complex types.) Following is the W3C example of a list of integer values.

// implies you cannot build a list of lists

W3C Example of a List of 'myInteger', Custom Integer Types

<xsd:simpleType name="listOfMyIntType">
 <xsd:list itemType="myInteger"/>
</xsd:simpleType>

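The subsequent example shows that a conforming XML instance element
can contain a space separated list of values of the item type.

W3C Example that Conforms to the Above list Type Definition

<listOfMyInt>20003 15037 95977 95945</listOfMyInt>

// in the Primer, myInteger ranges from 10000 to 99999, so these values conform there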
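
The 'length' facet described earlier applies to a list type as a count of items.
The following sketch is hypothetical and is not taken from the W3C text (the
names 'IntList', 'ThreeInts' and 'triple' are invented for illustration). It
first derives a list of xsd:integer values and then restricts that list to
exactly three items.

Sketch of a List Type Restricted by the length Facet

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- a list whose items are built-in integers -->
  <xsd:simpleType name="IntList">
    <xsd:list itemType="xsd:integer"/>
  </xsd:simpleType>

  <!-- a restriction of the list type: exactly three items -->
  <xsd:simpleType name="ThreeInts">
    <xsd:restriction base="IntList">
      <xsd:length value="3"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="triple" type="ThreeInts"/>

</xsd:schema>

Instance Fragments // each would be the root of its own instance document

<!-- expected to be valid: three white-space separated integers -->
<triple>100 250 999</triple>

<!-- expected to be invalid: only two items -->
<triple>100 250</triple>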

Union Types

While a list defines an aggregation of several values of the same type, a union
adds a level of flexibility, allowing a value to be drawn from any one of several
'member' types, which may be atomic types or list types. A union type is always
a derived type and is normally made up of at least two different member types.

The following example of a union type is from W3Schools. Notice the union type
element is composed from the two simple type definitions that follow it,
'sizebyno' and 'sizebystring'.

W3Schools Example

<xsd:element name="jeans_size">
<xsd:simpleType>
<xsd:union memberTypes="sizebyno sizebystring" /> // notice the space separated types
</xsd:simpleType>
</xsd:element>

<xsd:simpleType name="sizebyno">
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxInclusive value="42"/>
</xsd:restriction>
</xsd:simpleType>

<xsd:simpleType name="sizebystring">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="small"/>
<xsd:enumeration value="medium"/>
<xsd:enumeration value="large"/>
</xsd:restriction>
</xsd:simpleType>

In the above examples, legal values for the union type are an integer value from
1 to 42, inclusive, or one of either "small", "medium" or "large".
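
A few instance fragments make the rule concrete. This is a sketch that assumes
the jeans_size declaration above; each element would be the root of its own
instance document.

Sketch of Values Checked Against the Union Type

<!-- expected to be valid: matches sizebyno, a positive integer no greater than 42 -->
<jeans_size>36</jeans_size>

<!-- expected to be valid: matches sizebystring -->
<jeans_size>medium</jeans_size>

<!-- expected to be invalid: 43 exceeds maxInclusive and is not an enumerated string -->
<jeans_size>43</jeans_size>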

All, Choice & Sequence Groups

Recall that a <complexType> element with child elements uses a 'compositor'
element, one of 'sequence', 'all' or 'choice'. Compositors get their name from
the fact that they compose groups of elements.

Choice Groups

XML Schema language offers the ability to choose which element is shown
in an instance document using the 'choice' element. The choice group element
allows only one of its children to appear in an instance. (Note 'choice' allows
selecting between child elements while the earlier 'enumeration' element was
used in simpleTypes to provide a choice of values for the type.) Following is
an example that allows the choice of either an 'Air', 'Rail' or 'Sea' element.

Example

<xs:element name="carrier">
<xs:complexType>
   <xs:choice>
      <xs:element ref="Air"/>
      <xs:element ref="Rail"/>
      <xs:element ref="Sea"/>
   </xs:choice>
</xs:complexType>
</xs:element>
. . . .
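
Each 'ref' attribute points at an element declared globally elsewhere in the
schema. The following sketch fills in those declarations so that the choice
group stands alone. Typing 'Air', 'Rail' and 'Sea' as xs:string is an
assumption made for illustration, as is the instance text.

Sketch of a Complete choice Group

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- global declarations that the ref attributes point to -->
  <xs:element name="Air"  type="xs:string"/>
  <xs:element name="Rail" type="xs:string"/>
  <xs:element name="Sea"  type="xs:string"/>

  <xs:element name="carrier">
    <xs:complexType>
      <xs:choice>
        <xs:element ref="Air"/>
        <xs:element ref="Rail"/>
        <xs:element ref="Sea"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>

Instance Fragments // each would be the root of its own instance document

<!-- expected to be valid: exactly one of the three children -->
<carrier><Rail>overnight freight</Rail></carrier>

<!-- expected to be invalid: two children where only one is allowed -->
<carrier><Air>express</Air><Rail>overnight freight</Rail></carrier>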

The 'all' Element

The <all> element allows elements to occur in any order. The elements in
an all group appear as dictated by minOccurs and maxOccurs. In the default
scenario, where both attributes are set to ' 1 ', all elements must appear though
in any order. If 'minOccurs' is set to zero in an element, this makes this element
optional. The 'maxOccurs' attribute cannot be greater than ' 1 '. In other words,
no element in the content model can appear more than once.

The following W3C example would permit child elements to appear in any order,
with the comment element appearing optionally.

An all group Example

<xsd:complexType name="PurchaseOrderType">
<xsd:all>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items" type="Items"/>
</xsd:all>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
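
A smaller, self-contained sketch shows the effect of an all group on instances.
The 'contact' element and its children are hypothetical, invented for
illustration.

Sketch of an all Group and Instances

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="contact">
    <xsd:complexType>
      <xsd:all>
        <xsd:element name="phone" type="xsd:string"/>
        <xsd:element name="email" type="xsd:string"/>
        <!-- minOccurs="0" makes fax optional -->
        <xsd:element name="fax"   type="xsd:string" minOccurs="0"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

Instance Fragments // each would be the root of its own instance document

<!-- expected to be valid: children in a different order, optional fax omitted -->
<contact>
  <email>rsmith@example.com</email>
  <phone>555-0100</phone>
</contact>

<!-- expected to be invalid: phone appears twice -->
<contact>
  <phone>555-0100</phone>
  <phone>555-0101</phone>
  <email>rsmith@example.com</email>
</contact>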

The attributeGroup Element

It is often convenient to group a set of attributes together and then
reference the group into an element. The following attribute group
shows three attributes, that together form a complicated data structure.
Bundling them into a group that can be used in different sorts of similar
elements may provide a convenient way to keep schema code readable.

W3C Example of an attributeGroup Element Definition

<xsd:attributeGroup name="ItemDelivery">

<xsd:attribute name="partNum" type="SKU" use="required"/>

<xsd:attribute name="weightKg" type="xsd:decimal"/>

<xsd:attribute name="shipBy">
    <xsd:simpleType>
     <xsd:restriction base="xsd:string">
      <xsd:enumeration value="air"/>
      <xsd:enumeration value="land"/>
      <xsd:enumeration value="any"/>
     </xsd:restriction>
    </xsd:simpleType>
</xsd:attribute>

</xsd:attributeGroup>

The line below references the definition above into an element by reusing
the attributeGroup element in conjunction with the 'ref' attribute.

W3C Excerpt Showing an Attribute Group Being Referenced
. . .
<xsd:attributeGroup ref="ItemDelivery"/>
. . .
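
To show where such a reference sits, the following sketch defines a reduced
attribute group and references it from two element declarations. The group name
'Tracking' and the elements 'item' and 'parcel' are hypothetical, used here so
that the sketch does not depend on the SKU type.

Sketch of an Attribute Group Referenced from Two Elements

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:attributeGroup name="Tracking">
    <xsd:attribute name="weightKg" type="xsd:decimal"/>
    <xsd:attribute name="shipBy"   type="xsd:string"/>
  </xsd:attributeGroup>

  <xsd:element name="item">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="productName" type="xsd:string"/>
      </xsd:sequence>
      <!-- attribute declarations and group references follow the content model -->
      <xsd:attributeGroup ref="Tracking"/>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name="parcel">
    <xsd:complexType>
      <xsd:attributeGroup ref="Tracking"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>

An instance such as the following would then be expected to validate.

<item weightKg="1.5" shipBy="air"><productName>Baby Monitor</productName></item>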

Annotations

XML Schema provides three elements for commenting schemas for human or
application information.

The annotation element - The annotation tag is the parent element of the
documentation and the appinfo elements. The 'documentation' and 'appinfo'
elements are nested inside an annotation element.

The documentation element - The documentation element is recommended
for providing human-readable material. It is also recommended that the xml:lang
attribute be used to indicate the language of this information. You may also indicate
the language of all information in a schema by placing an xml:lang attribute on the
schema element.

Example <schema xml:lang="en"> // language for all schema information

The appinfo element - The appinfo element provides information for applications
that may be associated with processing the XML document, such as stylesheets or
graphics programs.

The annotation will often appear at the beginning of a schema construction. Following
is a W3C example that shows annotation elements with enclosed documentation
elements used to comment the internationalPrice element.

Annotations in Element Declaration & Complex Type Definition

<xsd:element name="internationalPrice">

<xsd:annotation>
<xsd:documentation xml:lang="en">
element declared with anonymous type
</xsd:documentation>
</xsd:annotation>

<xsd:complexType>

<xsd:annotation>
   <xsd:documentation xml:lang="en">
       empty anonymous type with 2 attributes
   </xsd:documentation>
</xsd:annotation>

<xsd:complexContent>
   <xsd:restriction base="xsd:anyType">
    <xsd:attribute name="currency" type="xsd:string"/>
    <xsd:attribute name="value"    type="xsd:decimal"/>
   </xsd:restriction>
</xsd:complexContent>
</xsd:complexType>
</xsd:element>
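
The W3C example uses only documentation elements. As a sketch of the other child,
the following declaration carries an appinfo element alongside the documentation.
The 'displayHint' element and its value are hypothetical, since the content of
appinfo is left to whatever application reads it.

Sketch of an Annotation with documentation and appinfo

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="internationalPrice" type="xsd:decimal">
    <xsd:annotation>
      <!-- for human readers -->
      <xsd:documentation xml:lang="en">
        price quoted in a foreign currency
      </xsd:documentation>
      <!-- for applications; displayHint is an invented, application-specific element -->
      <xsd:appinfo>
        <displayHint>right-align</displayHint>
      </xsd:appinfo>
    </xsd:annotation>
  </xsd:element>
</xsd:schema>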

Self Test

1) True or False? The root element of a schema document is the schema
element. True \ False

2) True or False? A simple element can contain text and attributes.
True \ False

3) True or False? The default value for minOccurs and maxOccurs is 1.
True \ False

4) True or False? Attributes, like simple elements, use minOccurs and
maxOccurs to control occurrences. True \ False

5) True or False? Default values only make sense if attributes are optional.
True \ False

6) True or False? The union element can be composed of different sorts of
simple and complex types. True \ False

7) True or False? Declaring an element that has an attribute and contains a
simple value would require something like the following. True \ False

<xsd:element name="message" type="xsd:string"/>

8) What is the attribute that allows an element to support mixed content?
________

Exercise

Create a schema to govern a client record for a company.

The record itself will be a complex element with attributes
that specify a date the record was created, and an attribute
that holds a unique identifier. ( This might be created as
an attribute element that is typed to the built in 'ID' type.)

Nested in the record element will be a complex element that
contains a sequence of elements representing a client's first
name, an initial and a last name. Make the initial an optional
element utilizing the 'minOccurs' attribute.

Create a second complexType element that holds address
information. This element should contain elements that
reference elements named 'street', 'city', 'country' and 'postalCode'.

The 'street' and 'city' elements will be simple elements that use
the built in schema string type. The 'country' element will be a
simple type derived by restriction that allows an enumeration of
abbreviated tokens limited to countries in North America.
The postalCode element will be a complexType element that
offers a choice of two simple elements that represent ZIP or
POSTAL code.

// You can do the ZIP and POSTAL code as simple string types
// or if you wish use a derived simpleType and the pattern facet
// to limit characters to those appropriate for each format.

Create an element using an 'all' element to classify this client
as a cash customer, a private account holder, a corporate
representative or all three. If a corporate representative, the
name of the company should be specified.

// an all group with all elements marked optional including
// company name would allow combinations of elements
// to be specified.

Use schema shells described earlier in the note to complete your
schema document. Create an instance document that adheres to
your schema definition and validate the document using any of the
validation methods suggested.

If this is all mystifying and you are quite new to XML you may
use the following summary to help you organize this assignment.

Summary of Requirements

Client_Record date_attribute ID_attribute // root element

    name element   // a complex type
        sequence
            first name
            initial   // optional
            last name
    // closing tags

    address element
        sequence
            element ref="street"
            element ref="city"
            element ref="country"
            element ref="postalCode"

    element street
    element city
    element country       // simple derived type based on enumeration facet
    element postalCode    // choice of two simple or derived simple types
    // closing tags

    clientType
        all
            cash
            private account
            corporate account
                companyName
    // closing tags