XML at a Glance
Peter Komisar  version 1.0  © 2008

references: 'XML & Web Services Unleashed', R. Schmelzer et al.;
'The XML Bible', Elliotte Harold; 'The Birth of XML', Jon Bosak,
http://java.sun.com/xml/birth_of_xml.html; 'XML in a Nutshell',
E.R. Harold & W.S. Means; 'Mastering XML', Navarro, White & Burman;
'Professional XML Schemas', J. Duckett et al.



GML, SGML & HTML

In 1969 at IBM, Charles Goldfarb, Ed Mosher and Ray Lorie
created GML, the Generalized Markup Language, to simplify handling
legal documents. The moniker is based on their initials. The
documents were in different formats and relied on different
platforms, so they sought to design a unified cross-platform
markup language. The standardized version of GML became
known as SGML.

The key design feature of SGML is that it allowed the creation
of custom tags.

SGML, though, became complex and difficult to use. It was also
overshadowed by Tim Berners-Lee's HTML, a simple language defined
in SGML. HTML became a 'best seller' because it was relatively
easy and worked for 90% of uses.

HTML was still limited, though, as its tag set could not be
customized.


The Advent of XML  
// a refinement of SGML 

Jon Bosak, Tim Bray, C.M. Sperberg-McQueen, Jean Paoli and
James Clark, many of whom were SGML pioneers, sought to filter
SGML's best features and port them to the web. In a sense XML
was a distillation of SGML, a little like Java is a simplification
of C++.

Jon Bosak offers his own recollection of how XML came to
be at http://java.sun.com/xml/birth_of_xml.html

// might have to search at Sun
 


Hello World in XML




We can introduce the general look of XML in a quick Hello
World version.

Example   

<?xml version="1.0" standalone="yes"?>
<Earth>
   Hello World in XML!
</Earth>

Write or copy and paste this text into a simple text editor like
Notepad and save it under a name with an .xml ending, such as
HelloWorld.xml. Once saved, it can be opened in a browser.
The result is not very exciting as there is no formatting
associated with the text.

If we look at the <Earth> element, we can see that XML uses
'tags', pairs of angle brackets that surround identifiers. A
'start' tag and an 'end' tag are distinguishable. The 'end' tag
includes a forward slash ahead of the element identifier.

A start tag, its matching end tag and everything between them
form an element, named by the identifier in its tags. Elements
can also contain attributes. In the following example the
attribute called 'type' holds the value 'planet'.

Example 

<Earth type="planet"></Earth>

Note that in this version of the element we left the content out.
This creates an 'empty' element. XML supplies an abbreviated
form for an 'empty' element as follows.

Example 

<Earth type="planet" />

This is a recommended form as it reduces the risk of creating
an  'orphaned' end tag.
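
To see how a parser views these pieces, here is a minimal sketch
that reads the Hello World document and the two forms of the empty
element. It assumes Python and its standard xml.etree.ElementTree
module, chosen here only as a convenient parser; the variable
names are illustrative.

```python
import xml.etree.ElementTree as ET

# The Hello World document, with its content on a single line.
hello_doc = '<?xml version="1.0" standalone="yes"?>\n<Earth>Hello World in XML!</Earth>'
hello = ET.fromstring(hello_doc)
print(hello.tag)    # the element name
print(hello.text)   # the character content between the tags

# An empty element written with separate start and end tags ...
long_form = ET.fromstring('<Earth type="planet"></Earth>')
# ... and the same element in the abbreviated form.
short_form = ET.fromstring('<Earth type="planet" />')

# Both forms carry the same attribute and have no child elements.
print(long_form.attrib, short_form.attrib)
print(len(long_form), len(short_form))
```

To the parser the two empty forms are the same element; the
choice between them is purely a matter of how the text is written.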

Speaking of form, the XML specification describes what makes
an XML document 'well formed'.




Well Formed and Valid XML



Rules Governing XML Structure

1) XML Elements must have closing tags. That means all tags.

Example

<Break></Break> or <Break />
 

2) XML element names, unlike those of HTML, are case sensitive.

Example   

<GO /> is not the same as <go />
 

3) XML tags must all be properly nested. In other words, tags
must be closed in the reverse of the order in which they are
opened. Below, the tags open One, Two, Three and close Three,
Two, One.

Example        

<One> <!-- opens -->
    <Two> <!-- opens -->
        <Three> <!-- opens -->
        </Three> <!-- closes -->
    </Two> <!-- closes -->
</One> <!-- closes -->

4) XML documents must have a single root element. This
implies all elements of a document are nested inside the
root element. The name of the root element must match the
name declared in the document type declaration, if one is
present.
 
 

5) Attribute values must all be quoted, by convention using
double quotes.

Example    

number="1029383454738"
 

6) Attributes may only appear once in an element.

Example 

<!-- can't have -->  <X x="y" x="z">
 

7) Attribute values cannot contain references to external
entities. Element content can reference external entities, but
attribute values cannot. Attributes can use internally defined
and predefined entity references.

Example     

<ANC nac="CNA&apos;S">

8) Entities must be declared before they are used. Predefined
entities are already defined so they are ready to go.
// entities can't be forward referenced  
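
Rules 7 and 8 can be tried out with a parser. The following is a
minimal sketch, assuming Python's standard xml.etree.ElementTree
module; the element name 'note' and the entity name 'owner' are
made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Predefined entities such as &apos; need no declaration, even in attributes.
anc = ET.fromstring('<ANC nac="CNA&apos;S" />')
print(anc.get("nac"))

# An internal entity declared in the document type declaration can
# then be referenced in element content.
declared = '<!DOCTYPE note [<!ENTITY owner "Sun">]><note>&owner;</note>'
print(ET.fromstring(declared).text)

# The same reference without a declaration is rejected by the parser.
try:
    ET.fromstring("<note>&owner;</note>")
    undeclared_ok = True
except ET.ParseError as err:
    undeclared_ok = False
    print("rejected:", err)
```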
 

Well Formed XML    
 // defining what a well formed XML document is

First of all, XML requires that a document be 'well formed'.
To be well formed, a document must follow the rules stated above
and, in addition, must not contain markup or characters that XML
cannot process.

 

Formula For Well Formed XML

Adherence to Structural Rules + Correct Syntax  =  Well Formed XML
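
A quick way to test whether text is well formed is to hand it to
a parser and see whether it is accepted. The sketch below assumes
Python's standard xml.etree.ElementTree module; is_well_formed is
a hypothetical helper name, and the samples are small variations
on the examples used with the rules above.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "properly nested": "<One><Two><Three></Three></Two></One>",
    "missing end tag": "<One><Break></One>",
    "case mismatch": "<GO></go>",
    "bad nesting": "<One><Two></One></Two>",
    "two roots": "<One></One><Two></Two>",
    "unquoted attribute": "<X x=y></X>",
    "repeated attribute": '<X x="y" x="z"></X>',
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first sample is accepted. A check like this covers well
formedness only; it says nothing about validity against a DTD or
schema.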
 

Valid XML

An XML document is considered valid if it is first well formed
and, in addition, complies with the constraints described by an
associated document type definition (DTD) or XML Schema.

A well formed document can be used without a schema. Doing so
automatically rules out certain advanced XML features that are
available only through some form of document type declaration.

This is just a glimpse of the detailed specification that
governs XML. It is enough, though, to make looking at the
XML configuration files used in the J2EE specification more
meaningful.



The Use of XML Files in J2EE Configuration

The grandest scheme envisioned for XML is the creation
of a distributed service system that is platform independent.
While a lot of effort has been dedicated to this end, a
lesser but still very important task has also fallen to XML:
the creation of configuration files.

Web applications, as well as the larger enterprise applications,
have as part of the J2EE specification configuration files that
are written in XML. Some examples follow.

The first is a brief example of an application.xml file which
comes with the Sun Enterprise download. This is the deployment
descriptor for the EAR file, which is an enterprise application
packaged in J2EE format.

Sample Location Inside Sun Download

C:\Sun\SDK\docs\firstcup\example\firstcup\src\conf


Brief Description of the application.xml File

The following XML shows a root tag called application.
It contains two modules. One holds a reference to a war
file that contains web resources. The context root, from
where the web pages will be referenced, is also noted.
The second holds a jar that houses a sample Enterprise
JavaBean.

The long xsi:schemaLocation attribute says where the XML
schema for the file's namespace may be found. The schema
is the data structure type definition for the file.


An application.xml Sample
// java.sun.com

<application version="5"
    xmlns="http://java.sun.com/xml/ns/javaee"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
    http://java.sun.com/xml/ns/javaee/application_5.xsd">
    <display-name>firstcup</display-name>

    <module>
        <web>
            <web-uri>firstcup-war.war</web-uri>
            <context-root>/firstcup</context-root>
        </web>
    </module>

    <module>
        <ejb>firstcup-ejb.jar</ejb>
    </module>
</application>
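
As a rough illustration of how a tool might read such a
descriptor, the sketch below lists the modules it declares. It
assumes Python's standard xml.etree.ElementTree module;
list_modules and the prefix 'ee' are hypothetical names. The copy
of the descriptor embedded in the code spells out the xmlns and
xmlns:xsi declarations, which a namespace-aware parser requires
before it will accept the xsi: prefix.

```python
import xml.etree.ElementTree as ET

APPLICATION_XML = """<application version="5"
    xmlns="http://java.sun.com/xml/ns/javaee"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
    http://java.sun.com/xml/ns/javaee/application_5.xsd">
  <display-name>firstcup</display-name>
  <module>
    <web>
      <web-uri>firstcup-war.war</web-uri>
      <context-root>/firstcup</context-root>
    </web>
  </module>
  <module>
    <ejb>firstcup-ejb.jar</ejb>
  </module>
</application>"""

# The prefix 'ee' is arbitrary; only the namespace URI has to match.
NS = {"ee": "http://java.sun.com/xml/ns/javaee"}

def list_modules(xml_text):
    """Return (kind, archive) pairs, one per <module> element."""
    root = ET.fromstring(xml_text)
    modules = []
    for module in root.findall("ee:module", NS):
        web_uri = module.find("ee:web/ee:web-uri", NS)
        ejb = module.find("ee:ejb", NS)
        if web_uri is not None:
            modules.append(("web", web_uri.text))
        elif ejb is not None:
            modules.append(("ejb", ejb.text))
    return modules

print(list_modules(APPLICATION_XML))
```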

Looking around the directories in the Sun SDK, we can find
the web.xml file, which is the deployment descriptor for the
web archive (or 'war' file) associated with the above
application.

Brief description of the web.xml File

The root element of the web.xml file is web-app. Configuration
information is shown for a servlet. It includes the name of the
servlet, its fully qualified class name and an indicator that
says to load it on start-up. A mapping tag gives the URL pattern,
relative to the context root, that is routed to the servlet.
There is also a time specified after which a session times out.

A web.xml Sample

<web-app
    xmlns="http://java.sun.com/xml/ns/javaee"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
    http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" version="2.5">

    <servlet>
        <display-name>FacesServlet</display-name>
        <servlet-name>FacesServlet</servlet-name>
        <servlet-class>javax.faces.webapp.FacesServlet</servlet-class>
        <load-on-startup>1</load-on-startup>
    </servlet>

    <servlet-mapping>
        <servlet-name>FacesServlet</servlet-name>
        <url-pattern>/firstcupWeb/*</url-pattern>
    </servlet-mapping>

    <session-config>
        <session-timeout>
            30
        </session-timeout>
    </session-config>
</web-app>
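
The settings described above can be pulled out of the descriptor
with a few lookups. This is a minimal sketch, assuming Python's
standard xml.etree.ElementTree module; the variable names and the
prefix 'ee' are illustrative, and the embedded copy of the file
includes the namespace declarations a namespace-aware parser
needs.

```python
import xml.etree.ElementTree as ET

WEB_XML = """<web-app version="2.5"
    xmlns="http://java.sun.com/xml/ns/javaee"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
    http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd">
  <servlet>
    <servlet-name>FacesServlet</servlet-name>
    <servlet-class>javax.faces.webapp.FacesServlet</servlet-class>
    <load-on-startup>1</load-on-startup>
  </servlet>
  <servlet-mapping>
    <servlet-name>FacesServlet</servlet-name>
    <url-pattern>/firstcupWeb/*</url-pattern>
  </servlet-mapping>
  <session-config>
    <session-timeout>
        30
    </session-timeout>
  </session-config>
</web-app>"""

NS = {"ee": "http://java.sun.com/xml/ns/javaee"}
root = ET.fromstring(WEB_XML)

# Map each servlet name to the URL pattern it is bound to.
mappings = {}
for mapping in root.findall("ee:servlet-mapping", NS):
    name = mapping.findtext("ee:servlet-name", namespaces=NS)
    mappings[name] = mapping.findtext("ee:url-pattern", namespaces=NS)

# Element text keeps its surrounding whitespace; strip() removes it
# explicitly before the value is converted to a number.
timeout_text = root.findtext("ee:session-config/ee:session-timeout", namespaces=NS)
timeout_minutes = int(timeout_text.strip())

print(mappings, timeout_minutes)
```

Note that the whitespace around the 30 in the file is part of the
element's text as far as the parser is concerned, which is why
the sketch trims it.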


A Simple  ejb-jar Example


The following can be found at the Apple site.

Source Location

http://developer.apple.com/internet/java/examples/entitybean5_source.html


This code is a little older and uses a DTD definition for the
data structure rather than the newer XML schema
language.

The root tag is called ejb-jar and holds an enterprise-beans
tag. This tag in turn holds an 'entity' tag which tells
what type of Enterprise JavaBean this is. Besides a name
and a description, the names of the two RMI interfaces
are included, represented by the 'home' and 'remote'
tags. The persistence type is given as 'Container',
meaning container managed persistence or 'CMP'. The
primary key field is also named. This is the handle the
system uses to access this entity bean.

This description will make more sense after we have
looked at Enterprise JavaBeans.


An ejb-jar.xml Example

<?xml version="1.0"?>
<!DOCTYPE ejb-jar PUBLIC
    "-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 1.1//EN"
    "http://java.sun.com/j2ee/dtds/ejb-jar_1_1.dtd">

<ejb-jar>
    <display-name>HelloEntity</display-name>
    <enterprise-beans>
        <entity>
            <description>Extremely Simple Entity bean, models a person</description>
            <ejb-name>HelloEntity</ejb-name>
            <home>HelloEntityHome</home>
            <remote>HelloEntity</remote>
            <ejb-class>HelloEntityEJB</ejb-class>
            <persistence-type>Container</persistence-type>
            <prim-key-class>java.lang.String</prim-key-class>
            <primkey-field>name</primkey-field>
            <reentrant>False</reentrant>
            <cmp-field><field-name>name</field-name></cmp-field>
            <cmp-field><field-name>email</field-name></cmp-field>
        </entity>
    </enterprise-beans>
</ejb-jar>


Assignment

Just to get a hands-on feel for XML, consider that XML
is also the data conduit for Web Services. To this end,
create an XML file that will serve as a sample of a
Memo object. The root tag will be Memo. Internal tags
will be 'To', 'From', 'Date' and 'Message'.

Make sure the tag view is viewable in a browser. This
tells you, you have correctly formed the