An XML document must follow simple syntax rules in order to be considered well-formed. This stuff is so simple that it drives me batty that all these XML books and articles have to explain it all the time.
An XML document consists of a prolog and a root element.
A prolog is the section before the root element. It can contain the XML declaration, processing instructions, a document type declaration, and comments.
The XML declaration (<?xml version="1.0"?>), if present, goes at the start of the first line of an XML document. Attributes such as encoding="UTF-8" standalone="yes" may be included in the XML declaration, but they must be put in that order: version, encoding, and then standalone.
A processing instruction can link a CSS stylesheet: <?xml-stylesheet href="/style.css" type="text/css" title="default stylesheet"?>.
Or an XSL stylesheet: <?xml-stylesheet href="/style.xsl" type="text/xsl" title="default stylesheet"?>.
A document type declaration looks like this: <!DOCTYPE note SYSTEM "InternalNote.dtd">.
Comments look like this: <!-- Comments -->.
Comments cannot be placed inside a tag: <element <!-- This is illegal -->>.
Empty elements can be written as <one /> or <two></two>.
Elements can hold content: <tag>Element content</tag>.
Elements can carry attributes: <tag AnAttribute="attribute value" />.
An attribute name can appear only once in a tag: <elt a="1" a="2"/> is not allowed.
Names must begin with a letter, a colon (:), or an underscore (_), and must not begin with xml, in any combination of upper or lowercase.
Element relationships are frequently discussed using the usual family or tree metaphors. EG: It is easy to use the following XML section to identify parent, child, grandchild, grandparent, sibling, root, leaf, ancestors, descendants, etc.
<a>
  <b>
    <c>
      <d/><e/><f/><f/>
    </c>
  </b>
</a>
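The family metaphors can be made concrete by walking the sample tree. This is only a rough sketch using Python's standard xml.etree.ElementTree module; the helper name `ancestors` and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from above, written on one line.
DOC = "<a><b><c><d/><e/><f/><f/></c></b></a>"
root = ET.fromstring(DOC)

# ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """Return tag names from the node's parent up to the root."""
    names = []
    while node in parent_of:
        node = parent_of[node]
        names.append(node.tag)
    return names


c = root.find("./b/c")
d = c.find("d")

children_of_c = [child.tag for child in c]
siblings_of_d = [s.tag for s in parent_of[d] if s is not d]
ancestors_of_d = ancestors(d)
descendants_of_a = [n.tag for n in root.iter()][1:]   # skip the root itself
leaves = [n.tag for n in root.iter() if len(n) == 0]  # elements without children

print("root:", root.tag)
print("children of c:", children_of_c)
print("siblings of d:", siblings_of_d)
print("ancestors of d:", ancestors_of_d)
print("leaves:", leaves)
```

Here b is the parent of c and the grandparent of d, while d, e, and both f elements are siblings and leaves.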
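A CDATA section holds text that the parser passes through as character data without interpreting any markup inside it. EG:
<![CDATA[ Information here is data for the XML document, but markup in it is not interpreted when it is parsed. <anytag>EG this element is not treated as a tag</anytag> Whatever follows the closing delimiter is again part of the XML code. ]]>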
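A quick way to see what a CDATA section does is to parse one and look at what comes out. The sketch below assumes Python's standard xml.etree.ElementTree module; the `note` and `real` element names are invented for the example.

```python
import xml.etree.ElementTree as ET

# The tag-like text inside the CDATA section is plain character data;
# only <real/> after the closing ]]> is an actual child element.
DOC = "<note><![CDATA[ <anytag>not markup</anytag> ]]><real/></note>"

note = ET.fromstring(DOC)
cdata_text = note.text
child_tags = [child.tag for child in note]

print(repr(cdata_text))
print(child_tags)
```

The text survives intact, angle brackets included, and anytag never shows up as an element.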
It is relatively simple to decide whether to put data in elements or in attributes of elements.
| Elements | Attributes |
|---|---|
| Data is read by people | Data is read by machines |
| Repeating data | Non-repeating data |
| Hierarchical data | Flat, childless data |
| Ordered data | Order irrelevant |
| Values vary | Enumerated values |
| File size irrelevant | Reduce file size |
Note that some tools (like SQL Server and ADO) default to emitting database columns as attributes.
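As an illustration of the table, here is a hypothetical order record that keeps its flat, one-off facts in attributes and its repeating, ordered data in child elements. The element and attribute names are invented, and the reading code is just a sketch using Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: id and status are flat and non-repeating (attributes);
# the items repeat and their order matters (elements).
DOC = """
<order id="17" status="shipped">
  <item>pen</item>
  <item>ink</item>
  <item>pen</item>
</order>
"""

order = ET.fromstring(DOC)

order_id = order.get("id")          # attributes are looked up by name
status = order.get("status")
items = [item.text for item in order.findall("item")]  # document order is kept

print(order_id, status, items)
```

An attribute name can appear only once per tag, so the repeated item could not have been stored as an attribute in the first place.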
There are a few basic rules that make a well-formed XML document.
Elements can be nested but not overlapped. EG:
<e>the quick<f>brown fox</f> jumped.</e> CORRECT!
<e>the quick<f>brown fox</e></f> INCORRECT!
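The nesting rule is easy to try out with any XML parser, since a conforming parser has to reject a document that is not well-formed. A minimal sketch, assuming Python's standard xml.etree.ElementTree module and a made-up helper name `is_well_formed`:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as a complete XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


nested = "<e>the quick<f>brown fox</f> jumped.</e>"
overlapped = "<e>the quick<f>brown fox</e></f>"

print(is_well_formed(nested))      # True
print(is_well_formed(overlapped))  # False
```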
Element and attribute names are case sensitive. EG:
<e>the quick.</e> CORRECT!
<E>the quick.</E> CORRECT!
<E>the quick.</e> INCORRECT!
<e>the quick.</E> INCORRECT!
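Case sensitivity applies to attribute names as well as element names. The sketch below, assuming Python's standard xml.etree.ElementTree module and a made-up helper named `parses`, shows that a and A count as two different attributes while a repeated a is rejected.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


matching_lower = parses("<e>the quick.</e>")
matching_upper = parses("<E>the quick.</E>")
mixed_case = parses("<E>the quick.</e>")  # start and end tag names differ

# a and A are two different attribute names, so both can sit on one tag.
elt = ET.fromstring('<elt a="1" A="2"/>')
attribute_names = sorted(elt.attrib)

# The same name twice, in the same case, is a duplicate attribute.
same_name_twice = parses('<elt a="1" a="2"/>')

print(matching_lower, matching_upper, mixed_case)
print(attribute_names, same_name_twice)
```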
Non-empty elements require end tags. EG:
<e>the quick.</e> CORRECT!
<e>the <e>quick INCORRECT!
Empty elements must have a start tag and an end tag, or just a single tag ending with /> (or with a space before the slash, as in " />", because some parsers/browsers prefer that). EG:
<e></e> CORRECT!
<e /> CORRECT!
<e/> CORRECT!
<e> INCORRECT!
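To a parser, all three correct spellings of an empty element mean the same thing. Here is a small sketch, assuming Python's standard xml.etree.ElementTree module, that parses each form and then writes an empty element back out both ways.

```python
import xml.etree.ElementTree as ET

# All three spellings produce an element with no text and no children.
forms = ["<e></e>", "<e />", "<e/>"]
parsed = [ET.fromstring(form) for form in forms]
all_empty = all(el.tag == "e" and el.text is None and len(el) == 0
                for el in parsed)

# A start tag with no end tag is not a complete document.
try:
    ET.fromstring("<e>")
    unclosed_ok = True
except ET.ParseError:
    unclosed_ok = False

# When writing, ElementTree can emit either the short or the long form.
empty = ET.Element("e")
short_form = ET.tostring(empty, encoding="unicode")
long_form = ET.tostring(empty, encoding="unicode", short_empty_elements=False)

print(all_empty, unclosed_ok)
print(short_form, long_form)
```

ElementTree happens to write the short form with a space before the slash.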
Attribute values must be in quotes, either single or double. EG:
<table rows="2"> CORRECT!
<table rows=2> INCORRECT!
Attributes cannot be minimized: every attribute needs both a name and a quoted value. EG:
<dl compact="compact"> CORRECT!
<dl compact> INCORRECT!
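Both attribute rules can be checked the same way as the element rules. This sketch assumes Python's standard xml.etree.ElementTree module; the tags are self-closed so that each sample is a complete document, and the helper name `parses` is made up.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = [
    '<table rows="2"/>',        # double quotes
    "<table rows='2'/>",        # single quotes
    "<table rows=2/>",          # unquoted value
    '<dl compact="compact"/>',  # full name="value" pair
    "<dl compact/>",            # minimized attribute
]

results = {sample: parses(sample) for sample in samples}

for sample, ok in results.items():
    print(sample, "CORRECT!" if ok else "INCORRECT!")
```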
Only certain characters are legal in an XML 1.0 document. The specification defines them as:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
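The Char production translates almost directly into a regular expression. The following is a sketch rather than a full validator: it only tests whether every character of a string falls in the ranges listed above, and the names `ILLEGAL_XML_CHAR` and `only_legal_chars` are made up for illustration.

```python
import re

# Negated character class built from the ranges in the Char production:
# tab, line feed, carriage return, then the three listed ranges.
ILLEGAL_XML_CHAR = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def only_legal_chars(text):
    """Return True if every character in text matches the Char production."""
    return ILLEGAL_XML_CHAR.search(text) is None


print(only_legal_chars("tab\tand newline\n"))  # True
print(only_legal_chars("bell\x07"))            # False: #x7 is a control character
print(only_legal_chars("\uFFFE"))              # False: just past #xFFFD
```

Most control characters below #x20 fail the test, which is a common cause of parse errors in data exported from databases.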
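XML predefines only five entities (&lt;, &gt;, &amp;, &apos;, and &quot;). Others, such as &nbsp;, must be declared before they are used. EG: at the top of an XSL stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet [<!ENTITY nbsp "&#160;">]>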
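The effect of an entity declaration like the one above can be seen by parsing the same text with and without it. This is a sketch assuming Python's standard xml.etree.ElementTree module, and assuming nbsp is declared as character reference 160 (the non-breaking space); the `p` element is invented for the example.

```python
import xml.etree.ElementTree as ET

# With the entity declared in the internal DTD subset, &nbsp; is expanded.
declared = '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>'
expanded_text = ET.fromstring(declared).text

# Without a declaration, the same reference is an error.
undeclared = "<p>a&nbsp;b</p>"
try:
    ET.fromstring(undeclared)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

print(repr(expanded_text))
print(undeclared_ok)
```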
Page Modified: (Hand noted: 2007-10-04 19:25:55Z) (Auto noted: 2008-03-12 16:11:27Z)