The syntax rules of XML are very simple and logical. The rules are easy to learn, and easy to use.
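In HTML, some elements do not have to have a closing tag. For example, a paragraph can be written as <p>This is a paragraph. with no closing </p>, and a line break as <br>.
In XML, it is illegal to omit the closing tag. All elements must have a closing tag, as in <p>This is a paragraph.</p>, or be written as an empty-element tag such as <br />.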
Note: The XML declaration, for example <?xml version="1.0" encoding="UTF-8"?>, does not have a closing tag. This is not an error. The declaration is not an element, so the closing-tag rule does not apply to it.
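The closing-tag rule can be checked with any XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample documents are hypothetical and only serve as illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Closing </p> omitted, as HTML would tolerate: an XML parser rejects it.
missing_close = "<note><p>This is a paragraph.</note>"
# Every element is closed, including one empty-element tag.
all_closed = "<note><p>This is a paragraph.</p><br /></note>"
# The XML declaration has no closing tag and is still accepted.
with_declaration = '<?xml version="1.0"?><note><p>Text</p></note>'

print(is_well_formed(missing_close))     # False
print(is_well_formed(all_closed))        # True
print(is_well_formed(with_declaration))  # True
```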
XML tags are case sensitive. The tag <Letter> is different from the tag <letter>.
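Opening and closing tags must be written with the same case: <Message>This is incorrect</message> is an error, while <message>This is correct</message> is accepted.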
Note: "Opening and closing tags" are often referred to as "Start and end tags". Use whatever you prefer. It is exactly the same thing.
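Case sensitivity is easy to observe with a parser. This sketch assumes Python's standard xml.etree.ElementTree module; parse_error is a hypothetical helper, and the exact wording of the error message depends on the parser.

```python
import xml.etree.ElementTree as ET

def parse_error(xml_text):
    """Return the parser's error message, or None if the text parses."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None

mismatched = "<Message>This is incorrect</message>"
matched = "<message>This is correct</message>"

print(parse_error(mismatched))  # an error message about the mismatched tag
print(parse_error(matched))     # None

# <Letter> and <letter> are two different element names.
doc = ET.fromstring("<doc><Letter/><letter/></doc>")
tags = [child.tag for child in doc]
print(tags)                     # ['Letter', 'letter']
```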
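In HTML, you might see improperly nested elements, such as <b><i>This text is bold and italic</b></i>.
In XML, all elements must be properly nested within each other, as in <b><i>This text is bold and italic</i></b>.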
In the example above, "Properly nested" simply means that since the <i> element is opened inside the <b> element, it must be closed inside the <b> element.
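The nesting rule can be tried out in the same way. This is a small sketch, assuming Python's standard xml.etree.ElementTree module; the check helper and the surrounding <p> element are hypothetical additions for illustration.

```python
import xml.etree.ElementTree as ET

samples = {
    "improperly nested": "<p><b><i>This text is bold and italic</b></i></p>",
    "properly nested": "<p><b><i>This text is bold and italic</i></b></p>",
}

def check(xml_text):
    """Return 'ok' or the parser's complaint (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return "ok"
    except ET.ParseError as exc:
        return f"error: {exc}"

# Run every sample through the parser and report the outcome.
results = {label: check(text) for label, text in samples.items()}
for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

Only the second sample parses, because <i> was opened inside <b> and is also closed inside <b>.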
XML documents must contain exactly one element that contains all other elements. This element is called the root element.
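To see the root element rule in action, compare a document with one top-level element against one with two. The sketch below assumes Python's standard xml.etree.ElementTree module; the element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

single_root = "<root><child><subchild>text</subchild></child></root>"
two_roots = "<first>one</first><second>two</second>"

# With a single root, the parser returns that root element.
root = ET.fromstring(single_root)
print(root.tag)                       # root
print([child.tag for child in root])  # ['child']

# A second top-level element is rejected by the parser.
try:
    ET.fromstring(two_roots)
    two_roots_ok = True
except ET.ParseError as exc:
    two_roots_ok = False
    print("rejected:", exc)
```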
XML elements can have attributes in name/value pairs just like in HTML.
In XML, the attribute values must always be quoted.
Compare the two start tags <note date=12/11/2007> and <note date="12/11/2007">. The first one is incorrect, the second is correct.
The error in the first tag is that the value of the date attribute in the note element is not quoted.
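The quoting rule can be verified with a parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper attributes_or_none, the <to> child element, and the date value are illustrative assumptions. Either single or double quotes are accepted.

```python
import xml.etree.ElementTree as ET

unquoted = "<note date=12/11/2007><to>Tove</to></note>"
double_quoted = '<note date="12/11/2007"><to>Tove</to></note>'
single_quoted = "<note date='12/11/2007'><to>Tove</to></note>"

def attributes_or_none(xml_text):
    """Return the root element's attributes, or None on a parse error."""
    try:
        return ET.fromstring(xml_text).attrib
    except ET.ParseError:
        return None

print(attributes_or_none(unquoted))       # None
print(attributes_or_none(double_quoted))  # {'date': '12/11/2007'}
print(attributes_or_none(single_quoted))  # {'date': '12/11/2007'}
```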
Some characters have a special meaning in XML.
If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element.
This will generate an XML error: <message>if salary < 1000 then</message>
To avoid this error, replace the "<" character with an entity reference: <message>if salary &lt; 1000 then</message>
There are 5 predefined entity references in XML:
&lt; | < | less than |
&gt; | > | greater than |
&amp; | & | ampersand |
&apos; | ' | apostrophe |
&quot; | " | quotation mark |
Note: Only the characters "<" and "&" are strictly illegal as literal text inside an XML element. The greater than character is legal, but it is a good habit to replace it.
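In practice, text is usually escaped by a library rather than by hand. The sketch below assumes Python's standard library: xml.sax.saxutils.escape replaces "&", "<" and ">" by default, and xml.etree.ElementTree turns entity references back into characters when parsing. The sample text is illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if salary < 1000 then"

# Inserting the raw text directly produces a document that cannot be parsed.
try:
    ET.fromstring(f"<message>{raw}</message>")
    raw_parses = True
except ET.ParseError:
    raw_parses = False
print(raw_parses)        # False

# Escaping first yields an entity reference that the parser accepts.
escaped = escape(raw)
print(escaped)           # if salary &lt; 1000 then
message = ET.fromstring(f"<message>{escaped}</message>")
print(message.text)      # if salary < 1000 then

# All five predefined entity references are resolved by the parser.
all_five = ET.fromstring("<m>&lt;&gt;&amp;&apos;&quot;</m>").text
print(all_five)          # <>&'"
```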
The syntax for writing comments in XML is similar to that of HTML.
<!-- This is a comment -->
HTML collapses multiple consecutive white-space characters into one single white-space:
HTML: | Hello           Tove |
Output: | Hello Tove |
With XML, the white-space in a document is preserved, not collapsed.
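The difference can be illustrated with a short sketch, assuming Python's standard xml.etree.ElementTree module. The <greeting> element name is hypothetical, and the join/split line only imitates the way a browser renders HTML text.

```python
import xml.etree.ElementTree as ET

source = "<greeting>Hello           Tove</greeting>"

# The XML parser hands back the text with every space intact.
text = ET.fromstring(source).text
print(repr(text))       # 'Hello           Tove'

# Imitation of HTML rendering: runs of white-space become one space.
collapsed = " ".join(text.split())
print(repr(collapsed))  # 'Hello Tove'
```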
Windows applications store a new line as carriage return and line feed (CR+LF).
Unix and Mac OS X use LF.
Old Mac systems use CR.
An XML parser normalizes all of these to LF, so XML applications see a new line as LF.
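Line-ending normalization can be observed directly. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the <text> element and its content are illustrative. A carriage return written as the character reference &#13; is not normalized and reaches the application unchanged.

```python
import xml.etree.ElementTree as ET

# One document mixing CR+LF, a lone CR, and a plain LF.
mixed = "<text>windows\r\nold mac\runix\nend</text>"
parsed = ET.fromstring(mixed).text
print(repr(parsed))     # 'windows\nold mac\nunix\nend'

# A character reference keeps the carriage return.
kept = ET.fromstring("<text>a&#13;b</text>").text
print(repr(kept))       # 'a\rb'
```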
XML documents that conform to the syntax rules above are said to be "Well Formed" XML documents.