XML: A Basic Overview

We'll take a basic overview of the language that's increasingly powering the Internet: XML.

XML stands for Extensible Markup Language and was standardized by the World Wide Web Consortium (W3C), who were obviously poor spellers. It is mostly used to transfer data between systems.

Since XML is a markup language, it will look familiar if you've seen HTML. Like HTML, it uses tags (elements) with attributes, but there are several important differences. For an XML document to be correct, it must be well-formed, which means it must follow these rules:


  • Every opening tag must have a closing tag. For elements that have no children, XML provides a shortcut that opens and closes the element in a single tag: "/>". So for example, if you wanted to represent HTML's BR element in an XML fashion, you could write it out in full:

    <BR></BR>

    But you could just as easily represent the same element as:

    <BR />

  • Attribute values must be enclosed in quotes (either single or double quotes are permitted).

  • Tags cannot overlap. Although elements can contain other elements between their opening and closing tags, those child elements must be completely contained. For example, the following would not be well-formed XML:

    <p><strong></p></strong>
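
To see the rules above in action, here is a minimal sketch that hands a few snippets to a parser and reports whether each one is accepted. It assumes Python and its standard xml.etree.ElementTree module; the helper name is_well_formed and the href sample are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Rule 1: every opening tag needs a closing tag (or the "/>" shortcut).
print(is_well_formed("<BR></BR>"))   # True
print(is_well_formed("<BR />"))      # True
print(is_well_formed("<BR>"))        # False

# Rule 2: attribute values must be quoted, with single or double quotes.
print(is_well_formed('<a href="x"></a>'))  # True
print(is_well_formed("<a href='x'></a>"))  # True
print(is_well_formed("<a href=x></a>"))    # False

# Rule 3: tags cannot overlap.
print(is_well_formed("<p><strong></strong></p>"))  # True
print(is_well_formed("<p><strong></p></strong>"))  # False
```

The parser stops at the first problem it finds, so this only answers yes or no; catching ET.ParseError and printing it would show where the document went wrong.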



It's also important to note that element names are case sensitive, so an element's closing tag name must match its opening tag name exactly, including case.
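
As a quick illustration of case sensitivity, the sketch below (again assuming Python's standard xml.etree.ElementTree module, with a hypothetical helper called parses and made-up element names) shows that a closing tag with different capitalization is rejected, and that Note and note count as two separate element names.

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if `text` parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


matching = parses("<Note></Note>")    # same case on both tags
mismatched = parses("<Note></note>")  # closing tag differs in case

# <Note> and <note> are two different element names, not one.
root = ET.fromstring("<doc><Note /><note /></doc>")
child_names = [child.tag for child in root]

print(matching, mismatched, child_names)  # True False ['Note', 'note']
```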

So we know what elements are valid for HTML; what elements are valid for XML? XML is a general-purpose language: its elements can be whatever you want them to be. There are several ways to specify which elements are valid for an XML document, including Document Type Definitions (DTDs) and XML Schema; we'll cover those in future posts.
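
To make "whatever you want them to be" concrete, here is a small sketch with an invented vocabulary. The library, book, and title elements, the isbn attribute, and all of the values are made up for this example, and it assumes Python's standard xml.etree.ElementTree module. That parser only checks that the document is well-formed; it does not check it against a DTD or schema.

```python
import xml.etree.ElementTree as ET

# A made-up document: none of these element names are defined by XML itself.
document = """
<library>
  <book isbn="0000000000">
    <title>An Example Title</title>
  </book>
  <book isbn='1111111111'>
    <title>Another Example</title>
  </book>
</library>
""".strip()

root = ET.fromstring(document)

# Map each book's isbn attribute to the text of its title child.
titles = {book.get("isbn"): book.findtext("title") for book in root.findall("book")}

print(root.tag)  # library
print(titles)
```

Because nothing here declares which elements are allowed, a misspelled element name would still parse; catching that kind of mistake is the job of a DTD or schema.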
