XML: The What, Why, Where and Hows Explained by a Newbie

If you’re looking for an in-depth, super technical, ultra descriptive piece about what XML is and what it does, then (if I were you) I’d probably keep searching. However, if (like me) you’ve just found out that XML is not just a representation of someone who doesn’t know their alphabet, then let me enlighten you on the very basics.

Firstly, XML stands for Extensible Markup Language. The key word there is ‘Extensible’. This means it is a markup language that can be expanded or added to by its users and still be usable by the application that is displaying it. Okay, so I’m getting ahead of myself. I should explain that XML doesn’t actually do anything. It is just information surrounded by tags. Software must be used to store, display, send or receive it.

So you’re probably asking, “If it doesn’t do anything, then what’s the point of it?” Well, let me explain. One of XML’s main features is that it is self-descriptive: the tags name the data they contain, so it can be read by both humans and machines. Many systems today hold data in incompatible formats, so large amounts of data need converting, which is time consuming and often loses some data along the way. Because XML stores data in a plain text format, new, old and upgraded systems can all read the same information, which makes exchanging data quicker and far less likely to lose anything.

The Components of XML

So what does this magical XML look like? Below is an example of some simple XML code:

<email>
   <date> 26/07/2012 </date>
   <time> 17:07 </time>
   <from> Janet </from>
   <to> James </to>
   <subject> Just saying hi </subject>
   <body> Hi James, had a lovely chat with you today. You must come over soon. </body>
</email>

From the example above, it is easy to work out that this is an email to James from Janet, sent on 26th July 2012 at 5:07pm. The subject and body of the email are also included. As you can see, the code is made up of different parts.

[Image: arrows pointing to the XML tags and the whole XML element]
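Because XML doesn’t do anything by itself, some software has to read it. As a rough sketch, assuming Python and its standard-library xml.etree.ElementTree module (my choice for illustration, not something XML requires), the email above could be read like this. The names EMAIL_XML and read_email are made up for this example.

```python
import xml.etree.ElementTree as ET

# The email example from above, held in a string for illustration.
EMAIL_XML = """<email>
   <date> 26/07/2012 </date>
   <time> 17:07 </time>
   <from> Janet </from>
   <to> James </to>
   <subject> Just saying hi </subject>
   <body> Hi James, had a lovely chat with you today. You must come over soon. </body>
</email>"""


def read_email(xml_text):
    """Return a dict mapping each child tag name to its trimmed text."""
    root = ET.fromstring(xml_text)  # root is the <email> element
    # The spaces inside each element are part of its text, so trim them.
    return {child.tag: (child.text or "").strip() for child in root}


email = read_email(EMAIL_XML)
print(email["from"], "->", email["to"])
print(email["subject"])
```

The tag names become the dictionary keys, which is the self-descriptive part in action: the program never has to guess which value is the sender and which is the subject.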

As in HTML, attributes can also be used in XML (see below).

<person gender="female">
   <name> Janet Smith </name>
   <age> 43 </age>
</person>

Hopefully, you could work out that the text above describes Janet Smith, giving information about her gender and age. Here, however, the gender is written as an attribute. Attributes are designed to hold data related to a specific element. Attribute values must always be quoted, with either double or single quotes.

<person gender="female">
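To see the difference between an attribute and a child element from a program’s point of view, here is a small sketch, again assuming Python’s standard xml.etree.ElementTree module. The variable names are only illustrative. Note that the attribute value uses straight quotation marks, which is what XML parsers expect.

```python
import xml.etree.ElementTree as ET

PERSON_XML = """<person gender="female">
   <name> Janet Smith </name>
   <age> 43 </age>
</person>"""

person = ET.fromstring(PERSON_XML)

# Attributes live on the element itself and are read with .get().
gender = person.get("gender")

# Child elements are looked up by tag name; their text keeps the
# surrounding spaces, so the name is trimmed here.
name = person.findtext("name").strip()
age = int(person.findtext("age"))  # int() ignores surrounding whitespace

# Single quotes around an attribute value are accepted as well.
single_quoted = ET.fromstring("<person gender='female'/>").get("gender")

print(name, gender, age)
```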

Unlike HTML, which relies on predefined tags, XML’s tags are created entirely by the user. The two markup languages often work in partnership, though. Put simply: XML stores and transports the data, while HTML formats and presents it.
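One way to picture that partnership is a small program that takes the data out of the XML and drops it into HTML for display. The sketch below assumes Python’s standard xml.etree.ElementTree and html modules; the function person_to_html and the HTML layout it produces are invented for this example, not a standard way of doing it.

```python
import xml.etree.ElementTree as ET
from html import escape

PERSON_XML = """<person gender="female">
   <name> Janet Smith </name>
   <age> 43 </age>
</person>"""


def person_to_html(xml_text):
    """Turn a <person> element into a one-line HTML paragraph."""
    person = ET.fromstring(xml_text)
    # XML side: pull out the data. escape() keeps characters such as
    # < and & from being mistaken for HTML markup.
    name = escape(person.findtext("name", default="").strip())
    age = escape(person.findtext("age", default="").strip())
    gender = escape(person.get("gender", ""))
    # HTML side: decide how the data should look.
    return f"<p><strong>{name}</strong> ({gender}, {age})</p>"


html_snippet = person_to_html(PERSON_XML)
print(html_snippet)
```

The XML never says anything about bold text or paragraphs. That decision sits entirely in the HTML, so the same data could be presented in a completely different way without touching the XML.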

The Rules of XML

Like all things, there are rules to XML. These are known as the syntax rules. Let’s go through three of the most common:

1. All XML documents must contain a single root element, which is the parent of all the other elements.

<root>
   <child>
      <subchild>...</subchild>
   </child>
</root>

2. All XML elements must have a closing tag.

<yes> Am I doing it right </yes>
<no> Am I doing it right

3. XML tags are case sensitive. So the tag <email> is different to <Email>, and both are different to <EMAIL>. Opening and closing tags must be written in the same case.

<yes> Am I doing it right </yes>
<Yes> Am I doing it right </Yes>
<no> Am I doing it right </No>
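A parser will enforce all three rules for you. As a quick sketch, assuming Python’s standard xml.etree.ElementTree module, the hypothetical helper is_well_formed below tries to parse some of the snippets above, plus a two-root case added for illustration, and reports which ones are accepted.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    # Rule 1: one root element wrapping everything else.
    "one root element": "<root><child><subchild>...</subchild></child></root>",
    "two root elements": "<yes>Am I doing it right</yes><no>Am I doing it right</no>",
    # Rule 2: every element needs a closing tag.
    "missing closing tag": "<no> Am I doing it right",
    # Rule 3: opening and closing tags must match exactly, case included.
    "matching case": "<Yes> Am I doing it right </Yes>",
    "mismatched case": "<no> Am I doing it right </No>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```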

Of course there are many more, but like the above they are all pretty simple.

XML Schemas

However, as mentioned, XML’s tags and elements are made up by the user. Because of this, XML can lack structure, and it may be hard to find software that translates the code into the format you want. Content models such as DocBook help with this. DocBook is a collection of standards and tools for technical publishing. Originally created by software companies as a standard for computer documentation, it can now be used for other kinds of content and has been adapted for many purposes. DocBook provides a set of tags that allow the user to easily publish documents in other formats such as PDF and HTML. Other content models include DITA (Darwin Information Typing Architecture), S1000D, XBRL and more.
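To give a feel for what agreeing on a content model buys you, here is a toy sketch in Python. It is not DocBook and not a real schema language. EMAIL_MODEL and check_against_model are invented for this example, and real projects would normally rely on proper schema tools. The idea is only that once everyone agrees which tags an email may contain, software can check a document against that list.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: the child tags an <email> is expected to have.
EMAIL_MODEL = {"date", "time", "from", "to", "subject", "body"}


def check_against_model(xml_text, allowed):
    """Return (missing, unexpected) child tag names, each sorted."""
    root = ET.fromstring(xml_text)
    found = {child.tag for child in root}
    missing = sorted(allowed - found)      # expected but absent
    unexpected = sorted(found - allowed)   # present but not in the model
    return missing, unexpected


sample = "<email><from>Janet</from><to>James</to><mood>happy</mood></email>"
missing, unexpected = check_against_model(sample, EMAIL_MODEL)
print("missing:", missing)
print("unexpected:", unexpected)
```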

To Conclude

All in all, XML may seem a bit complex to understand, but it is used in many different areas for its simplicity. It chooses brains over beauty, focusing on the logical information rather than how it is formatted. As a result, it is used in many professions and industries, including finance, publishing, medicine, science and many more.
