If you’re reading up on MathML, chances are you’ve already heard of the markup language it was created for: XML. For those unaware, XML is a markup language and file format for storing, transmitting, and reconstructing data. It was designed specifically for simplicity, generality and usability across the internet. Because XML standardises file formats, exchanging information between two disparate systems becomes a lot easier. For those needing to represent and store mathematical notation and formulas within XML, MathML (which stands for Mathematical Markup Language) takes centre stage.
MathML: The need to know
A brief history
MathML 1 was released as a W3C Recommendation in April 1998. It was the first XML language to be recommended by the W3C and was originally designed before the finalisation of XML namespaces. Its aim is to integrate mathematical formulae natively into World Wide Web pages and other documents. The latest specification, MathML Core, was published in August 2021 and is described as the “core subset of Mathematical Markup Language, or MathML, that is suitable for browser implementation.”
MathML Core differs from the latest full MathML specification (Version 3, released 20th October 2010) in that it focuses on a fundamental subset of MathML, defines detailed rendering rules and integration with CSS, and includes resources for automated testing of browser support.
Presentation MathML vs Content MathML
A significant benefit of MathML is that it captures not only the presentation of mathematical formulas but also their meaning. The two are handled separately by Presentation MathML and Content MathML.
Presentation MathML
Presentation MathML is, as expected, concerned with the display of the equation, i.e. how the mathematical expression will look in the web browser or document. To express Presentation MathML, you combine higher-level layout elements with token elements. The elements’ names all begin with m, and there are around 30 elements in total. Token elements include:
<mn> = A token element classifying a number (e.g., 1, 89, 345).
<mo> = A token element classifying an operator (e.g., +, -, ×).
While higher-level layout elements include:
<mfrac> = A layout element classifying fractions.
<mrow> = A layout element classifying a horizontal row of items.
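To make the split between token and layout elements concrete, here is a minimal sketch that assembles the markup for "one half plus three" using Python's standard xml.etree.ElementTree module. The helper names (token, build_presentation) and the choice of expression are assumptions made for illustration; only the element names come from MathML itself.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def token(parent, tag, text):
    """Append a token element (e.g. <mn> or <mo>) holding the given text."""
    child = ET.SubElement(parent, tag)
    child.text = text
    return child


def build_presentation():
    """Build Presentation MathML for 1/2 + 3 (illustrative expression)."""
    # Setting xmlns as a plain attribute keeps the tag names unprefixed.
    math = ET.Element("math", {"xmlns": MATHML_NS})
    row = ET.SubElement(math, "mrow")    # layout: horizontal row of items
    frac = ET.SubElement(row, "mfrac")   # layout: numerator, then denominator
    token(frac, "mn", "1")               # token: number
    token(frac, "mn", "2")
    token(row, "mo", "+")                # token: operator
    token(row, "mn", "3")
    return math


presentation_markup = ET.tostring(build_presentation(), encoding="unicode")
print(presentation_markup)
```

The layout elements (mrow, mfrac) only ever contain other elements, while the token elements (mn, mo) carry the visible characters. That nesting is what tells a renderer how to arrange the pieces on screen.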
Content MathML
While Presentation MathML concentrates on how the formula looks, Content MathML focuses on the meaning behind the formula. Because the meaning of the equation is preserved separately from its presentation, how the content is communicated can be left up to the user. When a formula is displayed in the web browser, the content can be viewed as text or, for visually impaired users, read aloud by a screen reader.
The <apply> element is crucial here, as it represents function application. Although Content MathML uses only a few attributes, there are a hundred or more different elements for different functions and operators. Identifiers and numbers within Content MathML are marked up using token elements, such as:
<ci> = A token element classifying an identifier (e.g., x, b, a).
<cn> = A token element classifying a number (e.g., 1, 89, 345).
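As a rough illustration of what "preserving the meaning" buys you, the sketch below builds Content MathML for x + 1 and then evaluates it for a given value of x. It uses Python's standard xml.etree.ElementTree module. The function names (build_content, evaluate) are hypothetical, and the evaluator handles only a tiny assumed subset of Content MathML: cn, ci, and apply with plus or times.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def build_content():
    """Build Content MathML for x + 1 (illustrative expression)."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    apply_el = ET.SubElement(math, "apply")    # function application
    ET.SubElement(apply_el, "plus")            # first child is the operator
    ET.SubElement(apply_el, "ci").text = "x"   # token: identifier
    ET.SubElement(apply_el, "cn").text = "1"   # token: number
    return math


def evaluate(node, bindings):
    """Evaluate a tiny subset of Content MathML (assumed, not exhaustive).

    Tags are matched without a namespace prefix, which holds for trees
    built by build_content() but not for parsed, namespaced documents.
    """
    if node.tag == "cn":
        return float(node.text)
    if node.tag == "ci":
        return bindings[node.text]
    if node.tag == "apply":
        operator, *operands = list(node)
        values = [evaluate(operand, bindings) for operand in operands]
        if operator.tag == "plus":
            return sum(values)
        if operator.tag == "times":
            product = 1.0
            for value in values:
                product *= value
            return product
    raise ValueError(f"unsupported element: {node.tag}")


content_tree = build_content()
content_markup = ET.tostring(content_tree, encoding="unicode")
print(content_markup)
print(evaluate(content_tree[0], {"x": 2}))  # prints 3.0
```

Because the operator and its operands are explicit in the tree, software can do more than draw the formula: it can compute with it, search it, or read it aloud. Presentation markup on its own does not reliably support that.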
What does MathML look like?
Now that we have a sense of how it functions, what does it look like in practice? Below is a very simple example of what Presentation MathML looks like in both code and web format:
Code: