Boxes, Boxes, and More Boxes
Layout Boxes
MathML presentation mark-up is based around the idea of a layout box.
You can think of a layout box as a sort of abstract bounding box for a
particular kind of mathematical notation. Layout boxes naturally fall into
categories based on their contents. Simple layout boxes just contain individual
characters, and their dimensions depend only on the font being used. More
complicated layout boxes arrange their "child boxes" according to some
algorithm. For example, a fraction box arranges two child boxes to be vertically
stacked with a line between, and centered horizontally.
For these cases, the actual dimensions of a layout box depend recursively on
the sizes of the child boxes.
If you think about trying to typeset a mathematical expression by hand, it is
clear why layout boxes are a good idea. The first time you typeset a fraction,
you have to work out the algorithm for computing the horizontal and vertical
positions for the numerator and denominator expressions. Once that is done, you
can teach it to your assistant, and he or she can do all the calculation without
knowing anything but the dimensions of the subexpressions. Or more likely, these
days you create a digital assistant, like WebEQ, to do it.
MathML presentation elements represent abstract typesetting layout boxes.
Roughly speaking, presentation elements correspond to the media-independent
aspects of a typesetting layout box. This abstraction is what we were calling layout
schemata in The Big Picture.
Each element corresponds to a layout schema that describes how its child
schemata are logically related to each other. A renderer like WebEQ then turns
these logical relations into specific algorithms for physically laying out
equations on the screen. The attributes of an element essentially specify
parameters to the layout algorithm.
As an example, again consider the mfrac element. The mfrac
element represents a fraction layout schema, which expects two child schemata
for the numerator and denominator. There is only one mfrac attribute,
"linethickness", which specifies the thickness of the fraction line.
The actual fraction algorithm used by a renderer like WebEQ may be substantially more
complicated, depending on how hard it tries to optimize the appearance of a
fraction in unusual situations. But from the point of view of a MathML author,
all of this complexity is hidden by a layer of logical abstraction. Provided the
author has taken care to get the logical structure correct, he or she should be
able to leave the rest to the renderer.
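As a rough illustration of an attribute acting as a parameter to the layout algorithm, the sketch below sets linethickness on an mfrac. The value 'thick' is an assumption about what the renderer accepts; the exact set of allowed values depends on the MathML version and on the renderer.

```xml
<!-- The fraction a/b drawn with a heavier fraction line.
     The value 'thick' is assumed to be supported by the renderer. -->
<mfrac linethickness='thick'>
  <mi>a</mi>
  <mi>b</mi>
</mfrac>
```

The author only states that the line should be thick. How thick that is in pixels, and how the numerator and denominator are shifted to make room, is left to the renderer.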
Tokens and Basic Layout Schemata
The most common MathML presentation elements are the token elements mi,
mn and mo. Recall that token elements are the only elements which
directly contain character data, so each individual identifier, operator, and
number that appears in an expression must be wrapped in a token element.
- <mi> ... </mi>
-
- mi elements indicate that their contents should be displayed as
identifiers. This means that single character identifiers like 'x' and 'h'
should appear in italics, while multi-character identifiers like 'sin' and
'log' should be in an upright font.
Attributes include font properties like fontweight, fontfamily
and fontstyle as well as general properties like color.
-
- <mn> ... </mn>
-
- mn elements indicate that their contents should be rendered as
numbers, which generally means in an upright font.
Attributes are like those for mi.
-
- <mo> ... </mo>
-
- mo elements are the most complex token schema. They indicate that
their contents should be displayed as operators, but how operators are
displayed is often quite complicated. For example, the spacing around
operators varies depending on the operator. Other operators like sums and
products have special conventions for displaying limits as scripts. Still
other operators like vertical rules stretch to match the size of the
expression which they enclose.
In MathML, rendering software is expected to contain an "operator
dictionary" which contains information about how different operators
are conventionally rendered. However, everything about how an operator
should be displayed can be controlled directly by using attributes.
Attributes include properties like lspace, rspace, stretchy,
and movablelimits.
The mo element is also used to mark-up other symbols which are
only operators in a very general sense, but whose layout properties are like
those of an operator. Thus, mo elements are used to mark-up delimiter
characters like parentheses (which stretch), punctuation (which has uneven
spacing around it) and accents (which also stretch). One can use attributes
to indicate that the contents of an mo should be treated as one of
these related types.
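The fragment below is a minimal sketch of the three token elements used together with two of the attributes named above. The particular attribute values (a bold identifier, parentheses with stretching switched off) are illustrative assumptions, and the tokens are wrapped in an mrow only so that the fragment has a single root element.

```xml
<!-- Encodes v = 2(x + sin t) -->
<mrow>
  <!-- single-letter identifier: italic by default, made bold here -->
  <mi fontweight='bold'>v</mi>
  <!-- operator: its spacing normally comes from the operator dictionary -->
  <mo>=</mo>
  <!-- number: upright by default -->
  <mn>2</mn>
  <!-- parentheses are marked up with mo; stretching is switched off -->
  <mo stretchy='false'>(</mo>
  <mi>x</mi>
  <mo>+</mo>
  <!-- multi-character identifier: upright by default -->
  <mi>sin</mi>
  <mi>t</mi>
  <mo stretchy='false'>)</mo>
</mrow>
```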
Now that we are acquainted with a few token elements for marking up individual
characters and symbols, we need some layout schemata for arranging tokens into
expressions. The most common and important general purpose layout schema is the mrow
element. The following list describes mrow and some other common elements
in more detail:
- <mrow> child1 ... </mrow>
-
- The mrow element can contain any number of child elements, which it
displays aligned along the baseline in a horizontal row. However, in
addition to positioning schemata in a row, the mrow is very handy for
grouping together terms into a single unit. One might do this in order to
make a collection of expressions into a single subscript, or one might nest
some terms in an mrow to limit how much a stretchy operator grows,
and so on.
-
- <mfrac> numerator denominator </mfrac>
-
- The mfrac element expects exactly two children, the first of which
will be positioned as the numerator of a fraction, and the second will be
the denominator. By setting the linethickness attribute to 0, the mfrac
element can also be used for binomial coefficients.
-
- <msqrt> child1 ... </msqrt>
-
- The msqrt element accepts any number of children, and displays them
under a radical sign.
-
- <mroot> base index</mroot>
-
- The mroot element is nearly identical to the msqrt element,
except it expects a second child, which is displayed above the radical in
the location of the n in an nth root.
-
- <mfenced> child ... </mfenced>
-
- The mfenced element is like an mrow, except that it displays its
children enclosed in parentheses. Using attributes, one can set the beginning and
ending delimiter character, as well as internal separator characters like
commas.
-
- <mstyle> child ... </mstyle>
-
- The mstyle element is also like an mrow except that it
handles attributes differently. The mrow element has almost no
attributes of its own, while the mstyle element can be used to set
any MathML attribute. Exactly how this works is described in the next
section on inheritance.
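As a small sketch of how these schemata combine, the fragment below encodes the binomial coefficient "n choose k" using the linethickness trick described for mfrac. Wrapping the fraction in mfenced to supply the parentheses is one possible choice, not the only one.

```xml
<!-- Binomial coefficient "n choose k": mfenced supplies the parentheses,
     and linethickness='0' suppresses the fraction line. -->
<mfenced>
  <mfrac linethickness='0'>
    <mi>n</mi>
    <mi>k</mi>
  </mfrac>
</mfenced>
```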
Inheritance
Attributes make MathML very flexible, but to use them effectively, you need
to understand how attributes are inherited. Attribute values are basically set
in three ways: they can be explicitly set in a tag, they can be looked up in the
operator dictionary, or they can be inherited from the environment.
Behind the scenes, each element has an environment that specifies default
values for all MathML attributes. Ideally, the environment is initialized by a
browser with sensible values for attributes like color, background,
displaystyle and the font related attributes. Each child element
"inherits" its parent's environment. If an attribute value is not
looked up or otherwise computed, or set directly on the tag, the attribute value
is inherited from the environment.
An important point for understanding inheritance is that ordinarily values
directly set in a tag do not change the default value in the environment. They
only affect the element on which they are set. To change the environment for an
element, and hence for all children of that element, one must use the mstyle
element.
Any presentation attribute can be set using the mstyle element. Values
which are set in this way are inherited by all of the mstyle's children
elements. In other words, attributes set with mstyle are in effect for
all elements within the scope of the mstyle.
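The difference between setting an attribute on a tag and setting it with mstyle can be sketched as follows. The example assumes a renderer that honors the color attribute used in this tutorial, and the particular colors are arbitrary.

```xml
<mrow>
  <!-- color set directly on a token affects only that token -->
  <mi color='#ff0000'>a</mi>
  <mo>+</mo>
  <!-- color set with mstyle is inherited by everything inside it -->
  <mstyle color='#0000ff'>
    <mi>b</mi>
    <mo>+</mo>
    <!-- a value set directly on a tag overrides the inherited one -->
    <mi color='#ff0000'>c</mi>
  </mstyle>
</mrow>
```

Under these assumptions, a and c would appear in red, b and the second plus sign in blue, and the first plus sign in the default color.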
Examples
Now that we have met some of the key players, it is time to see what we can do.
Here are some examples and comments which illustrate the use of the basic layout
and token elements. Consider the expression x² + 4x + 4 = 0.
A basic MathML presentation encoding for this would be:
<mrow>
  <msup>
    <mi>x</mi>
    <mn>2</mn>
  </msup>
  <mo>+</mo>
  <mn>4</mn>
  <mi>x</mi>
  <mo>+</mo>
  <mn>4</mn>
  <mo>=</mo>
  <mn>0</mn>
</mrow>
This encoding will display as you would expect. However, if we were
interested in reusing this expression in unknown situations, we would likely
want to spend a little more effort analyzing and encoding the logical expression
structure.
For starters, our example is more than just one long horizontal row of
symbols. The row naturally breaks up into groups corresponding to the
mathematical terms in the expression, like x² and 4x.
Grouping symbols into terms typically won't affect much about the display,
except perhaps linebreaking, but it makes a bigger difference to a computer
algebra system trying to heuristically figure out what the notation means. Thus
a more thorough encoding might look like this:
<mrow>
  <mrow>
    <msup>
      <mi>x</mi>
      <mn>2</mn>
    </msup>
    <mo>+</mo>
    <mrow>
      <mn>4</mn>
      <mi>x</mi>
    </mrow>
    <mo>+</mo>
    <mn>4</mn>
  </mrow>
  <mo>=</mo>
  <mn>0</mn>
</mrow>
This example shows the use of the mfenced element to encode the
expression f(x + y):
<mrow>
  <mi>f</mi>
  <mfenced>
    <mrow>
      <mi>x</mi>
      <mo>+</mo>
      <mi>y</mi>
    </mrow>
  </mfenced>
</mrow>
By adding an mstyle element, we can set the color of the function
argument, so that the x + y in f(x + y)
will appear in red:
<mrow>
  <mi>f</mi>
  <mfenced>
    <mstyle color='#ff0000'>
      <mrow>
        <mi>x</mi>
        <mo>+</mo>
        <mi>y</mi>
      </mrow>
    </mstyle>
  </mfenced>
</mrow>
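The delimiter and separator attributes of mfenced can be sketched in the same way. The example below assumes the attribute names open, close, and separators, and encodes the half-open interval [a, b).

```xml
<!-- Half-open interval [a, b): the delimiters are set with attributes,
     and the comma between the children is drawn by mfenced itself. -->
<mfenced open='[' close=')' separators=','>
  <mi>a</mi>
  <mi>b</mi>
</mfenced>
```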
Here is a sample encoding showing the use of the mroot and mfrac
elements to encode the cube root of 1 - x/2:
<mroot>
  <mrow>
    <mn>1</mn>
    <mo>-</mo>
    <mfrac>
      <mi>x</mi>
      <mn>2</mn>
    </mfrac>
  </mrow>
  <mn>3</mn>
</mroot>
Finally, let's look at a more substantial example, the quadratic formula,
x = (-b ± √(b² - 4ac)) / (2a). A very careful encoding might look like this:
<mrow>
  <mi>x</mi>
  <mo>=</mo>
  <mfrac>
    <mrow>
      <mrow>
        <mo>-</mo>
        <mi>b</mi>
      </mrow>
      <mo>&PlusMinus;</mo>
      <msqrt>
        <mrow>
          <msup>
            <mi>b</mi>
            <mn>2</mn>
          </msup>
          <mo>-</mo>
          <mrow>
            <mn>4</mn>
            <mo>&InvisibleTimes;</mo>
            <mi>a</mi>
            <mo>&InvisibleTimes;</mo>
            <mi>c</mi>
          </mrow>
        </mrow>
      </msqrt>
    </mrow>
    <mrow>
      <mn>2</mn>
      <mo>&InvisibleTimes;</mo>
      <mi>a</mi>
    </mrow>
  </mfrac>
</mrow>
Notice that the plus/minus sign, ±, is given by a special named entity, &PlusMinus;.
Also, notice that another named entity, &InvisibleTimes;, has been
inserted between the factors of each product. This entity doesn't display in
print, but here we have added it to
facilitate voice synthesis and heuristic evaluation by computer algebra systems.
Whether or not you want to go to the trouble of adding extra grouping and
invisible characters will depend on the purpose of your document, and what
audience you want to reach.
Script Schemata
Superscripts and subscripts are ubiquitous in mathematical notation, and
MathML contains seven layout elements for different kinds and combinations of
scripts. Here are brief descriptions:
- <msub> base script </msub>
<msup> base script </msup>
-
- The msub and msup elements expect two children, which are
displayed as a base, and a sub- or superscript.
-
- <msubsup> base subscript superscript </msubsup>
-
- This element puts both a subscript and a superscript on the same base.
This is usually preferable to first attaching one and then the other with
the msub and msup elements individually, since then the
scripts are not vertically aligned.
- <munder> base script </munder>
<mover> base script </mover>
-
- The munder and mover elements expect two children, which are
displayed as a base, and an under- or overscript. A common use of these
schemata is to attach accents like bars and tildes to a base. However,
since accents are typeset closer to the base than other expressions, it is
necessary to set the accent or accentunder attributes to
"true" in this case.
-
- <munderover> base underscript overscript
</munderover>
-
- This element attaches both an under- and an overscript to a base. This is
particularly useful for positioning limits around a summation sign or
similar large operator. The operator dictionary typically sets the movablelimits
attribute to "true" on mo elements which contain these
large operators. Renderers like WebEQ use this attribute to determine
whether munderover should display the limits as under- and
overscripts or normal sub- and superscripts. By default, limits are
displayed above and below when an expression is displayed by itself, and in
the sub/super script positions when the expression is in a line of text.
-
- <mmultiscripts> base sub1 sup1 ... [<mprescripts/>
psub1 psup1 ...] </mmultiscripts>
-
- This element is used to place tensor indices around a base expression. If
you don't already know what tensor indices are, the basic idea is that the mmultiscripts
element can be used to put multiple columns of scripts on a base. It can
even attach columns of "prescripts" to a base.
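A minimal sketch of mmultiscripts follows. It relies on the empty none element, which is not described in the list above and is assumed here to be the placeholder for a script position that should stay blank; the base R and the indices i, j, k are purely illustrative.

```xml
<!-- Postscripts come in (subscript, superscript) pairs, one pair per column.
     Column 1: subscript i, no superscript.
     Column 2: no subscript, superscript j.
     After mprescripts: one prescript column with subscript k only. -->
<mmultiscripts>
  <mi>R</mi>
  <mi>i</mi>
  <none/>
  <none/>
  <mi>j</mi>
  <mprescripts/>
  <mi>k</mi>
  <none/>
</mmultiscripts>
```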
Examples
We begin with a somewhat artificial example which shows the difference between
nested msub and msup elements and a single msubsup:
<mrow>
  <msup>
    <msub>
      <mi>x</mi>
      <mn>1</mn>
    </msub>
    <mi>α</mi>
  </msup>
  <mo>+</mo>
  <msubsup>
    <mi>x</mi>
    <mn>1</mn>
    <mi>α</mi>
  </msubsup>
</mrow>
Our second example shows how one can control movable limits on large
operators, using an mstyle construction:
<mrow>
  <mstyle displaystyle='true'>
    <munderover>
      <mo>&sum;</mo>
      <mrow>
        <mi>i</mi>
        <mo>=</mo>
        <mn>1</mn>
      </mrow>
      <mi>&infin;</mi>
    </munderover>
    <msup>
      <mi>x</mi>
      <mi>i</mi>
    </msup>
  </mstyle>
  <mo>+</mo>
  <mstyle displaystyle='false'>
    <munderover>
      <mo>&sum;</mo>
      <mrow>
        <mi>i</mi>
        <mo>=</mo>
        <mn>1</mn>
      </mrow>
      <mi>&infin;</mi>
    </munderover>
    <msup>
      <mi>x</mi>
      <mi>i</mi>
    </msup>
  </mstyle>
</mrow>
A final example illustrates the use of the accent attribute:
<mrow>
  <mover>
    <mi>G</mi>
    <mo>&Hat;</mo>
  </mover>
  <mo>+</mo>
  <mover accent='true'>
    <mi>G</mi>
    <mo>&Hat;</mo>
  </mover>
  <mo>+</mo>
  <mover accent='false'>
    <mi>G</mi>
    <mo>&Hat;</mo>
  </mover>
</mrow>
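The accentunder attribute works the same way for underscripts. The sketch below assumes the named entity &UnderBar; for the bar character; any other underscript symbol could be substituted.

```xml
<mrow>
  <!-- underscript treated as an ordinary script -->
  <munder accentunder='false'>
    <mi>x</mi>
    <mo>&UnderBar;</mo>
  </munder>
  <mo>+</mo>
  <!-- underscript treated as an accent, set closer to the base -->
  <munder accentunder='true'>
    <mi>x</mi>
    <mo>&UnderBar;</mo>
  </munder>
</mrow>
```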
Tables
MathML tables are a lot like HTML tables, except they have substantially more
attributes for controlling math-specific layout behaviors. Although the
attributes can get complicated, the basic usage is simple: an mtable
element contains any number of mtr table row elements, and mtr
elements contain any number of mtd table data cells.
- <mtable> row1 ... </mtable>
-
- The mtable element accepts a number of attributes for controlling
how that table is laid out. The rowalign and columnalign
attributes can be used to determine how the entries in rows and columns
should be aligned, e.g. "center", "left",
"top", etc. The rowlines, columnlines and frame
attributes can be used to draw separator lines. rowspacing, columnspacing,
equalrows, and equalcolumns determine the spacing between rows
and columns.
-
- <mtr> cell1 ... </mtr>
-
- The attributes of the mtr element are basically the same as the row
related attributes of mtable, but they only apply to that specific
row and not the whole table.
-
- <mtd> child1 ... </mtd>
-
- The mtd element accepts a number of the table attributes, just like
the mtr element, which can be used to over-ride values for one cell.
It also has two special attributes, rowspan and columnspan,
which can be used to make one cell span several rows or columns. This is
very useful for table headings.
Examples
Here is the markup for a simple matrix:
<mrow>
  <mi>A</mi>
  <mo>=</mo>
  <mfenced open='[' close=']'>
    <mtable>
      <mtr>
        <mtd><mi>x</mi></mtd>
        <mtd><mi>y</mi></mtd>
      </mtr>
      <mtr>
        <mtd><mi>z</mi></mtd>
        <mtd><mi>w</mi></mtd>
      </mtr>
    </mtable>
  </mfenced>
</mrow>
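The table attributes described above can be sketched with a slightly larger example. The attribute values chosen here ('left', 'center', 'top', 'solid') and the cell contents are illustrative assumptions, and how closely the result matches the comments depends on the renderer.

```xml
<!-- A framed two-column table with a heading that spans both columns -->
<mtable columnalign='left' columnlines='solid' frame='solid'>
  <mtr>
    <!-- heading cell: spans two columns and overrides the table alignment -->
    <mtd columnspan='2' columnalign='center'><mi>values</mi></mtd>
  </mtr>
  <mtr>
    <mtd><mi>x</mi></mtd>
    <mtd><mn>1</mn></mtd>
  </mtr>
  <!-- a row attribute applies to this row only -->
  <mtr rowalign='top'>
    <mtd><mi>y</mi></mtd>
    <mtd><mn>100</mn></mtd>
  </mtr>
</mtable>
```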
Next Steps
This section contains a lot of information to absorb. Remember the highlights:
- Presentation elements can be thought of as abstractions of typesetting layout
boxes. Each element represents a sort of "smart template" for
laying out subexpressions in a certain way, such as a fraction or a row.
- All character data (including entity references) must be wrapped in a
token element, such as mi, mn, and mo, which determines
how it will display.
- In addition to a number of general layout elements like mrow and msqrt,
there are families of elements for handling scripts and tables.
- There are many attributes which can be used with presentation elements.
Default values are inherited from parent element to child element.
Attributes set directly in an element's begin tag override inherited values.
The defaults can be modified by using the mstyle element.
Now that you are acquainted with presentation mark-up, in the final section, Containers
and Operators, we examine MathML content mark-up.