Mathematics on the Web:
MathML and MathType

Paul Topping pault@dessci.com
Design Science, Inc. http://www.mathtype.com
January 21, 1999

Abstract

Currently, there is no effective way of expressing standard mathematical notation in Web pages. Equations can be displayed as GIF images, but such images print poorly, can make pages slow to download, and do not adapt to the browser user's font choices.

MathML is a potential solution to the problem. As of April 1998, MathML is a Recommendation by the W3C (World Wide Web Consortium) for the representation of mathematics. It is based on XML (Extensible Markup Language), a successor to HTML (Hypertext Markup Language), the language of the Web. MathML can be used to express both the presentation of mathematics and its meaning (through high school level mathematics). MathML is human-readable but designed to be written by software, rather than humans.

MathType 4.0 can generate MathML for use in authoring Web pages with mathematics. It will do so via a new translator mechanism. Translators are defined using a simple language and may be customized by the end user. Several MathML translator definition files are supplied with MathType 4.0 and will produce MathML presentation tags.

With support for XML/MathML by the major browser vendors and authoring tool suppliers, MathML will be a good mechanism for bringing mathematics to the Web.

Mathematics on the Web

Background

Standard mathematical notation is used by millions of educators, students, engineers, scientists, and businesspeople around the world. It is the language of science. Although almost all modern word processing software provides support for the creation and editing of mathematical notation, there is virtually no support for it in Web page authoring software. The main reason for this lack is that HTML (Hypertext Markup Language, the language used to define Web pages) provides no way of expressing math notation.

Today, most Web pages that include math are created by adding links to GIF images of equations. GIF is the Graphics Interchange Format and is the standard image file format for line art (as opposed to JPEG, which is best for photographic images) on the Web. There are several tools available for creating equations as GIF images, including our MathType 3.5 product.

However, GIF images of equations are far from ideal and have several disadvantages:

The History of Math on the Web

It has long been recognized by the designers of the World Wide Web that the right way to support mathematical notation is to make it part of the Web page language, HTML. Some years ago, Dave Raggett of the W3C (World Wide Web Consortium, the organization responsible for creating and disseminating the standards that define the Web) proposed an extension of HTML that would allow math to be expressed. For complicated reasons, the proposal was never accepted. Since then, two things have happened relevant to math on the Web:

XML and MathML

XML

XML brings the power of SGML (Standard Generalized Markup Language) to the Web. SGML was invented to solve problems that governments and other large institutions were having in managing the large volumes of textual data that they have to deal with. HTML was itself designed using some of the ideas that originated with SGML and its predecessors. HTML's designers did not follow all of the guidelines dictated by the SGML approach, because those guidelines did not seem important in the early days of the Web. Now that the Web is starting to mature, the advantages of the SGML approach are beginning to be appreciated by the Web community. Luckily, because HTML already has many of the features of SGML, moving to XML will be easier than it would be otherwise.

XML was developed as a successor to HTML with the following advantages:

While a detailed description of XML is beyond the scope of this article (see [2] and [3] for good introductions), a simple example might give you the main idea. In HTML, one might show the author of a document as bold text by surrounding it with bold tags as, <b>John Q. Public</b>. In XML, you might surround the name with "author" tags as, <author>John Q. Public</author>. The formatting of the name would be specified by associating font, size, and character style information with the author tag via a style sheet mechanism (currently CSS [4], eventually XSL [5]). The important difference between the HTML and XML ways of handling the author's name is that XML captures the meaning of the chunk of text. One obvious application of this is in searching. With XML documents, one could search for all documents with a given author. With HTML, the best you could hope for would be to find all documents that contain the author's name. This would include documents that simply reference the author's work.
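The searching difference described above can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree module. This is only an illustration: the sample data and every element name other than "author" (library, document, title, cites) are assumptions invented for the sketch, not part of any real document format.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection of documents; every element name except
# <author> is an assumption made for this sketch.
SAMPLE = """
<library>
  <document>
    <title>Markup for Mathematics</title>
    <author>John Q. Public</author>
  </document>
  <document>
    <title>A Survey of Web Standards</title>
    <author>Jane R. Doe</author>
    <cites>John Q. Public</cites>
  </document>
</library>
"""


def titles_by_author(xml_text, name):
    """Return titles of documents whose <author> element equals name."""
    root = ET.fromstring(xml_text)
    return [
        doc.findtext("title")
        for doc in root.iter("document")
        if doc.findtext("author") == name
    ]


def titles_mentioning(xml_text, name):
    """Return titles of documents that contain name anywhere in their text,
    which is the best a plain text search over HTML could offer."""
    root = ET.fromstring(xml_text)
    return [
        doc.findtext("title")
        for doc in root.iter("document")
        if name in "".join(doc.itertext())
    ]
```

Searching by the author element finds only the document John Q. Public wrote, whereas the plain text search also returns the survey that merely cites him.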

Because both major Web browser manufacturers, Microsoft and Netscape, have pledged support for XML, it is destined to become an important World Wide Web standard. Several useful projects have already been based on XML. Peter Murray-Rust's Chemical Markup Language [6] can be used to capture chemical structures. Microsoft has based its Channel Definition Format [7] on XML, allowing Web data to be "broadcast" to your browser. Others, including MathML, are in the works.

MathML

The MathML specification was written by the W3C Math Working Group [1]. In April 1998, it was raised to Recommendation status by the W3C. MathML has as its main goals:

  • encode mathematical material suitable for teaching and scientific communication at all levels
  • encode both mathematical notation and mathematical meaning

    MathML is intended to be used both to present mathematical notation and as a medium of exchange between scientific and mathematical software. Toward that end, MathML defines a set of XML elements and attributes (together called markup) that fall into two categories: presentation markup and content markup. Presentation markup is intended to describe mathematical expressions from a two-dimensional layout point of view, whereas content markup is intended to capture the meaning of the mathematics.

    Because the body of mathematical knowledge and meaning is constantly expanding, it would be impossible to capture the meaning of all mathematics with a fixed set of MathML elements and their attributes. To keep the scope of content markup down to a reasonable size, the designers of MathML have restricted the mathematics that it attempts to cover to high school level mathematics. This is probably adequate to express most of the mathematics that it is practical to exchange between computer programs that generate or accept mathematical equations and calculate with them.
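To make the idea of exchanging meaning between programs concrete, here is a minimal Python sketch of a program that accepts content markup and calculates with it. The markup for the expression x² + 4x + 4 was written by hand for this illustration, and the evaluator understands only the handful of content elements it uses (apply, plus, times, power, ci, cn); the function names are hypothetical and this is not how any particular product works.

```python
import xml.etree.ElementTree as ET

# Content markup for the expression x^2 + 4x + 4, written by hand for
# this sketch. In an <apply>, the first child names the operator.
CONTENT = """
<apply><plus/>
  <apply><power/> <ci>x</ci> <cn>2</cn> </apply>
  <apply><times/> <cn>4</cn> <ci>x</ci> </apply>
  <cn>4</cn>
</apply>
"""


def evaluate(node, variables):
    """Recursively evaluate a small subset of content markup."""
    if node.tag == "cn":  # a number
        return float(node.text.strip())
    if node.tag == "ci":  # an identifier, looked up by name
        return variables[node.text.strip()]
    if node.tag == "apply":
        operator, *operands = list(node)
        values = [evaluate(child, variables) for child in operands]
        if operator.tag == "plus":
            return sum(values)
        if operator.tag == "times":
            result = 1.0
            for value in values:
                result *= value
            return result
        if operator.tag == "power":
            base, exponent = values
            return base ** exponent
        raise ValueError("unsupported operator: " + operator.tag)
    raise ValueError("unsupported element: " + node.tag)


def evaluate_expression(xml_text, **variables):
    """Parse content markup and evaluate it with the given variable values."""
    return evaluate(ET.fromstring(xml_text), variables)
```

Calling evaluate_expression(CONTENT, x=-2) substitutes the value for x and computes the result. Nothing comparable is possible from layout information alone, which is why content markup exists alongside presentation markup.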

    For uses where expressing mathematical meaning is not important or not practical, MathML has presentation markup. Much of mathematical notation is ambiguous unless interpreted by human authors and readers, and even then only with respect to some sub-field of mathematics or science. For example, a bar over a letter might mean the inverse of some signal in electronics, whereas in other areas it might signify the value of the variable in the last step of some iterative algorithm. Presentation markup has as its immediate goal describing mathematical notation just well enough for a Web browser (or an add-on software module to a Web browser) to display it.

    Below is a simple example of MathML's presentation markup for the following simple equation:

    x² + 4x + 4 = 0

    The presentation tags generally start with "m" and then use "o" for operator, "i" for identifier, "n" for number, and so on. The "mrow" tags group their contents into horizontal rows.

    <mrow>
      <mrow>
        <msup> <mi>x</mi> <mn>2</mn> </msup>
        <mo>+</mo>
        <mrow>
          <mn>4</mn>
          <mo>&InvisibleTimes;</mo>
          <mi>x</mi>
        </mrow>
        <mo>+</mo>
        <mn>4</mn>
      </mrow>
      <mo>=</mo>
      <mn>0</mn>
    </mrow>

    Although this may seem verbose, remember that it is not intended that humans type this language. Instead, it is expected that it will be created using software tools like MathType.
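As a rough illustration of markup being written by software rather than by hand, the following Python sketch builds presentation markup for equations of the form x² + bx + c = 0 using the standard library's xml.etree.ElementTree module. The helper names (leaf, row, power, monic_quadratic, to_markup) are invented for this sketch, it handles only positive integer coefficients, and it says nothing about how MathType itself generates MathML.

```python
import xml.etree.ElementTree as ET

INVISIBLE_TIMES = "\u2062"  # the character named by &InvisibleTimes;


def leaf(tag, text):
    """Create a token element such as <mi>x</mi> or <mn>4</mn>."""
    element = ET.Element(tag)
    element.text = text
    return element


def row(*children):
    """Group children horizontally in an <mrow>."""
    element = ET.Element("mrow")
    element.extend(children)
    return element


def power(base, exponent):
    """Attach a superscript to a base with <msup>."""
    element = ET.Element("msup")
    element.extend([base, exponent])
    return element


def monic_quadratic(b, c, variable="x"):
    """Presentation markup for x^2 + b*x + c = 0.

    Assumes b and c are positive integers; signs and coefficients of 1
    are not handled in this sketch.
    """
    left = row(
        power(leaf("mi", variable), leaf("mn", "2")),
        leaf("mo", "+"),
        row(leaf("mn", str(b)), leaf("mo", INVISIBLE_TIMES), leaf("mi", variable)),
        leaf("mo", "+"),
        leaf("mn", str(c)),
    )
    return row(left, leaf("mo", "="), leaf("mn", "0"))


def to_markup(element):
    """Serialize to ASCII text; non-ASCII characters such as the invisible
    times operator are written as numeric character references."""
    return ET.tostring(element).decode("ascii")
```

Calling to_markup(monic_quadratic(4, 4)) yields the same element structure as the hand-written example, with the invisible multiplication written as a numeric character reference instead of a named entity.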

    MathML's Promise

    Once MathML becomes more of a reality (i.e. supported in browsers, tools for creating/editing, support in calculation applications, etc.), people will be able to use it to create some wonderful applications:

    Technical documents

    Proper browser support for MathML will allow technical documents, such as journal articles, to be created as web pages. These will be better than those produced with current methods such as PDF (Adobe Acrobat), IBM Techexplorer, Mathematica Reader, etc., as these programs take over the entire browser window. With MathML, math can be copied from such web pages into the user's own work and used as the basis for further calculation and analysis.

    "Live" Mathematics

    It will be possible to create web pages that implement fancy calculators that show mathematical expressions in standard math notation. Other pages can demonstrate math, science, and engineering concepts where the user plugs in numbers and mathematical expressions, clicks a button, and sees the results presented graphically.

    Teaching Tools

    Teachers can prepare tests as web pages, making use of any of the techniques outlined above. As a counter to cheating, numbers and variables can be changed algorithmically in order to present a slightly different test to each student, while still distinguishing right answers from wrong ones.
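One way such algorithmic variation might work is sketched below in Python. Each student's question is derived from a random number generator seeded with the exam name and a student identifier, so the same question can be regenerated at grading time to check the answer. The function names, the choice of quadratic equations with integer roots, and the seeding scheme are all assumptions made for illustration.

```python
import random


def make_question(student_id, exam_name="quiz-1"):
    """Build a quadratic x^2 + b*x + c = 0 with integer roots.

    The question depends only on the student and the exam, so it can be
    regenerated later without storing it.
    """
    rng = random.Random(f"{exam_name}:{student_id}")
    r1 = rng.randint(-9, 9)
    r2 = rng.randint(-9, 9)
    # Expanding (x - r1)(x - r2) gives x^2 - (r1 + r2)x + r1*r2.
    b = -(r1 + r2)
    c = r1 * r2
    return {"b": b, "c": c, "roots": {r1, r2}}


def is_correct(student_id, submitted_roots, exam_name="quiz-1"):
    """Regenerate the student's question and compare the submitted roots."""
    return set(submitted_roots) == make_question(student_id, exam_name)["roots"]
```

The coefficients b and c would then be turned into MathML for display, while the roots stay on the server for grading.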

    Support for MathML in MathType 4.0

    MathType 4.0's Translators

    We at Design Science are working on a major new version of our popular MathType software, the Windows version to be released in February of 1999. One of its most important new features is a powerful translator mechanism. Earlier versions of MathType have had built-in conversion to the TeX language, a powerful but hard-to-use typesetting system for technical documentation. In MathType 4.0, we have extended this capability to allow the translation of its equations into many other languages. The translation process is controlled by a "translator definition file", a text file containing simple translation commands. A MathType installation allows for any number of translator definition files, giving several powerful advantages:

    MathType 4.0 and MathML

    MathType 4.0 includes several MathML translators, one for each of the web browser plug-ins that are currently available:

    All of these MathML translators are much the same. They differ chiefly in the "wrapper" code required by their corresponding plug-ins. Eventually, it is expected that MathType will ship with only one MathML translator.

    Although MathML is now a W3C Recommendation, MathML support is somewhat experimental at this point for several reasons:

    We at Design Science and the W3C Math Working Group (Math WG) are working on solutions to these problems. To keep up-to-date on MathML support, visit the Math WG's page at http://www.w3.org/Math/.

    Design Science will be updating its translators as MathML support in web browsers matures. Although MathType 4.0 will not contain a translator that can convert arbitrary MathML into MathType equations, it will be able to edit MathML material generated by its MathML translator. Eventually, we plan to support two-way conversion of both presentation and content markup.

    Web Browser Support for MathML

    Limitations of Current Web Browser Plug-ins

    Although there are browser plug-ins available that will display MathML (see MathType 4.0 and MathML), they really do not provide a completely satisfactory solution to displaying MathML in web pages. Each of them has one or more of the following limitations:

    Hope for the Future

    As of this writing, the makers of the most popular web browsers, Microsoft and Netscape, are devoting some time and energy to making it possible to add MathML support to their next-generation, version 5.0 browsers. Once this is complete, in theory, software developers will be able to add much more complete and powerful support for MathML into these browsers. We at Design Science expect this to happen by the end of 1999.

    On a somewhat separate front, there are several technologies being worked on by various W3C groups for which the major browser makers have pledged support:

    When (and if) some or all of the above becomes a reality, it should be possible for software developers to add MathML support to browsers in a completely standards-compliant way. Then it will be simply a matter of waiting for the promised support for these standards to materialize.

    References

    1. W3C's HTML-Math Working Group, http://www.w3.org/Math/
    2. Presenting XML, Richard Light, 1997, Sams.net Publishing (ISBN 1-57521-334-6)
    3. XML: Principles, Tools, and Techniques, Dan Connolly (ed.), 1997, O'Reilly and Assoc., Inc. (ISBN 1-56592-349-9)
    4. Cascading Style Sheets (CSS), http://www.w3.org/TR/REC-CSS1
    5. Extensible Style Language (XSL), http://www.w3.org/Submission/1997/13/
    6. Chemical Markup Language (CML), http://xml.coverpages.org/cml.html
    7. Channel Definition Format (CDF), http://en.wikipedia.org/wiki/Channel_Definition_Format
    8. "HTML-Math", Robert R. Miner and Patrick D. F. Ion, article in [3].