Updated: March 2, 2011
Design Science, Inc.
Modern document editing software packages (e.g., word processors and web page editors) often provide access to a set of specialized tools or plug-in software components to perform specialized sub-tasks of the overall document-creation task. The advantage of this approach is that the best tool may be chosen for each job. The disadvantage is that the tools often do not integrate well with the document editor. This is particularly a problem for an equation editor plug-in.
A mathematical notation editor (equation editor) is a tool that, at first glance, should be easy to implement as a document editor plug-in. However, its integration with the document editor presents some unique problems. These are presented in this paper along with suggested solutions. A few of the problems absolutely must be dealt with in the document editor for equation editing to be even minimally useful to users who work with mathematics. We cannot stress the importance of this enough; making this point is the main reason this document was written. We have marked these must-have features with (crucial).
The intended audience for this paper includes software designers and developers that create document editing systems and their plug-in mechanisms. It is our hope that the knowledge given here will help improve their products.
The ability to handle mathematical notation in a document editor (word processor, page layout, presentation, web content editor, blog editor, etc.) is a requirement in many areas of human endeavor: the teaching of math, science, engineering, and social sciences; engineering and scientific research; economics; and business. Mathematics is behind almost all modern technology, and it all has to be documented, presented, and taught. Mathematics is really a special kind of text, so it will be present in some percentage of all textual communication regardless of the application or medium, at least insofar as our software applications can handle it.
Publishing documents containing mathematical notation has been considered difficult for a long time, probably for as long as such documents have been published. Typesetters used to refer to it as "penalty copy" because they had to charge the customer extra for it. It should therefore be no surprise that specialized software tools have been created to edit mathematical notation.
Some document editors have built-in support for mathematical notation. However, this is always rudimentary. An analogous situation exists with painting and drawing features. Some document editors provide built-in support for creating drawings, but it is fairly weak compared to standalone painting and drawing applications. We are firm believers in using the right tool for each job. Modern computer users work with a variety of applications. Those who work with math would like to use the same tool to create and edit math regardless of the context, just as an artist would like to work with Photoshop or Illustrator regardless of where the drawing or painting is going to be placed.
In this section, we provide a little background information on the relationship of math to the document that contains it and define some terms that are used in the rest of this paper.
At first glance, it may seem that the relationship between an equation and the document that contains it is similar to that of a picture or graphical image. However, the information that a document editor needs to know about a picture is minimal: perhaps its size and the colors it uses. The document editor may allow the user to change the size, position, frame, cropping, etc., but none of these actually requires knowledge of the picture itself. Equations, however, are much more intimately tied to the document.
Math notation is simply a special kind of formatted text. It is based on the font, size, and style chosen for the body text of the document in which it is found, and then adds special symbols, character styles, various alphabets (Greek, Hebrew), rules, and other notational structures. It is this special relationship that math has with the surrounding text that is at the root of the problems described in this paper. Whenever a decision needs to be made where the question arises, "How should this affect equations in the document?", the "math is just fancy text" rule should be followed or at least consulted.
Equations are normally positioned within a document in one of two ways. The first is called "inline", as in "E = mc²" in the following paragraph:
A "display" equation is placed in its own paragraph and is often centered within the text column in which it appears:
Display equations are often numbered, with the number usually aligned with the column edge. Equations are usually referred to by number elsewhere in the document, just like figure numbers.
The author (or editor) usually chooses inline or display based on the height of the equation. If the inline equation would collide with the lines above and below, it must either be reformatted to avoid this or turned into a display equation. There may be other reasons to display an equation, for example, to be able to give it an equation number.
It is important to note that the line spacing in a paragraph is never changed to accommodate the inline equation. The rule is: if it doesn't fit, change the equation so it does fit, or turn it into a display equation.
A typical technical paper might have 20 equations per page. A simple 10-page paper would have perhaps 200 equation files. This can cause problems when the document editor's plug-in mechanism is implemented with the expectation of one or two objects per page, with perhaps a dozen in the entire document. Along with the "math is just fancy text" principle, this "many equations per document" principle affects many aspects of the document editor's design and is one of the most important for implementers to keep in mind.
In this section, we discuss challenges to the document editor design in the area of layout of the page and geometric relationships between the math and the rest of the text.
Properly handling equations (and equation editing plug-ins) presents many challenges to the designer of a document editing system. In this section, we describe the challenges and discuss possible solutions.
Inline equations always have a natural baseline and it must be aligned with that of the surrounding text, as shown here.
If the document editor positions the equation within the page via the equation's bounding box, the equation (or the equation editor) expresses the baseline position within the equation as some distance above the bottom edge of the box. There are some points worth noting:
The following screen capture from Microsoft Word shows a line containing an equation object followed by the same line where the equation was simply typed directly in Word.
As you can see, the spacing around the equation object is too large and does not match that of the non-equation line. In mathematical typesetting, the characters of an inline equation are spaced with respect to the surrounding text as if the equation were a single word. This is a prime example of the "math is just fancy text" principle.
The solution to this problem is for the equation editor to provide the document editor with the location of starting and ending character positions within the equation's bounding box:
The line should then be laid out as if the equation object were a single word.
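The positioning rule described above can be sketched in a few lines. This is only an illustration: the `InlineEquation` record, its field names, and `place_inline` are hypothetical and not part of any real plug-in interface. It assumes the equation editor reports its baseline as a distance above the bottom of its bounding box, plus the x positions of its first and last characters within that box.

```python
from dataclasses import dataclass


@dataclass
class InlineEquation:
    """Hypothetical metrics an equation editor might report (in points)."""
    width: float         # bounding-box width
    height: float        # bounding-box height
    baseline: float      # baseline's distance above the box's bottom edge
    first_char_x: float  # x of the first character's origin inside the box
    last_char_x: float   # x where the last character's advance ends


def place_inline(eq, pen_x, baseline_y):
    """Place the box as if the equation were a single word (y grows upward).

    The box is shifted left by its leading padding so the first character
    starts at the pen position, and the pen advances only to where the last
    character ends. Returns (box_left, box_bottom, next_pen_x).
    """
    box_left = pen_x - eq.first_char_x
    box_bottom = baseline_y - eq.baseline
    next_pen_x = pen_x + (eq.last_char_x - eq.first_char_x)
    return box_left, box_bottom, next_pen_x
```

With these assumptions, the padding that the bounding box carries on either side no longer contributes to the spacing between the equation and the neighboring words.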
Just as positioning of inline equations must treat the equation as a single word, so must the document editor's line-breaking process. Referring to the example in the last section, if the equation ends up at the end of a line, it is important that the comma immediately after it not wrap to the next line. Note that this is probably the opposite of what happens with an inline graphical image. This situation is made even more difficult by the possibility that the user places the comma inside the equation, rather than outside as in the example.
This problem is again solved by applying the "math is just fancy text" principle. Line layout should treat the equation as it would a single word in the line.
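As a rough sketch of this idea, the following greedy line breaker works on "unbreakable units". An equation followed directly by punctuation forms one unit, just as a word followed by a comma would. The function name, the unit representation, and the widths are all assumptions made for the example, and real line breaking is considerably more involved.

```python
def break_lines(units, max_width, space=1):
    """Greedy line breaking over unbreakable units.

    Each unit is a list of (label, width) pieces that must stay together,
    e.g. [("<eq>", 6), (",", 1)] for an equation followed by a comma.
    Returns a list of lines, each a list of unit labels.
    """
    lines, current, used = [], [], 0
    for unit in units:
        unit_width = sum(width for _, width in unit)
        needed = unit_width if not current else used + space + unit_width
        if current and needed > max_width:
            # The whole unit moves to the next line; it is never split.
            lines.append(current)
            current, needed = [], unit_width
        current.append("".join(label for label, _ in unit))
        used = needed
    if current:
        lines.append(current)
    return lines
```

For example, with `units = [[("where", 5)], [("<eq>", 6), (",", 1)], [("then", 4)]]` and `max_width=12`, the equation and its comma wrap together onto the second line rather than being separated.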
Most word processors try to automatically modify the line spacing in a paragraph to accommodate the height and/or depth of an inline object. This results in uneven line spacing like this:
The situation is made worse by the fact that these programs are conservative in deciding whether the object overlaps with the adjacent lines. The test is most likely performed by taking the object's bounding box, which already has a little padding on it, and seeing if it intersects with the descent of the line above or the ascent of the line below, where ascent and descent are the maximum defined for all the fonts used in the lines in question.
With math, the user wants to take responsibility for line spacing. A document editor might alert the user to the possible problem, but it shouldn't automatically change the line spacing. See Inline vs. Display Equations above, for more discussion.
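One way a document editor might "alert the user" without touching the line spacing is sketched below. The function names and parameters are hypothetical. The sketch assumes a fixed baseline-to-baseline distance and uses the ascent and descent actually present in the adjacent lines, rather than the font-wide maximums that make the usual test so conservative.

```python
def inline_overlap(eq_ascent, eq_descent, line_pitch, text_ascent, text_descent):
    """Return (overlap_above, overlap_below) in points; 0 means no collision.

    eq_ascent / eq_descent: extent of the equation above / below its baseline.
    line_pitch: fixed baseline-to-baseline distance of the paragraph.
    text_ascent / text_descent: extents actually used by the adjacent lines.
    """
    room_above = line_pitch - text_descent  # gap up to the line above
    room_below = line_pitch - text_ascent   # gap down to the line below
    return (max(0.0, eq_ascent - room_above),
            max(0.0, eq_descent - room_below))


def spacing_warning(eq_ascent, eq_descent, line_pitch, text_ascent, text_descent):
    """Return a warning string for the user, or None; never changes spacing."""
    above, below = inline_overlap(eq_ascent, eq_descent, line_pitch,
                                  text_ascent, text_descent)
    if above or below:
        return (f"Inline equation overlaps adjacent lines by "
                f"{above:g} pt above and {below:g} pt below")
    return None
```

The decision about what to do with the warning (reformat the equation or turn it into a display equation) stays with the user.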
Display equations often occur in bunches, where they are aligned by some point within each equation, often at equal signs or other relational operators:
This functionality could be provided by adding an equation-specific tab-stop type (or perhaps some kind of object alignment tab-stop so that other kinds of objects might participate in such alignments). The equation editor would simply supply the position of the alignment point relative to the equation's bounding box. This facility could be generalized such that an equation, or any embedded object, would provide multiple reference points within itself that the document editor could use for alignment.
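A minimal sketch of how such an alignment point could be used follows. It assumes each equation reports its width and the distance of its alignment point from the left edge of its bounding box; the function name and the decision to center the group in the column are assumptions made for illustration.

```python
def align_equations(equations, column_width):
    """Compute the left x position of each equation in an aligned group.

    equations: list of (width, align_x) pairs, where align_x is the distance
    of the alignment point (e.g. the equals sign) from the box's left edge.
    The group as a whole is centered within the column.
    """
    left_extent = max(align_x for _, align_x in equations)
    right_extent = max(width - align_x for width, align_x in equations)
    group_width = left_extent + right_extent
    # x coordinate in the column where every alignment point will sit
    align_at = (column_width - group_width) / 2 + left_extent
    return [align_at - align_x for _, align_x in equations]
```

For two equations of widths 40 and 60 with alignment points at 10 and 30, in a column 100 units wide, both alignment points land at x = 50.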
As mentioned earlier, it is quite common to have numbers applied as captions to display equations:
Such numbers are often positioned flush-right, but sometimes flush-left. They must also have their baselines aligned with that of the corresponding equation. Other than these issues, a document editor's caption facility is usually sufficient for equation numbering.
It is common to desire to have a tall, multi-line display equation split across a page boundary. This is similar to how long tables must be handled. This can be implemented by having the equation communicate to the document editor a set of vertical positions within the equation's bounding box at which a page-break can be safely performed.
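Choosing among the reported break positions could look like the sketch below. The name `choose_break` and the convention of measuring positions down from the top of the bounding box are assumptions for the example.

```python
def choose_break(safe_breaks, remaining_height):
    """Pick where to split a tall display equation at a page boundary.

    safe_breaks: distances from the top of the equation's bounding box at
    which the equation says a page break is acceptable.
    remaining_height: vertical space left on the current page.
    Returns the lowest safe break that still fits, or None if none fits
    (in which case the whole equation would start on the next page).
    """
    fitting = [pos for pos in safe_breaks if pos <= remaining_height]
    return max(fitting) if fitting else None
```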
When a wide equation is forced to fit into a narrow column, it should be reformatted to fit the column width. The world of mathematical typesetting has some guidelines for how this is to be done. For example, a long polynomial should be split at its plus and minus signs, never within a term. Optionally, the ending plus/minus of one line is repeated at the start of the next line, as in the following example:
This requires considerable negotiation between the equation object and the document editor. Basically, during layout the document editor informs the equation object of its column width along with other ambient information. With this knowledge, the equation object lays out the equation and responds with its bounding box.
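The equation's side of this negotiation might be sketched as follows. The example is deliberately simplified: widths are approximated by character counts rather than real font metrics, the input is a flat list alternating terms and operators, and the function name is hypothetical. It breaks only at plus and minus signs and can optionally repeat the operator at the start of the next line.

```python
def split_polynomial(tokens, max_chars, repeat_operator=False):
    """Break a polynomial at its operators so each line fits max_chars.

    tokens alternate term, operator, term, ..., for example
    ["x^3", "+", "2x^2", "-", "5x", "+", "7"]. A single term wider than
    max_chars is left on its own line, since terms are never split.
    """
    lines, current = [], tokens[0]
    for op, term in zip(tokens[1::2], tokens[2::2]):
        extended = f"{current} {op} {term}"
        # Reserve room for a trailing operator in case the line must end here.
        if len(extended) + len(op) + 1 <= max_chars:
            current = extended
        else:
            lines.append(f"{current} {op}")
            current = f"{op} {term}" if repeat_operator else term
    lines.append(current)
    return lines
```

In a real implementation, the equation object would report the bounding box of the resulting multi-line layout back to the document editor, which might then need to lay out the surrounding text again.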
Small equations are sometimes used to label the axes of a graph or the columns of a chart. Some high-end graphing programs allow embedding of equations in a graph, much like document editors allow embedding of equations in text. However, this can not be relied upon and, even if such a capability is available, it may not be desirable. Instead, the user might want to place the equation relative to the chart, perhaps by simply dragging it into place. Such an ability might be considered a third equation positioning mode, after inline and display.
Because math is just fancy text, the formatting of an equation depends heavily on the properties of its surrounding text. For example, if a document paragraph is set in 10-point Times, the equation will also use 10-point Times (perhaps italics for variables, proportionally smaller sizes for subscripts and superscripts, etc.). Similarly, if the body text is dark gray on a light peach background, the equation text should be as well. There are several properties that the equation should inherit from the corresponding document properties at the place where the equation is inserted. Collectively, we call these "ambient properties".
When an equation is created, the equation editor will make use of the document's ambient properties in order to define the corresponding equation properties. The ambient properties are:
Whenever the ambient properties change, the equation needs to be reformatted so that it can adjust to the changes. There are two kinds of document-editing operations where such changes occur: (a) when equations (or a text selection containing equations) are moved from one place in the document to another (e.g. drag-and-drop or cut-and-paste) or (b) when the properties are changed (e.g. selecting a paragraph containing equations and changing its point size).
In this section, we'll consider how hosting an equation editor within a document editor affects the user interfaces of both pieces of software. As with most of this paper, the motivating principle is "math is just fancy text". Application of this principle to user interface design implies that the transition from typing normal text in a paragraph to typing math should be as smooth as possible. The same applies to other user interface activities.
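The reformat-on-change behavior can be sketched as below. The fields of `AmbientProperties` are taken only from the examples mentioned in the text (font, size, and colors) and are not a complete list; the class and method names are hypothetical, and the layout step is a stand-in counter.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AmbientProperties:
    """Illustrative subset of properties inherited from the body text."""
    font_family: str
    point_size: float
    text_color: str
    background_color: str


class Equation:
    def __init__(self, source, ambient):
        self.source = source
        self.ambient = ambient
        self.layout_count = 0
        self.reformat()

    def reformat(self):
        # Stand-in for real layout: just count how often it runs.
        self.layout_count += 1

    def set_ambient(self, ambient):
        """Called by the document editor after a move or a property change."""
        if ambient != self.ambient:
            self.ambient = ambient
            self.reformat()
```

For instance, if `body = AmbientProperties("Times", 10, "dark gray", "light peach")`, then pasting the equation into a paragraph with `replace(body, point_size=12)` triggers one reformat, while moving it between two paragraphs with identical properties triggers none.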
Ideally, moving the insertion point across the boundary between normal text and equation text should be smooth. If the user hits the right arrow key when the insertion point is immediately to the left of an equation, the insertion point should move to just after the left-most character or element of the equation. There probably should be some sort of accompanying graphical indication of the transition into "equation editing mode" to indicate the availability of additional math-related keyboard shortcuts. If the user moves the insertion point in word chunks (Control-arrows in Windows), the whole equation may be skipped.
Depending on performance, appearance of math-editing toolbars, and other such considerations, it may make sense for insertion point entry into an embedded equation to be more explicit. For example, rather than having the right arrow key enter an equation, the user might be required to type Command-E to enter the equation.
However the transition is made, it needs to be smooth. Obviously, this is a somewhat aesthetic consideration. For a good example of how NOT to do it, look to Microsoft Word. To edit an existing equation within the document, the user must double-click on it. This causes Equation Editor (or MathType) to be activated. Its toolbar is displayed, ready for the user to perform equation-editing operations. Unfortunately, Word hides its own toolbars and moves the document text upward to take up the space. This movement of the text is very distracting to the user. Ironically, Word hides its toolbars to make room for Equation Editor's own toolbars but, as its toolbars are floating, rather than docked at the top of the window like Word's, hiding them is completely unnecessary.
Most text editing of documents takes place at a viewing scale that is either 100% or close to it. Unfortunately, small features of equations (e.g. subscripts and superscripts, accents) are hard to read at that scale. MathType defaults to a viewing and editing scale of 200% and provides options that allow the user to zoom in even closer. It is unreasonable to expect the user to change viewing scale on the entire document just to work with an equation. Instead, the equation editor could pop up an editing window at a higher zoom level directly over the site of the equation within the document. Such a window should have a very thin border and, perhaps, a drop shadow to make it stand out against the document.
Immediately after leaving equation-editing mode, it is important that the document editor reformat the paragraph containing the equation. All of the equation's dimensions (bounding box, baseline, alignment points) may have been changed as a result of the equation editing session. If the equation (or some part of it) is aligned with an equation in another paragraph, multiple paragraphs may need to be updated.
Since an equation editor plug-in is expected to provide some user commands that apply to multiple equations in part or all of the document, it will need to be able to add its own user interface to that of the document editor. At a minimum, there needs to be a way for a new equation to be inserted. Applications whose plug-in mechanism is based on the Windows OLE (Object Linking and Embedding) facility, such as Microsoft Office and many other applications, provide an Insert Object dialog that presents a list of possible object types to insert into the document. If this is the only way to insert an equation, the user will be forced to perform too many interactive steps. A shortcut must be provided.
Equation editors provide many different commands that can be used to create and edit equations. It is essential that users be able to make use of keyboard shortcuts to speed these operations. The equation editor should be free to implement its own, private system of keyboard shortcuts, independent of the document editor's. Of course, this implies that there be an explicit equation editing mode that the user must enter while editing equations. The operating system may provide this for free if all keystroke events are routed to an equation editor window when it is displayed on top of all other windows. However, it is important that the document editor not intercept all keyboard events, thereby preventing the equation editor's keyboard mechanism from working.
There are just two issues involving graphical rendering of equations that are of concern:
Equations are normally transparent. If a paragraph has a background color, so should the equation within it, by default.
Most document editors clip embedded graphics to their bounding box. Presumably, this is to prevent the graphics from marking outside its box. Depending on operating system and printer driver specifics, this can create a problem for equations. If the printer driver implements clipping at the character level, rather than the printer pixel level, entire characters can be clipped out of the printed output even though the pixels in the character's image would not go outside the bounding box. This is because character-level clipping is often performed by using a character bounding box that would enclose ALL characters in the font, rather than the true bounding box of the specific character.
Clipping to the equation's bounding box also clashes with techniques commonly employed by equation editors to overcome a document editor's inability to deal with equation kerning (see Horizontal Positioning of Inline Equations above) and line spacing (see Line Spacing above). The technique involves the equation editor "lying" by reporting a bounding box that is smaller than the box that encloses its pixels. This only works, of course, if the equation can draw outside the box.
Many page layout programs handle storage for the non-text components of a document (images, spreadsheets, etc.) by simply storing links to the original files within the document. This arrangement is certainly an efficient use of disk space and has some advantages. It allows any and all image editing programs to be used to edit an included image file without such programs needing to have a special relationship with the document editor, for example. The main disadvantage is that the user has to keep all the separate files together with the document.
With equations, keeping each in a separate file becomes unwieldy (see Many Equations per Document above). Just coming up with meaningful names for all the files would be a pain. Instead, equations must be saved within the document itself using facilities that the document editor provides.
Although storing embedded objects within a document can be implemented any number of ways, it is perhaps useful to look at how Microsoft Office does it. Office applications use a Windows operating system facility called "structured storage". Structured storage is a scheme whereby an entire filing system is implemented within a single file. Each Microsoft Word document is a single file but the Word program can view it as a directory containing both files and sub-directories. The text of the document is stored in a "text stream" file, whereas embedded objects (equations, images, etc.) are individual files within an "object" directory — all within the single document file. Each of the embedded objects has an internal file name that is referenced in the text stream.
Equation editing and equation layout often involve user-specified preferences: font choices, math-specific spacing factors, user-defined keyboard shortcuts, and zoom level to name just a few. The scope of a particular preference value may be limited to a single equation, a document section (e.g. a chapter, all footnotes, etc.), the entire document, per-user, per-computer, per-LAN, or some combination of these things.
An equation editor can deal with the per-user, per-computer, and per-LAN preferences by storing them in preference files or an operating system facility such as the Windows Registry, just like any other software application. Per-equation preferences can be saved with the actual equation data, as described in the previous section. Per-document and per-document-section preferences present a problem, however. This requires that the document editor provide facilities for attaching arbitrary data to the entire document and, ideally, to individual parts of the document (sections, chapters, all footnotes, etc.).
Being able to store arbitrary metadata with a document, or with parts of a document, is becoming a requirement for document editors. Content management systems, indexing systems, and the like, all require associating data with a document beyond the simple title, author, and keywords storage often provided by document editors. Equation preferences can be stored using the document's metadata facilities. All that is required is for the document editor to make access to its metadata facilities available to its plug-in software components.
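Resolving a preference across these scopes could be sketched with Python's standard `collections.ChainMap`. The precedence order used here (the narrowest scope wins) is an assumption, as are the function name and the preference keys; the per-section and per-document dictionaries stand in for whatever the document editor's metadata facility would return.

```python
from collections import ChainMap


def effective_preferences(equation=None, section=None, document=None,
                          user=None, computer=None, lan=None):
    """Combine preference scopes so that narrower scopes override wider ones.

    Each argument is a dict of preference values or None. Lookups on the
    result fall through from per-equation to per-LAN settings.
    """
    scopes = [equation, section, document, user, computer, lan]
    return ChainMap(*[scope for scope in scopes if scope])
```

A lookup such as `prefs["zoom"]` then returns the per-equation value if one exists, and otherwise falls back to the document, user, and wider scopes in turn.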
Formatting equations with the level of quality demanded by book and journal publishers may involve control over many spacing factors, font sizes, and other settings. It would be inconvenient if the author had to set these explicitly every time a new document is begun. Most document editors provide a stylesheet and/or document template facility to help with the analogous problem for other document settings. It is highly desirable that equation settings be integrated into this facility.
Following the "math is just fancy text" principle, it should come as no surprise that it is desirable for equations to participate in some of the document editor's commands. Commands that might be affected include:
In fact, all document editor commands should be considered for possible participation by the equation editor plug-in.
In addition, if the document editor supports multiple, discontinuous selection within the document, it would be reasonable to provide a "select all equations" command so that style changes can be applied to all the equations in a document.
Most document editors provide facilities for reading document formats other than their native format. Some of these document formats might contain equation objects or mathematical material that could be converted into equation objects. For example, Microsoft Word documents may contain equation objects created using Equation Editor or MathType. It is highly desirable that document conversion software handle equations.
A document editor's import facilities are often implemented as a special kind of plug-in software called an "import filter". Import filters for popular document formats are usually part of the document editor product, but do not handle equations. It is usually also possible to create custom import filters in order to handle uncommon file formats.
Unfortunately, implementing a custom import filter is not a good solution to the problem of importing equations. Document conversion is a non-trivial process. It would be an extreme burden for the equation editor vendor to have to create a quality import filter from scratch. Instead, it would be much more practical to provide a mechanism whereby the processing of the standard import filter could be enhanced or overridden for equations only.
Once a document is created, modern document editors provide several forms of output from it. The most important of these are:
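One possible shape for such a mechanism is a per-object-kind override hook on the standard filter, sketched below. The `ImportFilter` class, its methods, and the item representation are all hypothetical; no real document editor's filter API is being described.

```python
class ImportFilter:
    """Stand-in for a document editor's standard import filter."""

    def __init__(self):
        self._overrides = {}

    def register_override(self, kind, handler):
        """Let a plug-in take over conversion for one kind of object."""
        self._overrides[kind] = handler

    def convert(self, items):
        """Convert (kind, payload) items, using overrides where registered."""
        converted = []
        for kind, payload in items:
            handler = self._overrides.get(kind, self._default)
            converted.append(handler(kind, payload))
        return converted

    @staticmethod
    def _default(kind, payload):
        # The standard filter passes text through and drops what it
        # cannot handle, leaving a placeholder.
        return payload if kind == "text" else f"[{kind} placeholder]"
```

The equation editor plug-in would call something like `register_override("equation", handler)` once, and the standard filter would continue to do all the non-equation work.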
It is important that the equation editor be involved in these output processes.
In modern document processing, imaging (drawing on screen or printer) is performed using these software layers:
Drawing of embedded document components, such as equations, is handled via one of two methods:
The first method allows for better quality output as the plug-in can probably make use of knowledge concerning the output device when drawing the equation. This can not be done with a metafile as the same metafile is probably used for display on the screen as well as all printers.
Font management is usually of concern to users of professional publishing systems. Such systems often have features that allow the user to:
As equations are font-intensive, and may use fonts that are not used elsewhere in the document, it is important that the equation editing plug-in be involved in these features. If the output technique involves a metafile, this can be done by the document editor by extracting the fonts used by the metafile. If the plug-in draws directly on the virtual drawing surface, it must be actively queried for a list of fonts used by each equation.
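For the direct-drawing case, the query might look like the following sketch. `EquationObject`, `fonts_used`, and `collect_fonts` are hypothetical names; the point is only that the document editor asks each equation for its fonts and merges the answers with its own list.

```python
class EquationObject:
    """Stand-in for an embedded equation that can report its fonts."""

    def __init__(self, fonts):
        self._fonts = list(fonts)

    def fonts_used(self):
        return set(self._fonts)


def collect_fonts(body_fonts, equations):
    """Return a sorted list of every font used by the text and equations."""
    fonts = set(body_fonts)
    for equation in equations:
        fonts.update(equation.fonts_used())
    return sorted(fonts)
```

Because a document may contain hundreds of equations, a real implementation would probably cache each equation's font list rather than query it on every use.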
Without involving the equation editor plug-in in HTML and XML output, the document editor will be limited to generating GIF or some other purely graphical output format. This may be acceptable for low-quality applications, but MathML is a better choice. MathML (see http://www.w3.org/Math) is an XML-based language for representing mathematics and has been an official Recommendation of the W3C (World Wide Web Consortium) since 1998. MathML, like other XML-based languages, expresses the structure and meaning of the mathematics. It is not practical for the document editor to generate MathML. Instead, it should call on the equation editor plug-in to do this conversion.
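To give a sense of what the plug-in would hand back, here is a small sketch that builds presentation MathML for E = mc² using Python's standard `xml.etree.ElementTree`. The function name is hypothetical, and a real equation editor would generate the markup from its internal equation model rather than by hand.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def emc2_mathml():
    """Return presentation MathML for E = mc^2 as a string."""
    math = ET.Element("math", xmlns=MATHML_NS)
    row = ET.SubElement(math, "mrow")
    ET.SubElement(row, "mi").text = "E"      # identifier
    ET.SubElement(row, "mo").text = "="      # operator
    product = ET.SubElement(row, "mrow")
    ET.SubElement(product, "mi").text = "m"
    ET.SubElement(product, "mo").text = "\u2062"  # invisible times
    power = ET.SubElement(product, "msup")   # base followed by superscript
    ET.SubElement(power, "mi").text = "c"
    ET.SubElement(power, "mn").text = "2"    # number
    return ET.tostring(math, encoding="unicode")
```

Unlike a GIF, the result keeps the distinction between identifiers, operators, and numbers, and records that the 2 is a superscript of c.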
As you can see, there needs to be a very strong connection between the document editor and its equation editor plug-in. While we have attempted to list all the important considerations in this paper, there will always be details that can not be foreseen. These can only be addressed by a software design team that is aware of the role of mathematics in the document and of the requirements that underlie the specific issues named here. We hope that this document is a good starting point.