This paper presents a practical way to substantially extend the capabilities of an established parser, a task that poses considerable challenges. While developing new procedures for SUDAAN®, a commercial statistical software package, we found the existing parser inadequate for the new requirements. Like many other parsers, it could be characterized as a no-repair, no-guesswork, no-backtracking LALR(1) (look-ahead, left-to-right) parser [1, p. 300]. This paper describes how the parser was enhanced to handle additional syntax for sophisticated mathematical and logical expressions. The new parser applies a noncanonical parsing technique, together with a Shunting-Yard-style algorithm and other techniques, as a second step after the original canonical LALR parse [2], yielding a powerful and efficient two-level parsing approach. Adding a second step to the proven one-step parser preserved its existing, well-tested capabilities while enabling it to parse more complex syntax.