Refactoring tools are incredibly popular in the programming community. Most modern programming environments provide refactoring tools of various degrees of sophistication.
But what are refactoring tools? In short, they are tools that modify programs without changing their runtime semantics. In other words, refactoring tools must not introduce an observable difference in the execution of the program. They help abstract common code, change variable names, rename procedures or methods, etc.
Refactoring tools help developers perform repetitive code restructuring tasks that would otherwise be highly error-prone if done by hand. Without such tools, even the simplest form of refactoring - renaming a variable in a file - can easily cause unexpected problems if done using a simple search and replace. Now imagine renaming a public method in an object-oriented language, where the method can be invoked from many different places in the whole project source code…
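To make the search-and-replace problem concrete, here is a minimal Python sketch. The program text and the variable names (`count`, `account`, `total`) are made up for illustration; the point is only to show where plain text substitution goes wrong.

```python
import re

# A tiny made-up program in which we want to rename the variable 'count' to 'total'.
source = "count = 1\naccount = count + 1\nprint('count:', count)\n"

# Naive search and replace: also rewrites the unrelated variable 'account'
# and the text inside the string literal.
naive = source.replace("count", "total")

# Whole-word matching leaves 'account' alone, but still rewrites the string
# literal. Telling code from strings requires parsing the program, which is
# what a real refactoring tool does.
word_only = re.sub(r"\bcount\b", "total", source)

print(naive)
print(word_only)
```

Both attempts change the output of the program, so neither qualifies as a refactoring in the sense defined above.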
Refactoring applied to speech recognition grammars
Like refactoring tools for programming languages, grammar refactoring tools help modify grammars without changing the language they accept or the values they return when interpreting sentences. A number of common tasks in writing speech recognition grammars can benefit from refactoring tools. Here are a few:
- Rule renaming. Naming things is hard. I am a programmer myself and I always find it hard to come up with the most precise name for a class, a variable, a procedure, or method. Naming grammar rules is just as hard. The programming environment should make it easy to rename a rule when we find a better name, in such a way that we don’t break the grammar. In other words, the renaming tool must rename the rule definition, as well as all its references (and potentially the root header). But just as important, semantic tags must also be taken into account when renaming a rule. How many times have you forgotten to modify the semantic tag after renaming a rule? A proper refactoring tool must therefore ensure that references to the rule in all semantic tags be modified as well.
- Slot renaming. Likewise, slot names are often renamed. A renaming tool must ensure that all references in the defining rule as well as the references in other rules be changed at once.
- Rule extraction. Another common task for the grammar writer is the extraction of a rule expansion to create a new rule. Grammars are often built incrementally. The grammar writer begins by coding a few rules, discovers potential for reuse, and creates new rules encapsulating these reusable parts. If the extracted parts contain semantic tags, it can be tricky (and highly error-prone) to modify them by hand and make sure that the semantic slots computed by the new rule are properly propagated to the referencing rule.
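As a rough illustration of what rule renaming has to touch, here is a small Python sketch. It is not how NuGram IDE works; it assumes an ABNF grammar in the semantics/1.0 tag format, where tags are written between braces and refer to rules as `rules.name`. The grammar and the `rename_rule` helper are made up for this example, and the regular expressions ignore cases a real tool must handle (nested braces in tags, quoted tokens, external references).

```python
import re

def rename_rule(grammar: str, old: str, new: str) -> str:
    """Rename rule `old` to `new` (hypothetical helper, regex-based sketch)."""
    # 1. The definition, every $old reference, and the root header.
    grammar = re.sub(r"\$" + re.escape(old) + r"\b", lambda m: "$" + new, grammar)

    # 2. References of the form rules.old, but only inside semantic tags.
    tag_ref = re.compile(r"\brules\." + re.escape(old) + r"\b")

    def fix_tag(match: re.Match) -> str:
        return tag_ref.sub(lambda m: "rules." + new, match.group(0))

    # Assumes tags contain no nested braces.
    return re.sub(r"\{.*?\}", fix_tag, grammar, flags=re.S)

GRAMMAR = """#ABNF 1.0;
language en-US;
tag-format <semantics/1.0>;
root $trip;

public $trip = to $city { out.dest = rules.city; };
$city = Boston { out = "BOS"; } | Montreal { out = "YUL"; };
"""

renamed = rename_rule(GRAMMAR, "city", "destination")
print(renamed)
```

Forgetting step 2 is exactly the mistake described in the rule renaming item: the grammar still parses, but `rules.city` no longer refers to anything.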
Challenges
SRGS grammars pose a number of important challenges for refactoring tools:
- They combine two different languages, namely the SRGS language itself for expressing the valid sequences of words, and the semantic tag language. These two languages have very different semantics.
- The most common semantic tag languages are based on ECMAScript, a highly dynamic scripting language. The refactoring tools must thus understand the ECMAScript language and its various constructs to properly do their job.
- The semantic tag language can vary from one ASR engine to the other.
Refactoring support in NuGram IDE
The refactorings described above are all supported by NuGram IDE. Moreover, they are aware of the semantic tag language declared by the grammar - they behave differently depending on whether the tag-format header is semantics/1.0 or swi-semantics/1.0 (the Nuance tag format is not yet supported). This, BTW, is the kind of thing that cannot be done by a generic XML editor.
To rename a rule, put the cursor on a rule name (the definition or a reference), and press Alt-Shift-R. You should see something like:

As you can see, all the references that must be changed at once are surrounded by a gray rectangle, even in the semantic tags.
To rename a semantic slot, put the cursor on a reference to the slot and press the same key sequence (Alt-Shift-R):

All the definitions and references will be modified at once when you change the slot’s name (here the semantic tags are in the swi-semantics/1.0 tag format). Note that all the references to the slot will be changed in the other rules as well, not only in the defining rule.
Finally, to extract an expansion in a new rule, simply select the expansion:

and type Alt-Shift-T:

You see that a new private rule has been created (the default visibility for newly created rules can be configured in the preferences), and a new tag has also been created to propagate the slots returned by the new rule to the calling rule.
These were very simple examples. Consider this (somewhat contrived) rule:

If I want to rename the $digit local rule, should the tool also rename the rules.digit property? That’s not clear. If the rule $<special.abnf#digit> is matched, rules.digit will contain the semantic value returned by that rule. Otherwise, it will contain the semantic value returned by the last match to $digit. There is an ambiguity here. The same identifier may refer to two different things.
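The situation just described can be detected mechanically. The following Python sketch is only an illustration, not NuGram IDE's implementation: the `rename_is_ambiguous` helper and the sample rules are hypothetical, and it assumes the semantics/1.0 convention of `rules.name` in tags.

```python
import re

def rename_is_ambiguous(rule_text: str, name: str) -> bool:
    """True when rules.<name> in a tag may denote either a local rule or an
    external rule with the same name (hypothetical helper, regex-based)."""
    n = re.escape(name)
    local_ref = re.search(r"\$" + n + r"\b", rule_text)            # $digit
    external_ref = re.search(r"\$<[^>]*#" + n + r">", rule_text)   # $<uri#digit>
    tag_use = re.search(r"\brules\." + n + r"\b", rule_text)       # rules.digit
    return bool(local_ref and external_ref and tag_use)

# Made-up rules: the first mixes a local and an external 'digit' rule.
mixed = "$code = ($digit | $<special.abnf#digit>) { out = rules.digit; };"
local_only = "$code = $digit { out = rules.digit; };"

print(rename_is_ambiguous(mixed, "digit"))
print(rename_is_ambiguous(local_only, "digit"))
```

When the check reports an ambiguity, a tool cannot safely decide on its own whether `rules.digit` should follow the rename, so asking the user is the conservative choice.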
Fortunately, if I try to rename the $digit rule using NuGram IDE, it won’t blindly attempt to rename the slot. It will instead pop up the following dialog:

Of course, in practice grammars are rarely that hairy and complex. But refactoring tools must be correct 100% of the time. Otherwise, people would not use them for fear of breaking their programs or grammars.
Finally, note that NuGram IDE's refactoring tools are available not only for plain ABNF grammars but also for the dynamic extensions. It is possible to rename variables, rename macros, and extract macros.
If you think of other repetitive grammar-related tasks that could be automated that way, please let us know. We strongly believe in powerful tools that help make applications more robust!