NLPUtils Morphological Parser
Irregular Forms XML Specification

Cody Boisclair, November–December 2007


The NLPUtils Morphological Parser lets you define additional irregular forms in an XML file, whose path is given by MorphParser.IrregularsFilePath.

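Here is a brief summary of the format expected in that XML file. The root element, IrregularForms, holds one token element per irregular word form, with the surface spelling in its spell attribute. Each token holds one parse element for each possible analysis of that form, and each parse lists the morphemes of that analysis as morph elements with spell, cat, and type attributes. In the example below, type is "stem" for a base form and "feature" for an inflectional marker such as the plural "s".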


An example XML document that defines the irregular noun plurals "otaku" and "paparazzi":

<IrregularForms>
    <token spell="otaku">
        <parse>
            <morph spell="otaku" cat="noun" type="stem" />
        </parse>
        <parse>
            <morph spell="otaku" cat="noun" type="stem" />
            <morph spell="s" cat="noun" type="feature" />
        </parse>
    </token>
    <token spell="paparazzi">
        <parse>
            <morph spell="paparazzo" cat="noun" type="stem" />
            <morph spell="s" cat="noun" type="feature" />
        </parse>
    </token>
</IrregularForms>
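
To make the nesting of token, parse, and morph concrete, here is a minimal sketch in Python that reads the example above into a dictionary. It is not part of NLPUtils and says nothing about how the parser itself loads the file; the function name load_irregulars and the tuple layout are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# The example document from above, inlined so the sketch is self-contained.
EXAMPLE = """
<IrregularForms>
    <token spell="otaku">
        <parse>
            <morph spell="otaku" cat="noun" type="stem" />
        </parse>
        <parse>
            <morph spell="otaku" cat="noun" type="stem" />
            <morph spell="s" cat="noun" type="feature" />
        </parse>
    </token>
    <token spell="paparazzi">
        <parse>
            <morph spell="paparazzo" cat="noun" type="stem" />
            <morph spell="s" cat="noun" type="feature" />
        </parse>
    </token>
</IrregularForms>
"""


def load_irregulars(xml_text):
    """Hypothetical helper: map each token spelling to its list of parses.

    Each parse is a list of (spell, cat, type) tuples, one per morph element.
    """
    root = ET.fromstring(xml_text.strip())
    if root.tag != "IrregularForms":
        raise ValueError("expected root element IrregularForms, got " + root.tag)

    forms = {}
    for token in root.findall("token"):
        parses = []
        for parse in token.findall("parse"):
            morphs = [
                (m.get("spell"), m.get("cat"), m.get("type"))
                for m in parse.findall("morph")
            ]
            parses.append(morphs)
        # Merge parses if the same spelling appears in more than one token.
        forms.setdefault(token.get("spell"), []).extend(parses)
    return forms


forms = load_irregulars(EXAMPLE)
for spelling, parses in forms.items():
    for parse in parses:
        analysis = " + ".join(f"{s} ({c}, {t})" for s, c, t in parse)
        print(f"{spelling}: {analysis}")
```

Read this way, "otaku" has two analyses (the bare stem, and the stem plus the plural feature), while "paparazzi" has one (the stem "paparazzo" plus the plural feature).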

If you have any comments, suggestions, or potential improvements, don't hesitate to drop me a line at codemanb@uga.edu.