rita.support
Class PlingStemmer

java.lang.Object
  extended by rita.support.PlingStemmer
All Implemented Interfaces:
RiStemmerIF

public class PlingStemmer
extends java.lang.Object
implements RiStemmerIF

This class is closely based on the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools). It is licensed under the Creative Commons Attribution License (see http://creativecommons.org/licenses/by/3.0).

Stems an English noun (plural or singular) to its singular form, including irregular forms, e.g., "firemen" -> "fireman" and "appendices" -> "appendix".

Some word forms can be either plural or singular. Examples include "physics" (the science, or the plural of "physic", the medicine), "quarters" (the housing, or the plural of "quarter", 1/4), and "people" (the singular of "peoples", or the plural of "person"). In these cases, the stemmer assumes the word is a plural form and returns the singular form. The methods isPlural, isSingular and isSingularAndPlural can be used to differentiate the cases.
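
The "assume plural, return singular" behavior can be illustrated with a minimal sketch. This is not the RiTa implementation: the class name TinyNounStemmer, its two-entry irregular map, and the naive trailing-"s" fallback are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for illustration; not the RiTa class.
class TinyNounStemmer {
    // Irregular plural -> singular (illustrative entries only).
    static final Map<String, String> IRREGULAR = new HashMap<>();

    static {
        IRREGULAR.put("firemen", "fireman");
        IRREGULAR.put("people", "person");
    }

    // Treats every form as a potential plural and returns a singular form.
    static String stem(String s) {
        String singular = IRREGULAR.get(s);
        if (singular != null) {
            return singular;
        }
        // Naive fallback (assumption): strip one trailing "s", but not from "-ss".
        if (s.length() > 2 && s.endsWith("s") && !s.endsWith("ss")) {
            return s.substring(0, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(stem("firemen"));  // fireman
        System.out.println(stem("people"));   // person
        System.out.println(stem("quarters")); // quarter
        System.out.println(stem("physics"));  // physic
        System.out.println(stem("glass"));    // glass
    }
}
```

Note that the ambiguous forms "quarters" and "physics" are reduced to "quarter" and "physic" even though each can also be read as a singular noun, which is the trade-off the description above points out.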

The PlingStemmer uses material from WordNet.


Field Summary
static java.util.Set<java.lang.String> category00
          Words that do not have a distinct plural form (like "atlas" etc.)
static java.util.Set<java.lang.String> categoryCHE_CHES
          Words that change from "-che" to "-ches" (like "brioche" etc.), listed in their plural forms
static java.util.Set<java.lang.String> categoryEX_ICES
          Words that change from "-ex" to "-ices" (like "index" etc.), listed in their plural forms
static java.util.Set<java.lang.String> categoryICS
          Words that end with "-ics" and do not exist as nouns without the 's' (like "aerobics" etc.)
static java.util.Set<java.lang.String> categoryIE_IES
          Words that change from "-ie" to "-ies" (like "auntie" etc.), listed in their plural forms
static java.util.Set<java.lang.String> categoryIS_ES
          Words that change from "-is" to "-es" (like "axis" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).
static java.util.Set<java.lang.String> categoryIX_ICES
          Words that change from "-ix" to "-ices" (like "appendix" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).
static java.util.Set<java.lang.String> categoryO_I
          Words that change from "-o" to "-i" (like "libretto" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).
static java.util.Set<java.lang.String> categoryOE_OES
          Words that change from "-oe" to "-oes" (like "toe" etc.), listed in their plural forms
static java.util.Set<java.lang.String> categoryON_A
          Words that change from "-on" to "-a" (like "phenomenon" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).
static java.util.Set<java.lang.String> categorySE_SES
          Words that end in "-se" in their plural forms (like "nurse" etc.). Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).
static java.util.Set<java.lang.String> categorySSE_SSES
          Words that change from "-sse" to "-sses" (like "finesse" etc.), listed in their plural forms
static java.util.Set<java.lang.String> categoryU_US
          Words that change from "-u" to "-us" (like "emu" etc.), listed in their plural forms
static java.util.Set<java.lang.String> categoryUM_A
          Words that change from "-um" to "-a" (like "curriculum" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).
static java.util.Set<java.lang.String> categoryUS_I
          Words that change from "-us" to "-i" (like "fungus" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).
static java.util.Map<java.lang.String,java.lang.String> irregular
          Maps irregular Germanic English plural nouns to their singular form.
 
Constructor Summary
PlingStemmer()
           
 
Method Summary
 boolean isPlural(java.lang.String s)
          Tells whether a noun is plural.
 boolean isSingular(java.lang.String s)
          Tells whether a word form is singular.
 boolean isSingularAndPlural(java.lang.String s)
          Tells whether a word form is the singular form of one word and at the same time the plural form of another.
static void main(java.lang.String[] argv)
          Test routine
 java.lang.String stem(java.lang.String s)
          Stems an English noun.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

categorySE_SES

public static java.util.Set<java.lang.String> categorySE_SES
Words that end in "-se" in their plural forms (like "nurse" etc.). Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).

Invisible:

category00

public static java.util.Set<java.lang.String> category00
Words that do not have a distinct plural form (like "atlas" etc.)


categoryUM_A

public static java.util.Set<java.lang.String> categoryUM_A
Words that change from "-um" to "-a" (like "curriculum" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).

Invisible:

categoryON_A

public static java.util.Set<java.lang.String> categoryON_A
Words that change from "-on" to "-a" (like "phenomenon" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).

Invisible:

categoryO_I

public static java.util.Set<java.lang.String> categoryO_I
Words that change from "-o" to "-i" (like "libretto" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).

Invisible:

categoryUS_I

public static java.util.Set<java.lang.String> categoryUS_I
Words that change from "-us" to "-i" (like "fungus" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).

Invisible:

categoryIX_ICES

public static java.util.Set<java.lang.String> categoryIX_ICES
Words that change from "-ix" to "-ices" (like "appendix" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).

Invisible:

categoryIS_ES

public static java.util.Set<java.lang.String> categoryIS_ES
Words that change from "-is" to "-es" (like "axis" etc.), listed in their plural forms. Borrowed from the PlingStemmer implementation included in the Java Tools package (see http://mpii.de/yago-naga/javatools).

Invisible:

categoryOE_OES

public static java.util.Set<java.lang.String> categoryOE_OES
Words that change from "-oe" to "-oes" (like "toe" etc.), listed in their plural forms


categoryEX_ICES

public static java.util.Set<java.lang.String> categoryEX_ICES
Words that change from "-ex" to "-ices" (like "index" etc.), listed in their plural forms


categoryU_US

public static java.util.Set<java.lang.String> categoryU_US
Words that change from "-u" to "-us" (like "emu" etc.), listed in their plural forms


categorySSE_SSES

public static java.util.Set<java.lang.String> categorySSE_SSES
Words that change from "-sse" to "-sses" (like "finesse" etc.), listed in their plural forms


categoryCHE_CHES

public static java.util.Set<java.lang.String> categoryCHE_CHES
Words that change from "-che" to "-ches" (like "brioche" etc.), listed in their plural forms


categoryICS

public static java.util.Set<java.lang.String> categoryICS
Words that end with "-ics" and do not exist as nouns without the 's' (like "aerobics" etc.)


categoryIE_IES

public static java.util.Set<java.lang.String> categoryIE_IES
Words that change from "-ie" to "-ies" (like "auntie" etc.), listed in their plural forms


irregular

public static java.util.Map<java.lang.String,java.lang.String> irregular
Maps irregular Germanic English plural nouns to their singular form.
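
The category sets documented above store plural forms, so a lookup can decide which singular ending to restore. The following is a rough sketch of that idea, not the actual stemming logic: the class CategoryRewriteSketch, the helper swapSuffix, and the one-word sets are hypothetical and stand in for the much larger public fields of this class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of plural-form category sets; not the RiTa code.
class CategoryRewriteSketch {
    // Tiny stand-ins for the documented sets (plural forms only).
    static final Set<String> UM_A = new HashSet<>(Arrays.asList("curricula"));
    static final Set<String> ON_A = new HashSet<>(Arrays.asList("phenomena"));
    static final Set<String> US_I = new HashSet<>(Arrays.asList("fungi"));
    static final Set<String> IX_ICES = new HashSet<>(Arrays.asList("appendices"));
    static final Set<String> EX_ICES = new HashSet<>(Arrays.asList("indices"));

    // Replaces pluralSuffix at the end of word with singularSuffix.
    static String swapSuffix(String word, String pluralSuffix, String singularSuffix) {
        if (!word.endsWith(pluralSuffix)) {
            return word;
        }
        return word.substring(0, word.length() - pluralSuffix.length()) + singularSuffix;
    }

    // Picks the rewrite rule according to the set that contains the plural form.
    static String singularize(String s) {
        if (UM_A.contains(s)) return swapSuffix(s, "a", "um");
        if (ON_A.contains(s)) return swapSuffix(s, "a", "on");
        if (US_I.contains(s)) return swapSuffix(s, "i", "us");
        if (IX_ICES.contains(s)) return swapSuffix(s, "ices", "ix");
        if (EX_ICES.contains(s)) return swapSuffix(s, "ices", "ex");
        return s; // unknown form: returned unchanged in this sketch
    }

    public static void main(String[] args) {
        System.out.println(singularize("curricula"));  // curriculum
        System.out.println(singularize("phenomena"));  // phenomenon
        System.out.println(singularize("fungi"));      // fungus
        System.out.println(singularize("appendices")); // appendix
        System.out.println(singularize("indices"));    // index
    }
}
```

Separate sets are needed because the plural ending alone is ambiguous: "-a" may come from "-um" or "-on", and "-ices" may come from "-ix" or "-ex".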

Constructor Detail

PlingStemmer

public PlingStemmer()
Method Detail

isPlural

public boolean isPlural(java.lang.String s)
Tells whether a noun is plural.


isSingular

public boolean isSingular(java.lang.String s)
Tells whether a word form is singular. Note that a word can be both plural and singular.


isSingularAndPlural

public boolean isSingularAndPlural(java.lang.String s)
Tells whether a word form is the singular form of one word and at the same time the plural form of another.
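
One plausible way the three predicates could relate to each other is sketched below. The definitions are assumptions, not taken from the RiTa source: the class NumberPredicatesSketch, its toy plural map, and the BOTH set of ambiguous forms are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of how the predicates might fit together; not the RiTa code.
class NumberPredicatesSketch {
    // Toy plural -> singular lexicon (illustrative entries only).
    static final Map<String, String> PLURAL_TO_SINGULAR = new HashMap<>();
    // Forms that are singular for one word and plural for another (assumed list).
    static final Set<String> BOTH =
            new HashSet<>(Arrays.asList("physics", "quarters", "people"));

    static {
        PLURAL_TO_SINGULAR.put("dogs", "dog");
        PLURAL_TO_SINGULAR.put("quarters", "quarter");
        PLURAL_TO_SINGULAR.put("people", "person");
        PLURAL_TO_SINGULAR.put("physics", "physic");
    }

    // Stand-in stemmer: unknown forms are returned unchanged.
    static String stem(String s) {
        return PLURAL_TO_SINGULAR.getOrDefault(s, s);
    }

    // Assumption: a form is plural if stemming changes it.
    static boolean isPlural(String s) {
        return !s.equals(stem(s));
    }

    static boolean isSingularAndPlural(String s) {
        return BOTH.contains(s);
    }

    // Assumption: singular if ambiguous, or if not plural at all.
    static boolean isSingular(String s) {
        return isSingularAndPlural(s) || !isPlural(s);
    }

    public static void main(String[] args) {
        for (String w : new String[] {"dogs", "dog", "people"}) {
            System.out.println(w + ": plural=" + isPlural(w)
                    + " singular=" + isSingular(w)
                    + " both=" + isSingularAndPlural(w));
        }
    }
}
```

Under these assumed definitions, "people" is reported as plural, singular, and both, while "dogs" is plural only and "dog" is singular only.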


stem

public java.lang.String stem(java.lang.String s)
Stems an English noun.

Specified by:
stem in interface RiStemmerIF

main

public static void main(java.lang.String[] argv)
                 throws java.lang.Exception
Test routine

Throws:
java.lang.Exception