rita
Class RiStemmer

java.lang.Object
  extended by rita.RiObject
      extended by rita.RiStemmer
All Implemented Interfaces:
processing.core.PConstants, RiStemmerIF, RiConstants

public class RiStemmer
extends RiObject
implements RiStemmerIF

A simple set of stemmers for extracting base roots from a word by removing prefixes and suffixes. For example, the words 'run', 'runs', and 'running' all have "run" as their stem.

    String[] tests = { "run", "runs", "running" };
    RiStemmer stem = new RiStemmer(this);
    for (int i = 0; i < tests.length; i++)
      System.out.println(stem.stem(tests[i]));
 
This class provides a number of implementations, each specified by a type constant.
For example, to use the Lancaster (Paice-Husk) algorithm instead of the default Porter algorithm, create the stemmer as follows:
 RiStemmer stem = new RiStemmer(this, LANCASTER_STEMMER);
 
For a comparison of the various algorithms, see
http://www.comp.lancs.ac.uk/computing/research/stemming/Links/algorithms.htm


Field Summary
static int LANCASTER_STEMMER
          Type constant for Lancaster stemmer
static int LOVINS_STEMMER
          Type constant for Lovins stemmer
static int PLING_STEMMER
          Type constant for Pling stemmer
static int PORTER_STEMMER
          Type constant for Porter stemmer
 
Fields inherited from interface rita.support.RiConstants
BEHAVIOR_COMPLETED, BOUNDING_BOX_ALPHA, BRILL_POS_TAGGER, EASE_IN, EASE_IN_CUBIC, EASE_IN_EXPO, EASE_IN_OUT, EASE_IN_OUT_CUBIC, EASE_IN_OUT_EXPO, EASE_IN_OUT_QUARTIC, EASE_IN_OUT_SINE, EASE_IN_QUARTIC, EASE_IN_SINE, EASE_OUT, EASE_OUT_CUBIC, EASE_OUT_EXPO, EASE_OUT_QUARTIC, EASE_OUT_SINE, ESS, FADE_COLOR, FADE_IN, FADE_OUT, FADE_TO_TEXT, FIRST_PERSON, FUTURE_TENSE, ID, LERP, LINEAR, MAXENT_POS_TAGGER, MINIM, MOVE, MUTABLE, PAST_TENSE, PHONEME_BOUNDARY, PHONEMES, PLURAL, POS, PRESENT_TENSE, SCALE_TO, SECOND_PERSON, SENTENCE_BOUNDARY, SINGULAR, SONIA, SPEECH_COMPLETED, STRESSES, SYLLABLE_BOUNDARY, SYLLABLES, TEXT, TEXT_ENTERED, THIRD_PERSON, TIMER, TIMER_COMPLETED, TIMER_TICK, TOKENS, UNKNOWN, WORD_BOUNDARY
 
Fields inherited from interface processing.core.PConstants
A, AB, ADD, AG, ALPHA, ALPHA_MASK, ALT, AMBIENT, AR, ARC, ARGB, ARROW, B, BACKSPACE, BASELINE, BEEN_LIT, BEVEL, BLEND, BLUE_MASK, BLUR, BOTTOM, BOX, BURN, CENTER, CENTER_DIAMETER, CENTER_RADIUS, CHATTER, CLOSE, CMYK, CODED, COMPLAINT, CONTROL, CORNER, CORNERS, CROSS, CUSTOM, DA, DARKEST, DB, DEG_TO_RAD, DELETE, DG, DIAMETER, DIFFERENCE, DILATE, DIRECTIONAL, DISABLE_ACCURATE_TEXTURES, DISABLE_DEPTH_SORT, DISABLE_DEPTH_TEST, DISABLE_OPENGL_2X_SMOOTH, DISABLE_OPENGL_ERROR_REPORT, DODGE, DOWN, DR, DXF, EB, EDGE, EG, ELLIPSE, ENABLE_ACCURATE_TEXTURES, ENABLE_DEPTH_SORT, ENABLE_DEPTH_TEST, ENABLE_NATIVE_FONTS, ENABLE_OPENGL_2X_SMOOTH, ENABLE_OPENGL_4X_SMOOTH, ENABLE_OPENGL_ERROR_REPORT, ENTER, EPSILON, ER, ERODE, ERROR_BACKGROUND_IMAGE_FORMAT, ERROR_BACKGROUND_IMAGE_SIZE, ERROR_PUSHMATRIX_OVERFLOW, ERROR_PUSHMATRIX_UNDERFLOW, ERROR_TEXTFONT_NULL_PFONT, ESC, EXCLUSION, G, GIF, GRAY, GREEN_MASK, HALF_PI, HAND, HARD_LIGHT, HINT_COUNT, HSB, IMAGE, INVERT, JAVA2D, JPEG, LEFT, LIGHTEST, LINE, LINES, LINUX, MACOSX, MAX_FLOAT, MAX_INT, MIN_FLOAT, MIN_INT, MITER, MODEL, MULTIPLY, NORMAL, NORMALIZED, NX, NY, NZ, OPAQUE, OPEN, OPENGL, ORTHOGRAPHIC, OTHER, OVERLAY, P2D, P3D, PATH, PDF, PERSPECTIVE, PI, platformNames, POINT, POINTS, POLYGON, POSTERIZE, PROBLEM, PROJECT, QUAD, QUAD_STRIP, QUADS, QUARTER_PI, R, RAD_TO_DEG, RADIUS, RECT, RED_MASK, REPLACE, RETURN, RGB, RIGHT, ROUND, SA, SB, SCREEN, SG, SHAPE, SHIFT, SHINE, SOFT_LIGHT, SPB, SPG, SPHERE, SPOT, SPR, SQUARE, SR, SUBTRACT, SW, TAB, TARGA, THIRD_PI, THRESHOLD, TIFF, TOP, TRIANGLE, TRIANGLE_FAN, TRIANGLE_STRIP, TRIANGLES, TWO_PI, TX, TY, TZ, U, UP, V, VERTEX_FIELD_COUNT, VW, VX, VY, VZ, WAIT, WHITESPACE, WINDOWS, X, Y, Z
 
Constructor Summary
RiStemmer()
           
RiStemmer(processing.core.PApplet p)
          Creates a default stemmer
RiStemmer(processing.core.PApplet p, int stemmerType)
           
 
Method Summary
 RiStemmerIF getStemmer()
          Returns the concrete stemmer (delegate) object that actually does the work
static void main(java.lang.String[] args)
           
 java.lang.String stem(java.lang.String word)
          Extracts base roots from a word by lower-casing it, then removing prefixes and suffixes.
 
Methods inherited from class rita.RiObject
dispose, getId, getPApplet, nextId
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PLING_STEMMER

public static final int PLING_STEMMER
Type constant for Pling stemmer

See Also:
Constant Field Values

PORTER_STEMMER

public static final int PORTER_STEMMER
Type constant for Porter stemmer

See Also:
Constant Field Values

LANCASTER_STEMMER

public static final int LANCASTER_STEMMER
Type constant for Lancaster stemmer

See Also:
Constant Field Values

LOVINS_STEMMER

public static final int LOVINS_STEMMER
Type constant for Lovins stemmer

See Also:
Constant Field Values
Constructor Detail

RiStemmer

public RiStemmer()

RiStemmer

public RiStemmer(processing.core.PApplet p)
Creates a default stemmer


RiStemmer

public RiStemmer(processing.core.PApplet p,
                 int stemmerType)
Method Detail

getStemmer

public RiStemmerIF getStemmer()
Returns the concrete stemmer (delegate) object that actually does the work
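
The delegate arrangement described here can be pictured with a minimal, self-contained sketch. It is not RiTa's source: the names SketchStemmerIF, SketchStemmer, LOWERCASE_ONLY, and STRIP_PLURAL_S are hypothetical, and the two toy delegates stand in for real algorithms such as Porter or Lancaster. It only illustrates how a type constant might select a concrete stemmer once, at construction, with stem() forwarding to it.

```java
import java.util.Locale;

// Hypothetical stand-in for a stemmer interface such as RiStemmerIF.
interface SketchStemmerIF {
    String stem(String word);
}

// Hypothetical facade: a type constant chooses the delegate at construction time.
final class SketchStemmer implements SketchStemmerIF {

    // Hypothetical type constants; these are not RiTa's constants.
    static final int LOWERCASE_ONLY = 0;
    static final int STRIP_PLURAL_S = 1;

    private final SketchStemmerIF delegate;

    // Default constructor falls back to the default type.
    SketchStemmer() {
        this(LOWERCASE_ONLY);
    }

    SketchStemmer(int stemmerType) {
        this.delegate = create(stemmerType);
    }

    // Maps a type constant to a concrete (toy) implementation.
    private static SketchStemmerIF create(int stemmerType) {
        switch (stemmerType) {
            case LOWERCASE_ONLY:
                return w -> w.toLowerCase(Locale.ROOT);
            case STRIP_PLURAL_S:
                return w -> {
                    String lower = w.toLowerCase(Locale.ROOT);
                    boolean strip = lower.length() > 3 && lower.endsWith("s");
                    return strip ? lower.substring(0, lower.length() - 1) : lower;
                };
            default:
                throw new IllegalArgumentException("Unknown stemmer type: " + stemmerType);
        }
    }

    // Exposes the concrete delegate that does the work.
    SketchStemmerIF getStemmer() {
        return delegate;
    }

    // Forwards to the delegate.
    @Override
    public String stem(String word) {
        return delegate.stem(word);
    }
}
```

In this sketch, calling stem() on the facade and calling stem() on the object returned by getStemmer() give the same result, since the facade does nothing but forward the call.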


stem

public java.lang.String stem(java.lang.String word)
Extracts base roots from a word by lower-casing it, then removing prefixes and suffixes. For example, the words 'run', 'runs', and 'running' all have "run" as their stem.

Specified by:
stem in interface RiStemmerIF
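
The general idea of lower-casing a word and then removing a suffix can be illustrated with a deliberately tiny sketch. This is an assumption-laden toy, not the Porter, Lancaster, Lovins, or Pling algorithm, and not what RiStemmer does internally: the class name ToySuffixStemmer, its suffix list, and its minimum stem length are all hypothetical. It handles no prefixes and no irregular forms.

```java
import java.util.Locale;

// Hypothetical toy stemmer for illustration only; real stemmers use far richer rules.
final class ToySuffixStemmer {

    // Suffixes are tried in this order, so "ing" and "es" are checked before "s".
    private static final String[] SUFFIXES = { "ing", "es", "s" };

    // A suffix is removed only if at least this many characters remain.
    private static final int MIN_STEM_LENGTH = 3;

    private ToySuffixStemmer() {
    }

    static String stem(String word) {
        // Step 1: lower-case the input.
        String w = word.toLowerCase(Locale.ROOT);

        // Step 2: strip the first matching suffix, if enough of the word remains.
        for (String suffix : SUFFIXES) {
            int stemLength = w.length() - suffix.length();
            if (w.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                w = w.substring(0, stemLength);
                if (suffix.equals("ing")) {
                    // "running" -> "runn" -> "run"
                    w = undouble(w);
                }
                break;
            }
        }
        return w;
    }

    // Drops one letter from a doubled final consonant.
    private static String undouble(String w) {
        int n = w.length();
        boolean doubled = n >= 2 && w.charAt(n - 1) == w.charAt(n - 2);
        if (doubled && "aeiou".indexOf(w.charAt(n - 1)) < 0) {
            return w.substring(0, n - 1);
        }
        return w;
    }
}
```

With these toy rules, ToySuffixStemmer.stem("run"), stem("runs"), and stem("Running") all return "run". Words that change their spelling rather than take a suffix are left as they are.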

main

public static void main(java.lang.String[] args)