rita
Class RiParser

java.lang.Object
  extended by rita.RiObject
      extended by rita.RiParser
All Implemented Interfaces:
processing.core.PConstants, RiParserIF, RiConstants

public class RiParser
extends RiObject
implements RiParserIF

Tree-based parser for recursive syntactic annotations, e.g., noun-phrases, using the Penn conventions.
An example:

     String s = "The black cat crossed my path.";
     RiParser parser = new RiParser();
     String result = parser.parse(s);
     System.out.println(result);
Note: to use this object, first download the RiTa statistical models (rita.me.models.zip) and unpack them into the 'rita' directory inside the libraries directory of your Processing sketchbook, e.g., $SKETCH_PAD/libraries/rita/models. You may also specify an alternative directory (an absolute path) for the models via RiTa.setModelDir().

This object is most useful when used with the RiTaServer, as loading the necessary statistical models can take significant time.

    RiTa.useServer(portNumber);
    RiTa.setModelDir("/models");
    String s = "The black cat crossed my path.";
    RiParser rp = new RiParser();     
    System.out.println(rp.parse(s));
This class is primarily a wrapper for the OpenNLP (http://opennlp.sourceforge.net) parser, with some minor modifications and simplifications.

For more info, see Berger & Della Pietra's paper 'A Maximum
Entropy Approach to Natural Language Processing', which
provides a good introduction to the maxent framework.

The full tag set follows:

See Also:
RiTaServer

Field Summary
 
Fields inherited from interface rita.support.RiConstants
BEHAVIOR_COMPLETED, BOUNDING_BOX_ALPHA, BRILL_POS_TAGGER, EASE_IN, EASE_IN_CUBIC, EASE_IN_EXPO, EASE_IN_OUT, EASE_IN_OUT_CUBIC, EASE_IN_OUT_EXPO, EASE_IN_OUT_QUARTIC, EASE_IN_OUT_SINE, EASE_IN_QUARTIC, EASE_IN_SINE, EASE_OUT, EASE_OUT_CUBIC, EASE_OUT_EXPO, EASE_OUT_QUARTIC, EASE_OUT_SINE, ESS, FADE_COLOR, FADE_IN, FADE_OUT, FADE_TO_TEXT, FIRST_PERSON, FUTURE_TENSE, ID, LERP, LINEAR, MAXENT_POS_TAGGER, MINIM, MOVE, MUTABLE, PAST_TENSE, PHONEME_BOUNDARY, PHONEMES, PLING_STEMMER, PLURAL, PORTER_STEMMER, POS, PRESENT_TENSE, SCALE_TO, SECOND_PERSON, SENTENCE_BOUNDARY, SINGULAR, SONIA, SPEECH_COMPLETED, STRESSES, SYLLABLE_BOUNDARY, SYLLABLES, TEXT, TEXT_ENTERED, THIRD_PERSON, TIMER, TIMER_COMPLETED, TIMER_TICK, TOKENS, UNKNOWN, WORD_BOUNDARY
 
Fields inherited from interface processing.core.PConstants
A, AB, ADD, AG, ALPHA, ALPHA_MASK, ALT, AMBIENT, AR, ARC, ARGB, ARROW, B, BACKSPACE, BASELINE, BEEN_LIT, BEVEL, BLEND, BLUE_MASK, BLUR, BOTTOM, BOX, BURN, CENTER, CENTER_DIAMETER, CENTER_RADIUS, CHATTER, CLOSE, CMYK, CODED, COMPLAINT, CONTROL, CORNER, CORNERS, CROSS, CUSTOM, DA, DARKEST, DB, DEG_TO_RAD, DELETE, DG, DIAMETER, DIFFERENCE, DILATE, DIRECTIONAL, DISABLE_ACCURATE_TEXTURES, DISABLE_DEPTH_SORT, DISABLE_DEPTH_TEST, DISABLE_OPENGL_2X_SMOOTH, DISABLE_OPENGL_ERROR_REPORT, DODGE, DOWN, DR, DXF, EB, EDGE, EG, ELLIPSE, ENABLE_ACCURATE_TEXTURES, ENABLE_DEPTH_SORT, ENABLE_DEPTH_TEST, ENABLE_NATIVE_FONTS, ENABLE_OPENGL_2X_SMOOTH, ENABLE_OPENGL_4X_SMOOTH, ENABLE_OPENGL_ERROR_REPORT, ENTER, EPSILON, ER, ERODE, ERROR_BACKGROUND_IMAGE_FORMAT, ERROR_BACKGROUND_IMAGE_SIZE, ERROR_PUSHMATRIX_OVERFLOW, ERROR_PUSHMATRIX_UNDERFLOW, ERROR_TEXTFONT_NULL_PFONT, ESC, EXCLUSION, G, GIF, GRAY, GREEN_MASK, HALF_PI, HAND, HARD_LIGHT, HINT_COUNT, HSB, IMAGE, INVERT, JAVA2D, JPEG, LEFT, LIGHTEST, LINE, LINES, LINUX, MACOSX, MAX_FLOAT, MAX_INT, MIN_FLOAT, MIN_INT, MITER, MODEL, MULTIPLY, NORMAL, NORMALIZED, NX, NY, NZ, OPAQUE, OPEN, OPENGL, ORTHOGRAPHIC, OTHER, OVERLAY, P2D, P3D, PATH, PDF, PERSPECTIVE, PI, platformNames, POINT, POINTS, POLYGON, POSTERIZE, PROBLEM, PROJECT, QUAD, QUAD_STRIP, QUADS, QUARTER_PI, R, RAD_TO_DEG, RADIUS, RECT, RED_MASK, REPLACE, RETURN, RGB, RIGHT, ROUND, SA, SB, SCREEN, SG, SHAPE, SHIFT, SHINE, SOFT_LIGHT, SPB, SPG, SPHERE, SPOT, SPR, SQUARE, SR, SUBTRACT, SW, TAB, TARGA, THIRD_PI, THRESHOLD, TIFF, TOP, TRIANGLE, TRIANGLE_FAN, TRIANGLE_STRIP, TRIANGLES, TWO_PI, TX, TY, TZ, U, UP, V, VERTEX_FIELD_COUNT, VW, VX, VY, VZ, WAIT, WHITESPACE, WINDOWS, X, Y, Z
 
Constructor Summary
RiParser()
          Note: when using this constructor, the Processing sketchbook directory will NOT be checked for models.
RiParser(processing.core.PApplet pApplet)
           
 
Method Summary
static void main(java.lang.String[] args)
           
 java.lang.String parse(java.lang.String text)
          Returns the String of the most probable parse using Penn Treebank-style formatting.
 java.lang.String parse(java.lang.String text, boolean indent)
          Returns the String of the most probable parse using Penn Treebank-style formatting, with or without indentation depending on the indent parameter.
 
Methods inherited from class rita.RiObject
dispose, getId, getPApplet, nextId
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RiParser

public RiParser()
Note: when using this constructor, the Processing sketchbook directory will NOT be checked for models.

See Also:
RiParser(PApplet)

RiParser

public RiParser(processing.core.PApplet pApplet)
Method Detail

parse

public java.lang.String parse(java.lang.String text,
                              boolean indent)
Returns the String of the most probable parse using Penn Treebank-style formatting, with or without indentation depending on the indent parameter.
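
To illustrate what an indented bracketed parse can look like, here is a minimal, library-independent sketch that re-indents a single-line Penn Treebank-style string. The class name BracketIndenter, the two-space indent, and the sample input are assumptions made for illustration; the exact layout that parse(text, true) produces is not specified here and may differ.

```java
// Hypothetical helper, not part of RiTa: re-indents a flat bracketed parse.
class BracketIndenter {

    // Puts every nested constituent on its own line,
    // indented two spaces per nesting level (an assumed layout).
    static String indent(String flat) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (int i = 0; i < flat.length(); i++) {
            char c = flat.charAt(i);
            if (c == '(') {
                if (out.length() > 0) {
                    // Drop trailing spaces before breaking the line.
                    while (out.length() > 0 && out.charAt(out.length() - 1) == ' ') {
                        out.setLength(out.length() - 1);
                    }
                    out.append('\n');
                    for (int d = 0; d < depth; d++) {
                        out.append("  ");
                    }
                }
                out.append(c);
                depth++;
            } else {
                if (c == ')') {
                    depth--;
                }
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-written sample input, not actual RiParser output.
        System.out.println(indent("(S (NP (DT The) (NN cat)) (VP (VBD slept)))"));
    }
}
```

The same helper could be applied to the result of parse(text) if a custom layout is preferred over the built-in indent option.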


parse

public java.lang.String parse(java.lang.String text)
Returns the String of the most probable parse using Penn Treebank-style formatting.

Specified by:
parse in interface RiParserIF
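
Because the result is a plain String in bracketed form, callers typically need to walk it to get at individual constituents such as noun phrases. The following is a minimal sketch of one way to do that without any RiTa dependency. The class PhraseExtractor and its method are hypothetical, the sample parse is hand-written rather than real RiParser output, and the code assumes every opening bracket is followed directly by a label.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of RiTa: pulls phrases out of a bracketed parse.
class PhraseExtractor {

    // Returns the words covered by each constituent labelled `label`,
    // in the order the constituents close (inner phrases before outer ones).
    static List<String> phrases(String parse, String label) {
        List<String> result = new ArrayList<>();
        String[] toks = parse.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        Deque<String> labels = new ArrayDeque<>();        // label of each open constituent
        Deque<StringBuilder> words = new ArrayDeque<>();  // words seen inside each one
        for (int i = 0; i < toks.length; i++) {
            String t = toks[i];
            if (t.equals("(")) {
                String next = i + 1 < toks.length ? toks[i + 1] : "";
                if (next.equals("(") || next.equals(")")) {
                    labels.push("");                      // unlabelled bracket
                } else {
                    labels.push(next);
                    i++;                                  // consume the label token
                }
                words.push(new StringBuilder());
            } else if (t.equals(")")) {
                if (labels.isEmpty()) {
                    continue;                             // ignore unbalanced ')'
                }
                String closed = labels.pop();
                String text = words.pop().toString();
                if (closed.equals(label)) {
                    result.add(text);
                }
                if (!words.isEmpty() && !text.isEmpty()) {
                    append(words.peek(), text);           // pass words up to the parent
                }
            } else if (!words.isEmpty()) {
                append(words.peek(), t);                  // a word inside a constituent
            }
        }
        return result;
    }

    private static void append(StringBuilder sb, String s) {
        if (sb.length() > 0) {
            sb.append(' ');
        }
        sb.append(s);
    }

    public static void main(String[] args) {
        // Hand-written sample in Penn Treebank style, for illustration only.
        String parse = "(S (NP (DT The) (JJ black) (NN cat)) "
                     + "(VP (VBD crossed) (NP (PRP$ my) (NN path))) (. .))";
        System.out.println(phrases(parse, "NP"));
    }
}
```

A stack is used here so that nested phrases are handled in a single left-to-right pass; a recursive descent over the brackets would work equally well.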

main

public static void main(java.lang.String[] args)