RiTa
Name RiParser
Description Tree-based parser for recursive syntactic annotations, e.g., noun-phrases, using the Penn conventions.
An example:
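     // Sentence to be parsed
     String s = "The black cat crossed my path.";
     // Inside a Processing sketch, 'this' is the enclosing PApplet
     RiParser parser = new RiParser(this);
     // Returns the most probable parse in Penn Treebank-style bracketing
     String result = parser.parse(s);
     System.out.println(result);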
Note: to use this object, first download the RiTa statistical models (rita.me.models.zip) and unpack them into the 'rita' directory inside the libraries directory of your Processing sketchbook, e.g., $SKETCH_PAD/libraries/rita/models. You may also specify an alternative directory (as an absolute path) for the models via RiTa.setModelDir().

This object is most useful when used with the RiTaServer, as loading the necessary statistical models can take significant time.

It is primarily a wrapper for the OpenNLP (http://opennlp.sourceforge.net) parser, with some minor modifications and simplifications.

For more info see: Berger & Della Pietra's paper: 'A Maximum
Entropy Approach to Natural Language Processing', which
provides a good introduction to the maxent framework.

The full tag set follows:

  Clause Level
  • S - Simple declarative clause, i.e. one that is not introduced by a (possibly empty) subordinating conjunction or a wh-word and that does not exhibit subject-verb inversion.
  • SBAR - Clause introduced by a (possibly empty) subordinating conjunction.
  • SBARQ - Direct question introduced by a wh-word or a wh-phrase. Indirect questions and relative clauses should be bracketed as SBAR, not SBARQ.
  • SINV - Inverted declarative sentence, i.e. one in which the subject follows the tensed verb or modal.
  • SQ - Inverted yes/no question, or main clause of a wh-question, following the wh-phrase in SBARQ.
  Phrase Level
  • ADJP - Adjective Phrase.
  • ADVP - Adverb Phrase.
  • CONJP - Conjunction Phrase.
  • FRAG - Fragment.
  • INTJ - Interjection. Corresponds approximately to the part-of-speech tag UH.
  • LST - List marker. Includes surrounding punctuation.
  • NAC - Not a Constituent; used to show the scope of certain prenominal modifiers within an NP.
  • NP - Noun Phrase.
  • NX - Used within certain complex NPs to mark the head of the NP. Corresponds very roughly to N-bar level but used quite differently.
  • PP - Prepositional Phrase.
  • PRN - Parenthetical.
  • PRT - Particle. Category for words that should be tagged RP.
  • QP - Quantifier Phrase (i.e. complex measure/amount phrase); used within NP.
  • RRC - Reduced Relative Clause.
  • UCP - Unlike Coordinated Phrase.
  • VP - Verb Phrase.
  • WHADJP - Wh-adjective Phrase. Adjectival phrase containing a wh-adverb, as in how hot.
  • WHADVP - Wh-adverb Phrase. Introduces a clause with an NP gap. May be null (containing the 0 complementizer) or lexical, containing a wh-adverb such as how or why.
  • WHNP - Wh-noun Phrase. Introduces a clause with an NP gap. May be null (containing the 0 complementizer) or lexical, containing some wh-word, e.g. who, which book, whose daughter, none of which, or how many leopards.
  • WHPP - Wh-prepositional Phrase. Prepositional phrase containing a wh-noun phrase (such as of which or by whose authority) that either introduces a PP gap or is contained by a WHNP.
  • X - Unknown, uncertain, or unbracketable.
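
The tags above label nested, bracketed constituents in the parser's output. As a minimal sketch of how such output might be consumed, the following stand-alone Java class collects the text of every constituent carrying a given tag (for example NP). The class name PennTreeSketch, the method phrasesWithTag, and the sample parse string are hypothetical and not part of RiTa; the string is a hand-written example of Penn Treebank-style bracketing, and the exact output of RiParser may differ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of RiTa: reads a Penn Treebank-style
// bracketed string and returns the words covered by each constituent
// that carries the requested tag.
class PennTreeSketch {

    /** One open constituent: its tag and the words seen inside it so far. */
    private static final class Frame {
        final String tag;
        final StringBuilder words = new StringBuilder();
        Frame(String tag) { this.tag = tag; }
    }

    static List<String> phrasesWithTag(String parse, String tag) {
        List<String> found = new ArrayList<>();
        Deque<Frame> open = new ArrayDeque<>();
        int i = 0;
        while (i < parse.length()) {
            char c = parse.charAt(i);
            if (c == '(') {
                // The token right after '(' is the constituent's tag.
                int end = tokenEnd(parse, i + 1);
                open.push(new Frame(parse.substring(i + 1, end)));
                i = end;
            } else if (c == ')') {
                // Closing a constituent: keep its text if the tag matches.
                if (!open.isEmpty()) {
                    Frame done = open.pop();
                    if (done.tag.equals(tag)) found.add(done.words.toString());
                }
                i++;
            } else if (Character.isWhitespace(c)) {
                i++;
            } else {
                // Any other token is a word; it belongs to every open constituent.
                int end = tokenEnd(parse, i);
                String word = parse.substring(i, end);
                for (Frame f : open) {
                    if (f.words.length() > 0) f.words.append(' ');
                    f.words.append(word);
                }
                i = end;
            }
        }
        return found;
    }

    /** Index just past the token starting at 'from' (stops at space or paren). */
    private static int tokenEnd(String s, int from) {
        int j = from;
        while (j < s.length() && s.charAt(j) != '(' && s.charAt(j) != ')'
                && !Character.isWhitespace(s.charAt(j))) {
            j++;
        }
        return j;
    }

    public static void main(String[] args) {
        // Hand-written sample bracketing (an assumption, not actual RiParser output).
        String parse = "(S (NP (DT The) (JJ black) (NN cat)) "
                     + "(VP (VBD crossed) (NP (PRP$ my) (NN path))) (. .))";
        System.out.println(phrasesWithTag(parse, "NP")); // [The black cat, my path]
    }
}
```

Matches are reported in the order their closing parentheses appear, so a nested constituent is listed before the one that contains it.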
Constructors
RiParser(pApplet);
Methods
parse()   Returns the most probable parse as a String in Penn Treebank-style formatting, with or without indents depending on the indent parameter.
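
To illustrate the difference between the flat and indented layouts, here is a minimal stand-alone sketch that re-indents a flat bracketed string, placing each nested constituent on its own line. The class PennIndentSketch and its indent method are hypothetical and not part of RiTa; the two-space indent and line-break placement are assumptions, and the indented format RiParser actually produces may differ.

```java
// Hypothetical helper, not part of RiTa: turns a flat Penn Treebank-style
// bracketing into an indented one (two spaces per nesting level).
class PennIndentSketch {

    static String indent(String flat) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        int i = 0;
        while (i < flat.length()) {
            char c = flat.charAt(i);
            if (c == '(') {
                // Every constituent except the root starts on a new, indented line.
                if (depth > 0) {
                    out.append('\n');
                    for (int d = 0; d < depth; d++) out.append("  ");
                }
                out.append('(');
                depth++;
                i++;
            } else if (c == ')') {
                out.append(')');
                depth--;
                i++;
            } else if (Character.isWhitespace(c)) {
                i++;
            } else {
                // Read one token (a tag or a word) up to the next space or paren.
                int end = i;
                while (end < flat.length() && flat.charAt(end) != '('
                        && flat.charAt(end) != ')'
                        && !Character.isWhitespace(flat.charAt(end))) {
                    end++;
                }
                // A tag follows '(' directly; a word is separated by one space.
                if (out.length() > 0 && out.charAt(out.length() - 1) != '(') {
                    out.append(' ');
                }
                out.append(flat, i, end);
                i = end;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-written sample bracketing (an assumption, not actual RiParser output).
        System.out.println(indent("(NP (DT The) (JJ black) (NN cat))"));
    }
}
```

With this layout, a tag and its word stay together on one line, while phrase-level constituents such as NP or VP open a new indentation level.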

Usage Web & Application