RiTa
Name RiParser
Description Tree-based parser that produces recursive syntactic annotations (e.g., noun phrases) using the Penn Treebank conventions.
An example:
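     // Inside a Processing sketch, 'this' is the PApplet the constructor expects.
     String s = "The black cat crossed my path.";
     RiParser parser = new RiParser(this);
     String result = parser.parse(s);   // most probable parse, Penn Treebank style
     System.out.println(result);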
Note: To use this object, first download the RiTa statistical models (rita.me.models.zip) and unpack them into the 'rita' directory inside the libraries directory of your Processing sketchbook, e.g., $SKETCH_PAD/libraries/rita/models. Alternatively, specify a different models directory (as an absolute path) via RiTa.setModelDir().
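
Because an alternative models directory has to be given as an absolute path, it can help to check a candidate path before handing it to the library. The following is a minimal sketch using only the Java standard library; the class and method names (ModelDirCheck, looksUsable) are hypothetical and not part of RiTa, and the path used in main is just a stand-in.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ModelDirCheck {

    // True if the given path is absolute and names an existing directory.
    // It does not check that the directory actually contains the model files.
    static boolean looksUsable(String dir) {
        Path p = Paths.get(dir);
        return p.isAbsolute() && Files.isDirectory(p);
    }

    public static void main(String[] args) {
        // Stand-in path for illustration only; substitute your own models directory.
        String candidate = System.getProperty("user.home");
        System.out.println(candidate + " usable: " + looksUsable(candidate));
    }
}
```

A relative path such as "models" is rejected by this check even if the directory exists, which matches the absolute-path requirement stated in the note.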

This object is most useful when used with the RiTaServer, as loading the necessary statistical models can take significant time.

Based closely on the OpenNLP maximum entropy parser.
For more info see: Berger & Della Pietra's paper 'A Maximum
Entropy Approach to Natural Language Processing', which
provides a good introduction to the maxent framework.

The full tag set follows:

  • Clause Level
  • S - Simple declarative clause, i.e. one that is not introduced by a (possibly empty) subordinating conjunction or a wh-word and that does not exhibit subject-verb inversion.
  • SBAR - Clause introduced by a (possibly empty) subordinating conjunction.
  • SBARQ - Direct question introduced by a wh-word or a wh-phrase. Indirect questions and relative clauses should be bracketed as SBAR, not SBARQ.
  • SINV - Inverted declarative sentence, i.e. one in which the subject follows the tensed verb or modal.
  • SQ - Inverted yes/no question, or main clause of a wh-question, following the wh-phrase in SBARQ.
  • Phrase Level
  • ADJP - Adjective Phrase.
  • ADVP - Adverb Phrase.
  • CONJP - Conjunction Phrase.
  • FRAG - Fragment.
  • INTJ - Interjection. Corresponds approximately to the part-of-speech tag UH.
  • LST - List marker. Includes surrounding punctuation.
  • NAC - Not a Constituent; used to show the scope of certain prenominal modifiers within an NP.
  • NP - Noun Phrase.
  • NX - Used within certain complex NPs to mark the head of the NP. Corresponds very roughly to N-bar level but used quite differently.
  • PP - Prepositional Phrase.
  • PRN - Parenthetical.
  • PRT - Particle. Category for words that should be tagged RP.
  • QP - Quantifier Phrase (i.e. complex measure/amount phrase); used within NP.
  • RRC - Reduced Relative Clause.
  • UCP - Unlike Coordinated Phrase.
  • VP - Verb Phrase.
  • WHADJP - Wh-adjective Phrase. Adjectival phrase containing a wh-adverb, as in how hot.
  • WHADVP - Wh-adverb Phrase. Introduces a clause with an ADVP gap. May be null (containing the 0 complementizer) or lexical, containing a wh-adverb such as how or why.
  • WHNP - Wh-noun Phrase. Introduces a clause with an NP gap. May be null (containing the 0 complementizer) or lexical, containing some wh-word, e.g. who, which book, whose daughter, none of which, or how many leopards.
  • WHPP - Wh-prepositional Phrase. Prepositional phrase containing a wh-noun phrase (such as of which or by whose authority) that either introduces a PP gap or is contained by a WHNP.
  • X - Unknown, uncertain, or unbracketable.
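
To make the labels above concrete, the following sketch collects the words covered by every constituent with a given label (for example NP) in a bracketed string. It does not call RiParser: the sample string is hand-written for illustration and is not actual parser output, the class and method names (PennPhrases, phrasesWithLabel) are hypothetical, and the code assumes well-formed, balanced brackets.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class PennPhrases {

    // Returns the words covered by each constituent carrying the given label,
    // in the order in which those constituents close. Assumes balanced brackets.
    static List<String> phrasesWithLabel(String tree, String label) {
        List<String> found = new ArrayList<>();
        Deque<String> labels = new ArrayDeque<>();        // labels of open constituents
        Deque<StringBuilder> words = new ArrayDeque<>();  // words seen inside each open constituent

        // Pad brackets with spaces so a whitespace split yields clean tokens.
        String[] tokens = tree.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");

        for (int i = 0; i < tokens.length; i++) {
            String t = tokens[i];
            if (t.equals("(")) {
                labels.push(tokens[++i]);                 // the token after "(" is the label
                words.push(new StringBuilder());
            } else if (t.equals(")")) {
                String closed = labels.pop();
                String text = words.pop().toString().trim();
                if (closed.equals(label)) {
                    found.add(text);
                }
                if (!words.isEmpty()) {
                    words.peek().append(text).append(' '); // pass the words up to the parent
                }
            } else {
                words.peek().append(t).append(' ');       // an ordinary word
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hand-written sample in Penn Treebank bracketing (not real RiParser output).
        String sample = "(S (NP (DT The) (JJ black) (NN cat)) "
                      + "(VP (VBD crossed) (NP (PRP$ my) (NN path))) (. .))";
        System.out.println(phrasesWithLabel(sample, "NP")); // prints [The black cat, my path]
    }
}
```

A stack is used because constituents nest: each closing bracket finishes the most recently opened label, and its words are handed up to the enclosing constituent.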
Constructors
RiParser(pApplet);
Methods
parse()   Returns the most probable parse as a String in Penn Treebank-style formatting, with or without indentation depending on the indent parameter.
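
The exact indented layout that parse() produces is not shown here. As a rough illustration of what indented Penn Treebank-style formatting can look like, the sketch below re-indents a flat bracketed string so that each constituent starts on its own line, two spaces per nesting level. The class and method names (PennIndent, indent) are hypothetical, the two-space layout is an assumption, and the sample string is hand-written rather than real parser output.

```java
class PennIndent {

    // Puts every constituent on its own line, indented two spaces per nesting level.
    // Assumes a well-formed bracketed string in which parentheses only mark structure.
    static String indent(String flat) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (char c : flat.trim().toCharArray()) {
            if (c == '(') {
                if (out.length() > 0) {
                    // Drop the space that separated this constituent from the previous token.
                    while (out.length() > 0 && out.charAt(out.length() - 1) == ' ') {
                        out.setLength(out.length() - 1);
                    }
                    out.append('\n');
                    for (int i = 0; i < depth; i++) {
                        out.append("  ");
                    }
                }
                depth++;
                out.append(c);
            } else if (c == ')') {
                depth--;
                out.append(c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-written sample (not real RiParser output).
        System.out.println(indent("(S (NP (DT The) (NN cat)) (VP (VBD sat)))"));
    }
}
```

Word-level constituents such as (DT The) stay on a single line because the word follows its label directly, without an intervening opening bracket.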

Usage Web & Application