rita
Class RiAnalyzer

java.lang.Object
  extended by rita.RiObject
      extended by rita.RiAnalyzer
All Implemented Interfaces:
processing.core.PConstants, RiConstants

public class RiAnalyzer
extends RiObject

Analyzes String phrases, annotating each phrase and each word it contains with 'feature' data. Default features include word boundaries, part-of-speech (pos) tags, phonemes, stresses, and syllables, among others.

    String text = "The boy jumped over the wild dog.";
    RiAnalyzer ra = new RiAnalyzer(this);
    ra.analyze(text);

    String phonemes = ra.getPhonemes();   
    String stresses = ra.getStresses();
    String syllables = ra.getSyllables();
    String partsOfSpeech = ra.getPos();
Note: RiString (and RiText) objects can also be analyzed:
      RiString rt = new RiString("The boy ran to the store.");
      ra.analyze(rt);
      String phonemes = rt.getPhonemes();
Additional (custom) features can be added by creating a subclass and overriding the analyze method, as follows:
      RiAnalyzer ra = new RiAnalyzer(this) {
        public void analyze(RiText rt) {
          super.analyze(rt);
          // add custom features here
          rt.addFeature("featureName", "featureValue");
        }
      };


Field Summary
 
Fields inherited from interface rita.support.RiConstants
BEHAVIOR_COMPLETED, BOUNDING_BOX_ALPHA, BRILL_POS_TAGGER, EASE_IN, EASE_IN_CUBIC, EASE_IN_EXPO, EASE_IN_OUT, EASE_IN_OUT_CUBIC, EASE_IN_OUT_EXPO, EASE_IN_OUT_QUARTIC, EASE_IN_OUT_SINE, EASE_IN_QUARTIC, EASE_IN_SINE, EASE_OUT, EASE_OUT_CUBIC, EASE_OUT_EXPO, EASE_OUT_QUARTIC, EASE_OUT_SINE, ESS, FADE_COLOR, FADE_IN, FADE_OUT, FADE_TO_TEXT, FIRST_PERSON, FUTURE_TENSE, ID, LERP, LINEAR, MAXENT_POS_TAGGER, MINIM, MOVE, MUTABLE, PAST_TENSE, PHONEME_BOUNDARY, PHONEMES, PLING_STEMMER, PLURAL, PORTER_STEMMER, POS, PRESENT_TENSE, SCALE_TO, SECOND_PERSON, SENTENCE_BOUNDARY, SINGULAR, SONIA, SPEECH_COMPLETED, STRESSES, SYLLABLE_BOUNDARY, SYLLABLES, TEXT, TEXT_ENTERED, THIRD_PERSON, TIMER, TIMER_COMPLETED, TIMER_TICK, TOKENS, UNKNOWN, WORD_BOUNDARY
 
Fields inherited from interface processing.core.PConstants
A, AB, ADD, AG, ALPHA, ALPHA_MASK, ALT, AMBIENT, AR, ARC, ARGB, ARROW, B, BACKSPACE, BASELINE, BEEN_LIT, BEVEL, BLEND, BLUE_MASK, BLUR, BOTTOM, BOX, BURN, CENTER, CENTER_DIAMETER, CENTER_RADIUS, CHATTER, CLOSE, CMYK, CODED, COMPLAINT, CONTROL, CORNER, CORNERS, CROSS, CUSTOM, DA, DARKEST, DB, DEG_TO_RAD, DELETE, DG, DIAMETER, DIFFERENCE, DILATE, DIRECTIONAL, DISABLE_ACCURATE_TEXTURES, DISABLE_DEPTH_SORT, DISABLE_DEPTH_TEST, DISABLE_OPENGL_2X_SMOOTH, DISABLE_OPENGL_ERROR_REPORT, DODGE, DOWN, DR, DXF, EB, EDGE, EG, ELLIPSE, ENABLE_ACCURATE_TEXTURES, ENABLE_DEPTH_SORT, ENABLE_DEPTH_TEST, ENABLE_NATIVE_FONTS, ENABLE_OPENGL_2X_SMOOTH, ENABLE_OPENGL_4X_SMOOTH, ENABLE_OPENGL_ERROR_REPORT, ENTER, EPSILON, ER, ERODE, ERROR_BACKGROUND_IMAGE_FORMAT, ERROR_BACKGROUND_IMAGE_SIZE, ERROR_PUSHMATRIX_OVERFLOW, ERROR_PUSHMATRIX_UNDERFLOW, ERROR_TEXTFONT_NULL_PFONT, ESC, EXCLUSION, G, GIF, GRAY, GREEN_MASK, HALF_PI, HAND, HARD_LIGHT, HINT_COUNT, HSB, IMAGE, INVERT, JAVA2D, JPEG, LEFT, LIGHTEST, LINE, LINES, LINUX, MACOSX, MAX_FLOAT, MAX_INT, MIN_FLOAT, MIN_INT, MITER, MODEL, MULTIPLY, NORMAL, NORMALIZED, NX, NY, NZ, OPAQUE, OPEN, OPENGL, ORTHOGRAPHIC, OTHER, OVERLAY, P2D, P3D, PATH, PDF, PERSPECTIVE, PI, platformNames, POINT, POINTS, POLYGON, POSTERIZE, PROBLEM, PROJECT, QUAD, QUAD_STRIP, QUADS, QUARTER_PI, R, RAD_TO_DEG, RADIUS, RECT, RED_MASK, REPLACE, RETURN, RGB, RIGHT, ROUND, SA, SB, SCREEN, SG, SHAPE, SHIFT, SHINE, SOFT_LIGHT, SPB, SPG, SPHERE, SPOT, SPR, SQUARE, SR, SUBTRACT, SW, TAB, TARGA, THIRD_PI, THRESHOLD, TIFF, TOP, TRIANGLE, TRIANGLE_FAN, TRIANGLE_STRIP, TRIANGLES, TWO_PI, TX, TY, TZ, U, UP, V, VERTEX_FIELD_COUNT, VW, VX, VY, VZ, WAIT, WHITESPACE, WINDOWS, X, Y, Z
 
Constructor Summary
RiAnalyzer()
           
RiAnalyzer(boolean enableCaching)
           
RiAnalyzer(processing.core.PApplet pApplet)
          Default constructor for RiAnalyzer
RiAnalyzer(processing.core.PApplet pApplet, boolean enableCaching)
          Constructor for RiAnalyzer with boolean specifying whether to enable the cache.
RiAnalyzer(processing.core.PApplet pApplet, int taggerType, boolean enableCaching)
          Constructor for RiAnalyzer that takes the PosTagger type to use, e.g., RiPosTagger.MAXENT_POS_TAGGER, and a flag indicating whether to enable the cache.
RiAnalyzer(processing.core.PApplet p, RiPhrase[] phrase, int taggerType, boolean enableCaching)
           
 
Method Summary
 void analyze(java.lang.CharSequence text)
          Sets text as the current phrase and analyzes it, so that subsequent calls to methods like getPos(), getPhonemes(), getSyllables(), etc. will return immediately.
 void analyze(RiCharSequence rcs)
          Sets the text contained by rcs as the current phrase and analyzes it, so that subsequent calls to methods like getPos(), getPhonemes(), getSyllables(), etc. will return immediately.
 void analyze(RiText[] rts)
          Analyzes each RiText in an array, setting the appropriate features for each element, assuming that each holds a single word.
 void analyze(java.lang.String text)
          Sets text as the current phrase and analyzes it, so that subsequent calls to methods like getPos(), getPhonemes(), getSyllables(), etc. will return immediately.
 void dumpFeatures()
          Prints the features of the last analyzed text to System.out
static void example(java.lang.String[] args)
           
 int firstIdx(java.lang.String word)
          Returns the index of the first token matching word, or -1 if not found
 java.util.Set getAvailableFeatures()
          Returns the Set of available features
 int getCallCount()
          Returns the number of non-cached lookups made by this object so far
 java.lang.String getFeature(java.lang.String featureName)
          Returns the feature specified by name.
 java.util.Map getFeatures()
          Returns a Map (of String key-value pairs) of all the features for the last analyzed phrase
 java.util.Map getFeatures(int wordIdx)
          Returns a Map (of String key-value pairs) of all the features for the word at the specified word-index.
 java.lang.String getFeatureString()
          Returns a String representation of the feature list for the last analyzed text
 java.lang.String getPhonemes()
          Returns a String containing all phonemes for the input text, delimited by colons, e.g., "dh:ax:d:ao:g:r:ae:n:f:ae:s:t", or null if no text has been input.
 java.lang.String getPhonemesAt(int wordIdx)
          Returns the phonemes for the word at wordIdx
 java.lang.String getPos()
          Returns a String containing all pos tags for the input text, delimited by colons, e.g., "dt:nn:vbd:rb", or null if no text has been input.
 java.lang.String getPosAt(int wordIdx)
          Returns the pos for the word at wordIdx
 java.lang.String getStresses()
          Returns a String containing the stresses for each syllable of the input text, delimited by colons, e.g., "0:1:0:1", with 1 meaning 'stressed' and 0 meaning 'unstressed', or null if no text has been input.
 java.lang.String getStressesAt(int wordIdx)
          Returns the stresses for the word at wordIdx
 java.lang.String getSyllables()
          Returns a String containing the phonemes for each syllable of each word of the input text, delimited by dashes (phonemes) and colons (words), e.g., "dh-ax:d-ao-g:r-ae-n:f-ae-s-t" for the 4 syllables of the phrase 'The dog ran fast', or null if no text has been input.
 java.lang.String getText()
          Returns the last analyzed text
 java.lang.String[] getTokens()
          Returns an array of the words (no punctuation) in the current text, or null if no text has been input.
 boolean isCacheEnabled()
           
 int lastIdx(java.lang.String word)
          Returns the index of the last token matching word or -1 if not found
static void main(java.lang.String[] args)
           
 java.lang.String rhymeScheme(java.lang.String[] lines)
          Returns the rhyme scheme for a given set of lines.
 void setCacheEnabled(boolean cacheEnabled)
           
static void setDefaultTagger(int taggerType)
           
 void setText(java.lang.String text)
          Deprecated. Use analyze(String) instead.
 java.lang.String tokenAt(int wordIdx)
          Returns the word at index wordIdx
 int wordCount()
          Returns the number of words in the current text
 
Methods inherited from class rita.RiObject
dispose, getId, getPApplet, nextId
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RiAnalyzer

public RiAnalyzer(processing.core.PApplet pApplet)
Default constructor for RiAnalyzer


RiAnalyzer

public RiAnalyzer(boolean enableCaching)

RiAnalyzer

public RiAnalyzer()

RiAnalyzer

public RiAnalyzer(processing.core.PApplet pApplet,
                  int taggerType,
                  boolean enableCaching)
Constructor for RiAnalyzer that takes the PosTagger type to use, e.g., RiPosTagger.MAXENT_POS_TAGGER, and a flag indicating whether to enable the cache.


RiAnalyzer

public RiAnalyzer(processing.core.PApplet p,
                  RiPhrase[] phrase,
                  int taggerType,
                  boolean enableCaching)

RiAnalyzer

public RiAnalyzer(processing.core.PApplet pApplet,
                  boolean enableCaching)
Constructor for RiAnalyzer with boolean specifying whether to enable the cache.

Method Detail

getCallCount

public int getCallCount()
Returns the number of non-cached lookups made by this object so far

See Also:
setCacheEnabled(boolean)
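
The relationship between the cache flag and the call count can be illustrated with a small stand-alone sketch. This is not RiTa's implementation: the class CountingCache, its lookup method, and the Function used as a stand-in analyzer are all hypothetical, and the sketch only assumes that a lookup served from the cache does not increase the count.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not part of the rita package.
public class CountingCache {

    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> analyzer; // stand-in for the real analysis
    private final boolean cacheEnabled;
    private int callCount = 0;

    public CountingCache(Function<String, String> analyzer, boolean cacheEnabled) {
        this.analyzer = analyzer;
        this.cacheEnabled = cacheEnabled;
    }

    // Returns the analysis result for text, reusing a cached result when allowed.
    public String lookup(String text) {
        if (cacheEnabled && cache.containsKey(text)) {
            return cache.get(text); // cache hit: the counter is left unchanged
        }
        callCount++; // counts only lookups that actually ran the analyzer
        String result = analyzer.apply(text);
        if (cacheEnabled) {
            cache.put(text, result);
        }
        return result;
    }

    // Number of non-cached lookups made so far.
    public int getCallCount() {
        return callCount;
    }

    public static void main(String[] args) {
        CountingCache c = new CountingCache(s -> s.toUpperCase(), true);
        c.lookup("dog");
        c.lookup("dog"); // served from the cache
        c.lookup("cat");
        System.out.println(c.getCallCount());
    }
}
```

With caching disabled in this sketch, every lookup runs the analyzer, so the count simply equals the number of calls.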

rhymeScheme

public java.lang.String rhymeScheme(java.lang.String[] lines)
Returns the rhyme scheme for a given set of lines.

Note: assumes all rhymes are end-rhymes, that is, rhymes that fall on the last word of each line.
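
To make the end-rhyme assumption concrete, here is a rough sketch of how lines could be labeled by their last words. Everything in it is an assumption for illustration: the class RhymeSchemeSketch is hypothetical, the letter-per-line output format ("ABAB") is not confirmed by this documentation, and the rhyme test (matching the last two letters of the final word) is a crude stand-in for a phoneme-based comparison.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not RiTa's rhymeScheme implementation.
public class RhymeSchemeSketch {

    // Returns the last word of a line, lower-cased, with non-letters removed.
    static String lastWord(String line) {
        String[] words = line.trim().split("\\s+");
        return words[words.length - 1].toLowerCase().replaceAll("[^a-z]", "");
    }

    // ASSUMPTION: two words "rhyme" if their last two letters match.
    // A real analyzer would compare phonemes instead of spelling.
    static String rhymeKey(String word) {
        return word.length() <= 2 ? word : word.substring(word.length() - 2);
    }

    // Assigns a letter to each line; lines whose end words share a key share a letter.
    // Supports up to 26 distinct rhyme keys in this simple form.
    public static String scheme(String[] lines) {
        List<String> seen = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            String key = rhymeKey(lastWord(line));
            int idx = seen.indexOf(key);
            if (idx < 0) {
                seen.add(key);
                idx = seen.size() - 1;
            }
            sb.append((char) ('A' + idx));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] lines = {
            "The cat sat on the mat",
            "The dog ran far away",
            "He wore a funny hat",
            "On a sunny day"
        };
        System.out.println(scheme(lines)); // prints ABAB
    }
}
```

Only the final word of each line is inspected, which mirrors the end-rhyme assumption in the note above; internal rhymes are ignored.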


setDefaultTagger

public static void setDefaultTagger(int taggerType)

getAvailableFeatures

public java.util.Set getAvailableFeatures()
Returns the Set of available features


getFeature

public java.lang.String getFeature(java.lang.String featureName)
Returns the feature specified by name.

Note: getFeature("pos") is equivalent to getPos().


getPhonemes

public java.lang.String getPhonemes()
Returns a String containing all phonemes for the input text, delimited by colons, e.g., "dh:ax:d:ao:g:r:ae:n:f:ae:s:t", or null if no text has been input.
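
Because the result is a single delimited String, callers will usually split it before use. A minimal sketch, assuming the ':' delimiter shown in the example above; PhonemeStrings and splitPhonemes are hypothetical helper names, not part of RiTa.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of the rita package.
public class PhonemeStrings {

    // Splits a getPhonemes()-style result into individual phonemes.
    // ASSUMPTION: ':' is the delimiter, as in the documented example.
    // Returns an empty list for null input (no text has been analyzed).
    public static List<String> splitPhonemes(String phonemes) {
        if (phonemes == null || phonemes.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(phonemes.split(":"));
    }

    public static void main(String[] args) {
        // The literal below is the example from the method description.
        List<String> result = splitPhonemes("dh:ax:d:ao:g:r:ae:n:f:ae:s:t");
        System.out.println(result.size() + " phonemes: " + result);
    }
}
```

Handling null up front matters here because the method is documented to return null when no text has been input.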


getPosAt

public java.lang.String getPosAt(int wordIdx)
Returns the pos for the word at wordIdx

Parameters:
wordIdx - the index of the word in the current text

getPhonemesAt

public java.lang.String getPhonemesAt(int wordIdx)
Returns the phonemes for the word at wordIdx

Parameters:
wordIdx - the index of the word in the current text

getStressesAt

public java.lang.String getStressesAt(int wordIdx)
Returns the stresses for the word at wordIdx

Parameters:
wordIdx - the index of the word in the current text

tokenAt

public java.lang.String tokenAt(int wordIdx)
Returns the word at index wordIdx

Parameters:
wordIdx - the index of the word in the current text

getFeatures

public java.util.Map getFeatures(int wordIdx)
Returns a Map (of String key-value pairs) of all the features for the word at the specified word-index.

Parameters:
wordIdx - the index of the word in the current text

getPos

public java.lang.String getPos()
Returns a String containing all pos tags for the input text, delimited by colons, e.g., "dt:nn:vbd:rb", or null if no text has been input.
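
A common use of this string is to line the tags up with the tokens they describe. The sketch below assumes the ':' delimiter from the example, exactly one tag per token, and that "dt:nn:vbd:rb" corresponds to the phrase 'The dog ran fast'; PosPairs and pair are hypothetical names, not RiTa API.

```java
// Hypothetical helper; not part of the rita package.
public class PosPairs {

    // Pairs each token with its tag in "word/tag" form.
    // ASSUMPTION: pos is ':'-delimited and holds one tag per token.
    public static String[] pair(String[] tokens, String pos) {
        if (tokens == null || pos == null) {
            return new String[0]; // nothing analyzed yet
        }
        String[] tags = pos.split(":");
        if (tags.length != tokens.length) {
            throw new IllegalArgumentException(
                "expected " + tokens.length + " tags but found " + tags.length);
        }
        String[] out = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = tokens[i] + "/" + tags[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-ins for the values getTokens() and getPos() might return.
        String[] tokens = {"The", "dog", "ran", "fast"};
        for (String p : pair(tokens, "dt:nn:vbd:rb")) {
            System.out.println(p);
        }
    }
}
```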


getStresses

public java.lang.String getStresses()
Returns a String containing the stresses for each syllable of the input text, delimited by colons, e.g., "0:1:0:1", with 1 meaning 'stressed' and 0 meaning 'unstressed', or null if no text has been input.
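
As an illustration of consuming this format, the sketch below counts the stressed syllables in such a string. It assumes the ':' delimiter from the example and that every field is exactly "0" or "1"; StressCounts and countStressed are hypothetical names, not RiTa API.

```java
// Hypothetical helper; not part of the rita package.
public class StressCounts {

    // Counts the fields equal to "1" (stressed) in a getStresses()-style string.
    // ASSUMPTION: fields are separated by ':' and are either "0" or "1".
    public static int countStressed(String stresses) {
        if (stresses == null || stresses.isEmpty()) {
            return 0; // no text has been analyzed
        }
        int count = 0;
        for (String s : stresses.split(":")) {
            if (s.equals("1")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The literal below is the example from the method description.
        System.out.println(countStressed("0:1:0:1")); // prints 2
    }
}
```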


getSyllables

public java.lang.String getSyllables()
Returns a String containing the phonemes for each syllable of each word of the input text, delimited by dashes (phonemes) and colons (words), e.g., "dh-ax:d-ao-g:r-ae-n:f-ae-s-t" for the 4 syllables of the phrase 'The dog ran fast', or null if no text has been input.
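
The two-level delimiting can be unpacked with two nested splits. A minimal sketch, assuming the format of the example above (':' between the outer groups, '-' between phonemes); SyllableStrings and parse are hypothetical names, not RiTa API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the rita package.
public class SyllableStrings {

    // Parses a getSyllables()-style string into a list of phoneme lists.
    // ASSUMPTION: groups are separated by ':' and phonemes within a group by '-'.
    public static List<List<String>> parse(String syllables) {
        List<List<String>> out = new ArrayList<>();
        if (syllables == null || syllables.isEmpty()) {
            return out; // no text has been analyzed
        }
        for (String group : syllables.split(":")) {
            out.add(Arrays.asList(group.split("-")));
        }
        return out;
    }

    public static void main(String[] args) {
        // The literal below is the example from the method description.
        for (List<String> group : parse("dh-ax:d-ao-g:r-ae-n:f-ae-s-t")) {
            System.out.println(group);
        }
    }
}
```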


getTokens

public java.lang.String[] getTokens()
Returns an array of the words (no punctuation) in the current text, or null if no text has been input.


wordCount

public int wordCount()
Returns the number of words in the current text


firstIdx

public int firstIdx(java.lang.String word)
Returns the index of the first token matching word, or -1 if not found


lastIdx

public int lastIdx(java.lang.String word)
Returns the index of the last token matching word or -1 if not found
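
The -1 convention shared by firstIdx and lastIdx can be sketched over a plain token array. This is an illustration rather than RiTa's code: TokenSearch is a hypothetical class, and zero-based indexing and exact, case-sensitive matching are assumptions that this documentation does not confirm.

```java
// Hypothetical illustration only; not part of the rita package.
public class TokenSearch {

    // Index of the first token equal to word, or -1 if none matches.
    // ASSUMPTION: indices are zero-based and matching is case-sensitive.
    public static int firstIdx(String[] tokens, String word) {
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(word)) {
                return i;
            }
        }
        return -1;
    }

    // Index of the last token equal to word, or -1 if none matches.
    public static int lastIdx(String[] tokens, String word) {
        for (int i = tokens.length - 1; i >= 0; i--) {
            if (tokens[i].equals(word)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] tokens = {"the", "dog", "saw", "the", "cat"};
        System.out.println(firstIdx(tokens, "the")); // prints 0
        System.out.println(lastIdx(tokens, "the"));  // prints 3
        System.out.println(firstIdx(tokens, "bird")); // prints -1
    }
}
```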


getText

public java.lang.String getText()
Returns the last analyzed text


setText

public void setText(java.lang.String text)
Deprecated. Use analyze(String) instead.

See Also:
analyze(String)

analyze

public void analyze(java.lang.String text)
Sets text as the current phrase and analyzes it, so that subsequent calls to methods like getPos(), getPhonemes(), getSyllables(), etc. will return immediately.


analyze

public void analyze(java.lang.CharSequence text)
Sets text as the current phrase and analyzes it, so that subsequent calls to methods like getPos(), getPhonemes(), getSyllables(), etc. will return immediately.


analyze

public void analyze(RiCharSequence rcs)
Sets the text contained by rcs as the current phrase and analyzes it, so that subsequent calls to methods like getPos(), getPhonemes(), getSyllables(), etc. will return immediately.


getFeatureString

public java.lang.String getFeatureString()
Returns a String representation of the feature list for the last analyzed text


dumpFeatures

public void dumpFeatures()
Prints the features of the last analyzed text to System.out


analyze

public void analyze(RiText[] rts)
Analyzes each RiText in an array, setting the appropriate features for each element, assuming that each holds a single word. Often used in conjunction with RiText.createWords().

Parameters:
rts - the RiText objects to analyze, each assumed to hold a single word
See Also:
RiText.createWords(PApplet, String, float, float)

getFeatures

public java.util.Map getFeatures()
Returns a Map (of String key-value pairs) of all the features for the last analyzed phrase


isCacheEnabled

public boolean isCacheEnabled()

setCacheEnabled

public void setCacheEnabled(boolean cacheEnabled)

example

public static void example(java.lang.String[] args)

main

public static void main(java.lang.String[] args)