rita
Class RiTravesty

java.lang.Object
  extended by rita.RiObject
      extended by rita.RiTravesty
All Implemented Interfaces:
processing.core.PConstants, RiConstants

public class RiTravesty
extends RiObject

Represents a Markov chain (or n-gram) model that treats each character as a separate token (a la the original Travesty program). Provides a range of methods (see the examples below) to query the model for probabilities and/or completions.


    RiTravesty rm = new RiTravesty(this, 5);

    rm.setEndOfSequenceChar(' ');                // treats each word as separate input

    rm.loadFile("myTestFile.txt");               // load the file (with no multiplier)

    rm.printTree();                              // prints the model as a tree

                            ---------------------------

    println("p(g)="+rm.getProbability("g"));     // the probability of 'g'

    println("p(X|g)="+rm.getProbabilities("g")); // probabilities for next letters(X) after g, p(X|g) 

    String comp = rm.getCompletion("gr");        // returns a prob. random completion 

    String[] comps = rm.getCompletions("gr");    // returns all possible completions
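
To make the idea of a character-level n-gram model concrete, here is a minimal plain-Java sketch that counts which character follows each short context. It is an illustration only and not RiTa's implementation; the class `CharNGramSketch` and its methods `load` and `probability` are hypothetical names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a character-level n-gram model; not RiTa's code.
final class CharNGramSketch {
    // Maps each context (0 to n-1 characters) to counts of the character that follows it.
    private final Map<String, Map<Character, Integer>> counts = new HashMap<>();
    private final int n;

    CharNGramSketch(int n) {
        this.n = n;
    }

    // Records every character under each context of length 0..n-1 that precedes it.
    void load(String text) {
        for (int i = 0; i < text.length(); i++) {
            char next = text.charAt(i);
            for (int len = 0; len < n && len <= i; len++) {
                String context = text.substring(i - len, i);
                counts.computeIfAbsent(context, k -> new HashMap<>())
                      .merge(next, 1, Integer::sum);
            }
        }
    }

    // Relative frequency of 'next' after 'context', or 0 if the context was never seen.
    float probability(String context, char next) {
        Map<Character, Integer> followers = counts.get(context);
        if (followers == null) {
            return 0f;
        }
        int total = 0;
        for (int c : followers.values()) {
            total += c;
        }
        return followers.getOrDefault(next, 0) / (float) total;
    }

    public static void main(String[] args) {
        CharNGramSketch model = new CharNGramSketch(3);
        model.load("green grass");
        System.out.println("p(e|gr)=" + model.probability("gr", 'e'));
    }
}
```

In this sketch the empty context plays the role of the unigram distribution, and longer contexts correspond to deeper levels of the tree.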
 


Field Summary
static char NULL_CHAR
           
 
Fields inherited from interface rita.support.RiConstants
BEHAVIOR_COMPLETED, BOUNDING_BOX_ALPHA, BRILL_POS_TAGGER, EASE_IN, EASE_IN_CUBIC, EASE_IN_EXPO, EASE_IN_OUT, EASE_IN_OUT_CUBIC, EASE_IN_OUT_EXPO, EASE_IN_OUT_QUARTIC, EASE_IN_OUT_SINE, EASE_IN_QUARTIC, EASE_IN_SINE, EASE_OUT, EASE_OUT_CUBIC, EASE_OUT_EXPO, EASE_OUT_QUARTIC, EASE_OUT_SINE, ESS, FADE_COLOR, FADE_IN, FADE_OUT, FADE_TO_TEXT, FIRST_PERSON, FUTURE_TENSE, ID, LERP, LINEAR, MAXENT_POS_TAGGER, MINIM, MOVE, MUTABLE, PAST_TENSE, PHONEME_BOUNDARY, PHONEMES, PLING_STEMMER, PLURAL, PORTER_STEMMER, POS, PRESENT_TENSE, SCALE_TO, SECOND_PERSON, SENTENCE_BOUNDARY, SINGULAR, SONIA, SPEECH_COMPLETED, STRESSES, SYLLABLE_BOUNDARY, SYLLABLES, TEXT, TEXT_ENTERED, THIRD_PERSON, TIMER, TIMER_COMPLETED, TIMER_TICK, TOKENS, UNKNOWN, WORD_BOUNDARY
 
Fields inherited from interface processing.core.PConstants
A, AB, ADD, AG, ALPHA, ALPHA_MASK, ALT, AMBIENT, AR, ARC, ARGB, ARROW, B, BACKSPACE, BASELINE, BEEN_LIT, BEVEL, BLEND, BLUE_MASK, BLUR, BOTTOM, BOX, BURN, CENTER, CENTER_DIAMETER, CENTER_RADIUS, CHATTER, CLOSE, CMYK, CODED, COMPLAINT, CONTROL, CORNER, CORNERS, CROSS, CUSTOM, DA, DARKEST, DB, DEG_TO_RAD, DELETE, DG, DIAMETER, DIFFERENCE, DILATE, DIRECTIONAL, DISABLE_ACCURATE_TEXTURES, DISABLE_DEPTH_SORT, DISABLE_DEPTH_TEST, DISABLE_OPENGL_2X_SMOOTH, DISABLE_OPENGL_ERROR_REPORT, DODGE, DOWN, DR, DXF, EB, EDGE, EG, ELLIPSE, ENABLE_ACCURATE_TEXTURES, ENABLE_DEPTH_SORT, ENABLE_DEPTH_TEST, ENABLE_NATIVE_FONTS, ENABLE_OPENGL_2X_SMOOTH, ENABLE_OPENGL_4X_SMOOTH, ENABLE_OPENGL_ERROR_REPORT, ENTER, EPSILON, ER, ERODE, ERROR_BACKGROUND_IMAGE_FORMAT, ERROR_BACKGROUND_IMAGE_SIZE, ERROR_PUSHMATRIX_OVERFLOW, ERROR_PUSHMATRIX_UNDERFLOW, ERROR_TEXTFONT_NULL_PFONT, ESC, EXCLUSION, G, GIF, GRAY, GREEN_MASK, HALF_PI, HAND, HARD_LIGHT, HINT_COUNT, HSB, IMAGE, INVERT, JAVA2D, JPEG, LEFT, LIGHTEST, LINE, LINES, LINUX, MACOSX, MAX_FLOAT, MAX_INT, MIN_FLOAT, MIN_INT, MITER, MODEL, MULTIPLY, NORMAL, NORMALIZED, NX, NY, NZ, OPAQUE, OPEN, OPENGL, ORTHOGRAPHIC, OTHER, OVERLAY, P2D, P3D, PATH, PDF, PERSPECTIVE, PI, platformNames, POINT, POINTS, POLYGON, POSTERIZE, PROBLEM, PROJECT, QUAD, QUAD_STRIP, QUADS, QUARTER_PI, R, RAD_TO_DEG, RADIUS, RECT, RED_MASK, REPLACE, RETURN, RGB, RIGHT, ROUND, SA, SB, SCREEN, SG, SHAPE, SHIFT, SHINE, SOFT_LIGHT, SPB, SPG, SPHERE, SPOT, SPR, SQUARE, SR, SUBTRACT, SW, TAB, TARGA, THIRD_PI, THRESHOLD, TIFF, TOP, TRIANGLE, TRIANGLE_FAN, TRIANGLE_STRIP, TRIANGLES, TWO_PI, TX, TY, TZ, U, UP, V, VERTEX_FIELD_COUNT, VW, VX, VY, VZ, WAIT, WHITESPACE, WINDOWS, X, Y, Z
 
Constructor Summary
RiTravesty(processing.core.PApplet parent, int nFactor)
          Construct a Markov (or n-gram) model and set its n-factor
 
Method Summary
 boolean containsChar(char token)
          Returns true if the model contains the token in any position, else false.
 java.lang.String getCompletion(java.lang.String start)
          Chooses a single completion (if there is more than one) based upon probabilistic random choices.
 char getEndOfSequenceChar()
          Returns the character marking the end of an input sequence.
 int getNFactor()
          Returns the current n-value for the model
 java.util.Map getProbabilities(java.lang.String path)
          Returns the full set of possible next tokens (as a HashMap: String -> Float (probability)) given a String of character tokens representing the path down the tree (with length less than n).
 float getProbability(char token)
          Returns the raw (unigram) probability for a token in the model, or 0 if it does not exist
 float getProbability(java.lang.String tokens)
          Returns the probability of obtaining a sequence of k character tokens where k <= nFactor, e.g., if nFactor = 3, then valid lengths for the String tokens are 1, 2 & 3.
 boolean isSmoothing()
          Returns whether (add-1) smoothing is enabled for the model
 void loadCharData(java.lang.String rawText, int multiplier)
          Load a String into the model, treating each character as a separate entity
 void loadFile(java.lang.String fileName)
           
 void loadFile(java.lang.String fileName, int multiplier)
          Load a text file into the model -- if using Processing, the file should be in the sketch's data folder.
static void main(java.lang.String[] args)
           
 void printTree()
           
 void printTree(boolean sort)
           
 void printTree(java.io.PrintStream pw)
           
 void printTree(java.io.PrintStream printStream, boolean sort)
          Outputs a String representing the model's probability tree using the supplied print stream (or System.out).
 void setEndOfSequenceChar(char endOfSequenceChar)
          Sets the character to mark the end of an input sequence.
 void setUseSmoothing(boolean useSmoothing)
          Toggles whether (add-1) smoothing is enabled for the model
 
Methods inherited from class rita.RiObject
dispose, getId, getPApplet, nextId
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

NULL_CHAR

public static final char NULL_CHAR
See Also:
Constant Field Values
Constructor Detail

RiTravesty

public RiTravesty(processing.core.PApplet parent,
                  int nFactor)
Construct a Markov (or n-gram) model and set its n-factor

Method Detail

loadFile

public void loadFile(java.lang.String fileName,
                     int multiplier)
Load a text file into the model -- if using Processing, the file should be in the sketch's data folder.

Parameters:
fileName - name of file to load
multiplier - weighting for tokens in the file;
a weight of 3 is equivalent to loading that file 3 times and gives each token 3x the probability of being chosen on a call to generate().
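
The weighting described above can be sketched in plain Java as adding `multiplier` to a count instead of 1. This is a hedged illustration under the assumption that the model is count-based; `MultiplierSketch`, `count` and `probability` are hypothetical names and not part of RiTa.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of load-time weighting; not RiTa's code.
final class MultiplierSketch {
    // Adds 'multiplier' to the count of every character in 'text'.
    static Map<Character, Integer> count(String text, int multiplier,
                                         Map<Character, Integer> into) {
        for (char c : text.toCharArray()) {
            into.merge(c, multiplier, Integer::sum);
        }
        return into;
    }

    // Relative frequency of 'c' among all counted characters.
    static float probability(Map<Character, Integer> counts, char c) {
        int total = 0;
        for (int n : counts.values()) {
            total += n;
        }
        return total == 0 ? 0f : counts.getOrDefault(c, 0) / (float) total;
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        count("aa", 3, counts); // first text, weighted 3x
        count("bb", 1, counts); // second text, unweighted
        System.out.println(probability(counts, 'a')); // 6 / (6 + 2) = 0.75
    }
}
```

In a count-based sketch like this one, a multiplier only changes probabilities relative to other loaded data; weighting a single input leaves its own distribution unchanged.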

loadFile

public void loadFile(java.lang.String fileName)

loadCharData

public void loadCharData(java.lang.String rawText,
                         int multiplier)
Load a String into the model, treating each character as a separate entity

Parameters:
multiplier - weighting for tokens in the String;
a weight of 3 is equivalent to loading that String 3 times and gives each token 3x the probability of being chosen on a call to generate().

printTree

public void printTree(java.io.PrintStream printStream,
                      boolean sort)
Outputs a String representing the model's probability tree using the supplied print stream (or System.out).

NOTE: this method will block for potentially long periods of time on large models.

Parameters:
printStream - where to send the output (default=System.out)
sort - whether the tree is first sorted (by frequency) before being output

printTree

public void printTree(boolean sort)

printTree

public void printTree(java.io.PrintStream pw)

printTree

public void printTree()

getNFactor

public int getNFactor()
Returns the current n-value for the model


isSmoothing

public boolean isSmoothing()
Returns whether (add-1) smoothing is enabled for the model


setUseSmoothing

public void setUseSmoothing(boolean useSmoothing)
Toggles whether (add-1) smoothing is enabled for the model
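
For readers unfamiliar with add-1 (Laplace) smoothing, the following plain-Java sketch shows the standard formula: every count is incremented by one and the total grows by the vocabulary size. How RiTa defines the vocabulary and where it applies smoothing are not stated here, so the class `AddOneSmoothingSketch` and its parameters are hypothetical.

```java
// Hypothetical illustration of add-1 smoothing; not RiTa's code.
final class AddOneSmoothingSketch {
    // Unsmoothed relative frequency: unseen tokens get probability 0.
    static float unsmoothed(int count, int total) {
        return total == 0 ? 0f : count / (float) total;
    }

    // Add-1 smoothing: (count + 1) / (total + vocabularySize).
    // Unseen tokens receive a small non-zero probability.
    static float smoothed(int count, int total, int vocabularySize) {
        return (count + 1) / (float) (total + vocabularySize);
    }

    public static void main(String[] args) {
        int total = 10;
        int vocabularySize = 5;
        System.out.println("unseen, unsmoothed: " + unsmoothed(0, total));
        System.out.println("unseen, smoothed:   " + smoothed(0, total, vocabularySize));
    }
}
```

The smoothed values still sum to 1 over the whole vocabulary, because the numerators gain `vocabularySize` in total, which matches the amount added to the denominator.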


containsChar

public boolean containsChar(char token)
Returns true if the model contains the token in any position, else false.


getProbability

public float getProbability(char token)
Returns the raw (unigram) probability for a token in the model, or 0 if it does not exist


getProbability

public float getProbability(java.lang.String tokens)
Returns the probability of obtaining a sequence of k character tokens where k <= nFactor, e.g., if nFactor = 3, then valid lengths for the String tokens are 1, 2 & 3.


getProbabilities

public java.util.Map getProbabilities(java.lang.String path)
Returns the full set of possible next tokens (as a HashMap: String -> Float (probability)) given a String of character tokens representing the path down the tree (with length less than n). If the path length is not less than n, or the path cannot be found, or the end node has no children, null is returned.

As the returned Map represents the full set of possible next tokens, the sum of its probabilities will always equal 1.

See Also:
getProbability(String)
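
The sum-to-1 property follows from normalizing the counts of a node's children by their total. Here is a minimal plain-Java sketch of that step, assuming the children are available as a map of counts; `NextTokenDistributionSketch` and `normalize` are hypothetical names, not RiTa API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of turning child counts into probabilities; not RiTa's code.
final class NextTokenDistributionSketch {
    // Returns token -> probability, or null when there are no children
    // (mirroring the null result documented above).
    static Map<String, Float> normalize(Map<String, Integer> childCounts) {
        if (childCounts == null || childCounts.isEmpty()) {
            return null;
        }
        int total = 0;
        for (int c : childCounts.values()) {
            total += c;
        }
        Map<String, Float> probabilities = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : childCounts.entrySet()) {
            probabilities.put(e.getKey(), e.getValue() / (float) total);
        }
        return probabilities;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("e", 2);
        counts.put("a", 1);
        counts.put("o", 1);
        System.out.println(normalize(counts));
    }
}
```

Because the method can return null, callers of a real model would want a null check before iterating over the result.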

getCompletion

public java.lang.String getCompletion(java.lang.String start)
Chooses a single completion (if there is more than one) based upon probabilistic random choices.
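
A probabilistic random choice of this kind is commonly made by walking the cumulative distribution until it exceeds a uniform random number. The sketch below shows that technique in plain Java; it is an assumption about the general approach and not RiTa's actual selection code, and `WeightedChoiceSketch` and `pick` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration of weighted random selection; not RiTa's code.
final class WeightedChoiceSketch {
    // Picks a key with probability proportional to its value.
    // Assumes the values sum to (approximately) 1; returns null for an empty map.
    static String pick(Map<String, Float> probabilities, Random rng) {
        float r = rng.nextFloat(); // uniform in [0, 1)
        float cumulative = 0f;
        String last = null;
        for (Map.Entry<String, Float> e : probabilities.entrySet()) {
            cumulative += e.getValue();
            last = e.getKey();
            if (r < cumulative) {
                return last;
            }
        }
        return last; // fallback if rounding leaves the total slightly below 1
    }

    public static void main(String[] args) {
        Map<String, Float> options = new LinkedHashMap<>();
        options.put("green", 0.5f);
        options.put("grass", 0.25f);
        options.put("grow", 0.25f);
        System.out.println(pick(options, new Random()));
    }
}
```

Passing in the `Random` instance makes the choice reproducible when a fixed seed is used, which helps when testing.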


getEndOfSequenceChar

public char getEndOfSequenceChar()
Returns the character marking the end of an input sequence.


setEndOfSequenceChar

public void setEndOfSequenceChar(char endOfSequenceChar)
Sets the character that marks the end of an input sequence. For example, to parse each word separately, set this to a space character (' '). The default is Character.EOF.

Parameters:
endOfSequenceChar - the character that marks the end of an input sequence
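
One way to picture the effect of an end-of-sequence character is as a splitter: the input is cut into independent sequences, so no n-gram context spans a boundary. The plain-Java sketch below illustrates that reading; how RiTa uses the character internally is an assumption, and `EndOfSequenceSketch` and `sequences` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of splitting input into sequences; not RiTa's code.
final class EndOfSequenceSketch {
    // Splits 'text' at every occurrence of 'endOfSequence', dropping empty pieces.
    static List<String> sequences(String text, char endOfSequence) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == endOfSequence) {
                if (current.length() > 0) {
                    result.add(current.toString());
                }
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // With a space as the marker, each word becomes its own sequence.
        System.out.println(sequences("green grass grows", ' '));
    }
}
```

Under this reading, loading "green grass" with a space marker would teach the model "gre" and "gra" but never the cross-word context "n g".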

main

public static void main(java.lang.String[] args)