public interface NGramIF
| Method Summary | |
|---|---|
java.lang.String |
generateTokens(int targetNumber)
Generates a string of targetNumber tokens from the model. |
java.lang.String |
generateUntil(java.lang.String regex,
int minLength,
int maxLength)
Continues generating tokens until a token matches 'regex', provided the length of the output is between minLength and maxLength (inclusive). |
java.lang.String[] |
getCompletions(java.lang.String[] seed)
Returns all possible next words (or tokens), ordered by probability, for the given seed array, or null if none are found. |
java.lang.String[] |
getCompletions(java.lang.String[] pre,
java.lang.String[] post)
Returns an unordered list of possible words w that complete an n-gram consisting of: pre[0]...pre[k], w, post[k+1]...post[n]. |
int |
getNFactor()
Returns the current n-value for the model |
java.util.Map |
getProbabilities(java.lang.String[] path)
Returns the full set of possible next tokens (as a HashMap: String -> Float (probability)) given an array of tokens representing the path down the tree (with length less than n). |
float |
getProbability(java.lang.String singleToken)
Returns the raw (unigram) probability for a token in the model, or 0 if it does not exist |
float |
getProbability(java.lang.String[] tokens)
Returns the probability of obtaining a sequence of k character tokens, where k <= nFactor; e.g., if nFactor = 3, then valid lengths for the String[] tokens are 1, 2, and 3. |
| Method Detail |
|---|
java.lang.String generateUntil(java.lang.String regex,
int minLength,
int maxLength)
java.lang.String generateTokens(int targetNumber)
Generates a string of targetNumber tokens from the model.
int getNFactor()
java.lang.String[] getCompletions(java.lang.String[] seed)
Note: seed arrays of any size (>0) may be input, but only the last n-1 elements will be considered.
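The note above can be illustrated with a minimal sketch of how a seed might be trimmed to its last n-1 elements before lookup. The class `SeedWindowSketch` and the helper `effectiveSeed` are hypothetical names introduced for illustration only; they are not part of NGramIF, and the actual implementation may differ.

```java
import java.util.Arrays;

public class SeedWindowSketch {

    // Hypothetical helper (not part of NGramIF): keeps only the last
    // (nFactor - 1) tokens of the seed, since earlier tokens are ignored.
    static String[] effectiveSeed(String[] seed, int nFactor) {
        if (seed == null || seed.length == 0) {
            throw new IllegalArgumentException("seed must contain at least one token");
        }
        int keep = Math.min(seed.length, nFactor - 1);
        return Arrays.copyOfRange(seed, seed.length - keep, seed.length);
    }

    public static void main(String[] args) {
        String[] seed = { "I", "saw", "the", "red" };
        // With nFactor = 3, only the last two tokens are considered.
        System.out.println(Arrays.toString(effectiveSeed(seed, 3)));
    }
}
```

Under this assumption, calling getCompletions with the four-token seed above on a model with n = 3 would be equivalent to calling it with just { "the", "red" }.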
float getProbability(java.lang.String singleToken)
float getProbability(java.lang.String[] tokens)
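Returns the probability of obtaining a sequence of k character tokens, where k <= nFactor; e.g., if nFactor = 3, then valid lengths for the String[] tokens are 1, 2, and 3.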
java.lang.String[] getCompletions(java.lang.String[] pre,
java.lang.String[] post)
getCompletions(new String[]{ "the" }, new String[]{ "ball" })
will return all the single words that occur between 'the' and 'ball'
in the current model (assuming n > 2), e.g., ['red', 'big', 'bouncy'].
Note: For this operation to be valid, (pre.length + post.length) must be strictly less than the model's nFactor; otherwise an exception will be thrown.
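As a rough illustration of the behavior described above, the following sketch scans a flat token array for words that occur between a given prefix and suffix. It is not the library's implementation: the class `BetweenCompletionsSketch`, the method `completionsBetween`, the choice of IllegalArgumentException, and the empty-array result when nothing matches are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class BetweenCompletionsSketch {

    // Hypothetical stand-in for getCompletions(pre, post): collects every
    // token w such that pre, w, post appear consecutively in 'tokens'.
    static String[] completionsBetween(String[] tokens, int nFactor,
                                       String[] pre, String[] post) {
        // Mirrors the documented constraint; the exception type is assumed.
        if (pre.length + post.length >= nFactor) {
            throw new IllegalArgumentException(
                "pre.length + post.length must be less than nFactor");
        }
        Set<String> found = new LinkedHashSet<>();
        int span = pre.length + 1 + post.length;
        for (int i = 0; i + span <= tokens.length; i++) {
            int mid = i + pre.length;
            boolean preOk = Arrays.equals(
                Arrays.copyOfRange(tokens, i, mid), pre);
            boolean postOk = Arrays.equals(
                Arrays.copyOfRange(tokens, mid + 1, i + span), post);
            if (preOk && postOk) {
                found.add(tokens[mid]);
            }
        }
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] tokens =
            "the red ball and the big ball and the bouncy ball".split(" ");
        String[] words = completionsBetween(
            tokens, 3, new String[]{ "the" }, new String[]{ "ball" });
        System.out.println(Arrays.toString(words));
    }
}
```

The sketch returns matches in order of first occurrence only because it uses a LinkedHashSet; the documented method makes no ordering guarantee.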
java.util.Map getProbabilities(java.lang.String[] path)
Note: As the returned Map represents the full set of possible next tokens, the sum of its probabilities will always equal 1.
See Also: getProbability(String)
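The sum-to-one property noted for getProbabilities can be seen in a small sketch that turns next-token counts into a String -> Float map. The class `NextTokenProbabilitiesSketch`, the method `normalize`, and the sample counts are hypothetical and only illustrate the idea; how the model stores its counts is not specified here.

```java
import java.util.HashMap;
import java.util.Map;

public class NextTokenProbabilitiesSketch {

    // Hypothetical helper: converts raw next-token counts into
    // probabilities by dividing each count by the total.
    static Map<String, Float> normalize(Map<String, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        Map<String, Float> probs = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            probs.put(e.getKey(), e.getValue() / (float) total);
        }
        return probs;
    }

    public static void main(String[] args) {
        // Illustrative counts for tokens following some path, e.g. "the".
        Map<String, Integer> counts = new HashMap<>();
        counts.put("red", 2);
        counts.put("big", 1);
        counts.put("bouncy", 1);

        Map<String, Float> probs = normalize(counts);
        float sum = 0f;
        for (float p : probs.values()) {
            sum += p;
        }
        System.out.println(probs + " sum=" + sum);
    }
}
```

Because every possible next token is included in the map, the individual probabilities necessarily add up to 1 (up to floating-point rounding).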