| containingStringsByLetter() |
|
Returns valid words (in lexicon) using both substring and superstring matching.
This method, CONTAINS(K), is equivalent to UNION( SUB(K), SUPER(K) ).
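The union above can be pictured with a minimal sketch over a plain set of words. The class and helper names (ContainingSketch, sub, sup, containing) are hypothetical and are not part of RiLexicon; excluding the input word itself from the result is also an assumption.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class ContainingSketch {

    // SUB(K): lexicon words that occur inside k (hypothetical helper).
    static Set<String> sub(Set<String> lexicon, String k) {
        Set<String> out = new TreeSet<>();
        for (String w : lexicon) {
            if (!w.equals(k) && k.contains(w)) out.add(w);
        }
        return out;
    }

    // SUPER(K): lexicon words that k occurs inside (hypothetical helper).
    static Set<String> sup(Set<String> lexicon, String k) {
        Set<String> out = new TreeSet<>();
        for (String w : lexicon) {
            if (!w.equals(k) && w.contains(k)) out.add(w);
        }
        return out;
    }

    // CONTAINS(K) = UNION( SUB(K), SUPER(K) )
    static Set<String> containing(Set<String> lexicon, String k) {
        Set<String> out = new TreeSet<>(sub(lexicon, k));
        out.addAll(sup(lexicon, k));
        return out;
    }

    public static void main(String[] args) {
        Set<String> lexicon = new TreeSet<>(
            Arrays.asList("art", "cart", "carton", "cartons", "on", "dog"));
        // Prints [art, cart, cartons, on]
        System.out.println(containing(lexicon, "carton"));
    }
}
```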
|
| contains() |
|
Returns true if the word exists in the lexicon
|
| getAlliterations() |
|
Finds alliterations by comparing the phonemes of the input string
to those of each word in the lexicon
|
| getFeatures() |
|
|
| getLexicalData() |
|
Returns the raw data (as a Map) used in the lexicon, allowing for
deletion or modification of existing lexical entries. Modifications
to this Map will be immediately reflected in all operations on the lexicon.
|
| getPosEntries() |
|
Returns the list of possible parts-of-speech
for the word, or null if
not found.
|
| getPosStr() |
|
Returns
|
| getRandomWord() |
|
Returns a random word from the lexicon with the specified part-of-speech and target-length,
or null if no such word exists.
|
| getRandomWordWithSyllableCount() |
|
Returns a random word from the lexicon with the specified part-of-speech and syllable-count, or null if no such word exists.
|
| getRhymes() |
|
Returns the rhymes for a given word, or null if none are found.
Two words rhyme if their final stressed vowel and all
following phonemes are identical.
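The rhyme rule can be sketched on phoneme strings alone. This sketch assumes phonemes are written in the rita_addenda.txt style ('-' between phonemes, a space between syllables, a trailing '1' on the stressed vowel); the class and method names are hypothetical, the sample transcriptions are illustrative only, and the sketch does not reproduce any same-word check the library may apply.

```java
import java.util.Arrays;
import java.util.List;

class RhymeSketch {

    // Flattens "k-eh1-p s-t-r-ax-l" into [k, eh1, p, s, t, r, ax, l].
    static List<String> phonemes(String phones) {
        return Arrays.asList(phones.trim().split("[-\\s]+"));
    }

    // Returns the final stressed vowel plus all following phonemes,
    // or null if no phoneme carries the assumed '1' stress mark.
    static String rhymePart(String phones) {
        List<String> p = phonemes(phones);
        for (int i = p.size() - 1; i >= 0; i--) {
            if (p.get(i).endsWith("1")) {
                return String.join("-", p.subList(i, p.size()));
            }
        }
        return null;
    }

    // Two phoneme strings rhyme if their rhyme parts are identical.
    static boolean isRhyme(String phonesA, String phonesB) {
        if (phonesA == null || phonesB == null) return false;
        String a = rhymePart(phonesA);
        String b = rhymePart(phonesB);
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(isRhyme("k-ae1-t", "h-ae1-t")); // true
        System.out.println(isRhyme("k-ae1-t", "d-ao1-g")); // false
    }
}
```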
|
| getWords() |
|
Returns the set of words in the lexicon (including those from user-addenda)
that match the supplied regular expression. For example, getWords("ee");
returns 661 words with 2 or more consecutive e's, while getWords("ee.*ee");
returns exactly 2: 'freewheeling' and 'squeegee'.
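The two patterns in this example suggest that the expression may match anywhere inside a word rather than the whole word. A minimal sketch under that assumption, using java.util.regex over a small hand-made word list (the class name and the list are hypothetical, not the library's lexicon):

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

class RegexWordsSketch {

    // Keeps every word in which the pattern is found at any position
    // (Matcher.find), which is an assumption about the matching mode.
    static SortedSet<String> getWords(Collection<String> lexicon, String regex) {
        Pattern p = Pattern.compile(regex);
        SortedSet<String> out = new TreeSet<>();
        for (String w : lexicon) {
            if (p.matcher(w).find()) out.add(w);
        }
        return out;
    }

    public static void main(String[] args) {
        Collection<String> lexicon =
            Arrays.asList("freewheeling", "squeegee", "tree", "cat");
        System.out.println(getWords(lexicon, "ee"));     // [freewheeling, squeegee, tree]
        System.out.println(getWords(lexicon, "ee.*ee")); // [freewheeling, squeegee]
    }
}
```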
|
| isAlliteration() |
|
Returns true if the first stressed consonants of the two words match, else false.
Note: returns true if wordA.equals(wordB) and false if either (or both) are null.
|
| isContaining() |
|
Returns true if orig is a substring or superstring of
toCheck.
|
| isRhyme() |
|
Returns true if the two words rhyme (that is, if their final stressed phoneme
and all following phonemes are identical), else false. Note: returns false
if wordA.equals(wordB) or if either (or both) are null.
Note: at present, if either word is not found in the lexicon, the method does not
fall back to the letter-to-sound engine and simply returns false (TODO).
|
| isStopWord() |
|
Returns true if the word is a 'stop' (or 'closed-class') word
else false. See http://en.wikipedia.org/wiki/Stop_words
|
| isSubstring() |
|
Returns true if orig is a substring of
toCheck.
|
| isSuperstring() |
|
Returns true if orig is a superstring of
toCheck.
|
| iterator() |
|
Returns an iterator over the words in lexicon
matching the supplied regular expression.
|
| posIterator() |
|
Returns an iterator over the words in lexicon, for the supplied part-of-speech
|
| preloadFeatures() |
|
Use this method to preload the Lexicon with feature data (stress, syllables, pos, phones, etc).
Increases the initialization time but speeds up all subsequent lookups by an order of magnitude.
Useful when doing many lookups over the course of a program, especially with the RiTaServer. Example:
RiLexicon lex = new RiLexicon();
lex.preloadFeatures();
// use the lexicon
|
| RiLexicon.randomIterator() |
|
Utility method that returns a random-iterator over the specified set.
|
| randomPosIterator() |
|
Returns an iterator over the words in the lexicon for the supplied part-of-speech,
beginning at a random offset.
|
| setLexicalData() |
|
Sets the raw data to be used in the lexicon, replacing
all default words and features with those specified in the map.
When using this method, be sure to exactly match the format
as specified in rita_addenda.txt, e.g.,
##############################################################################
#### FORMAT##: ... | ...
##############################################################################
blog: b-l-ao-g | nn vbg
cepstral: k-eh1-p s-t-r-ax-l | nnp
freetts: f-r-iy1 t-iy t-iy eh-s | nnp
jsapi: jh-ey s-ae1-p iy | nnp
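To make the entry layout concrete, here is a minimal sketch that splits one such line into its parts. It reads the samples as "word: phonemes | pos-tags", with spaces separating syllables and '-' separating phonemes; that reading is inferred from the sample lines and is an assumption, as are the class and field names. It does not show the Map structure that setLexicalData() expects.

```java
import java.util.Arrays;

class AddendaEntry {
    final String word;
    final String[] syllables; // e.g. ["k-eh1-p", "s-t-r-ax-l"]
    final String[] posTags;   // e.g. ["nn", "vbg"]

    AddendaEntry(String word, String[] syllables, String[] posTags) {
        this.word = word;
        this.syllables = syllables;
        this.posTags = posTags;
    }

    // Parses one line; returns null for blank lines and '#' comment lines.
    static AddendaEntry parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) return null;
        int colon = trimmed.indexOf(':');
        int bar = trimmed.indexOf('|');
        if (colon < 1 || bar < colon) {
            throw new IllegalArgumentException("Malformed entry: " + line);
        }
        String word = trimmed.substring(0, colon).trim();
        String[] syllables = trimmed.substring(colon + 1, bar).trim().split("\\s+");
        String[] posTags = trimmed.substring(bar + 1).trim().split("\\s+");
        return new AddendaEntry(word, syllables, posTags);
    }

    public static void main(String[] args) {
        AddendaEntry e = parse("cepstral: k-eh1-p s-t-r-ax-l | nnp");
        System.out.println(e.word + " " + Arrays.toString(e.syllables)
            + " " + Arrays.toString(e.posTags));
    }
}
```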
|
| similarByLetter() |
|
Compares the characters of the input string (using a version of the min-edit distance algorithm)
to each word in the lexicon, adding the set of closest matches to result,
considering all matches where the edit distance >= 'minMed'.
If 'preserveLength' is true, the method will favor words of the same length as the input.
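For orientation, the standard minimum edit distance (Levenshtein) between two strings can be computed as below. This is a sketch of the textbook algorithm only; the variant used by the library, and how 'minMed' and 'preserveLength' are applied to the results, may differ. The class name is hypothetical.

```java
class EditDistanceSketch {

    // Classic dynamic-programming Levenshtein distance with unit costs
    // for insertion, deletion and substitution, using two rolling rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1, prev[j] + 1),
                    prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // 3
    }
}
```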
|
| similarBySound() |
|
Compares the phonemes of the input String to those of each word in the
lexicon, returning the set of closest matches as a String[].
|
| similarBySoundAndLetter() |
|
First calls similarBySound(), then filters
the result set using the algorithm from similarByLetter().
This is useful when similarBySound() returns too large a result set.
|
| singleLetterDeletes() |
|
|
| singleLetterInsertions() |
|
|
| singleLetterSubtitutions() |
|
|
| substringsByLetter() |
|
Returns all valid substrings of the input word in the lexicon
of length at least minLength
|
| superstringsByLetter() |
|
Returns all valid superstrings of the input word in the lexicon
|
| RiLexicon.mainX() |
|
|
| RiLexicon.testRhymes() |
|
|