java.lang.Object
rita.RiObject
rita.support.remote.RiRemotable
rita.support.me.RiObjectME
rita.support.me.MaxEntTagger
public class MaxEntTagger
Simple pos-tagger for the RiTa library, using the Penn tagset.
Based closely on the OpenNLP maximum entropy tagger.
For more info see: Berger & Della Pietra's paper 'A Maximum Entropy Approach to Natural Language Processing', which provides a good introduction to the maxent framework.
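For orientation, a conditional maximum entropy model of the kind that paper describes has the general form below. The notation is chosen for this sketch and is not taken from the RiTa source: x is the context of a word, y a candidate tag, f_i a feature function, and λ_i its learned weight.

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_i \lambda_i f_i(x, y) \Big),
\qquad
Z(x) = \sum_{y'} \exp\Big( \sum_i \lambda_i f_i(x, y') \Big)
```

A tagger built on such a model scores each candidate tag for a token and selects among the most probable ones.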
The full Penn tag set follows:
CC Coordinating conjunction
CD Cardinal number
DT Determiner
EX Existential there
FW Foreign word
IN Preposition/subord. conjunction
JJ Adjective
JJR Adjective, comparative
JJS Adjective, superlative
LS List item marker
MD Modal
NN Noun, singular or mass
NNS Noun, plural
NNP Proper noun, singular
NNPS Proper noun, plural
PDT Predeterminer
POS Possessive ending
PRP Personal pronoun
PRP$ Possessive pronoun
RB Adverb
RBR Adverb, comparative
RBS Adverb, superlative
RP Particle
SYM Symbol (mathematical or scientific)
TO to
UH Interjection
VB Verb, base form
VBD Verb, past tense
VBG Verb, gerund/present participle
VBN Verb, past participle
VBP Verb, non-3rd ps. sing. present
VBZ Verb, 3rd ps. sing. present
WDT wh-determiner
WP wh-pronoun
WP$ Possessive wh-pronoun
WRB wh-adverb
# Pound sign
$ Dollar sign
. Sentence-final punctuation
, Comma
: Colon, semi-colon
( Left bracket character
) Right bracket character
" Straight double quote
` Left open single quote
`` Left open double quote
' Right close single quote
'' Right close double quote
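The tag list above groups naturally by prefix: every noun tag starts with NN, every verb tag with VB, every adjective tag with JJ, and RB, RBR and RBS share the prefix RB. The following is a minimal sketch of checks built on that observation. `PennTagCategories` is a hypothetical helper written for illustration; it is not part of RiTa and may not match how `isNoun`, `isVerb`, `isAdjective` and `isAdverb` are implemented in `MaxEntTagger`. Leaving MD out of the verb group and WRB out of the adverb group is an assumption of this sketch.

```java
/**
 * Hypothetical helper (not part of RiTa): maps a Penn tag to a coarse
 * word class by prefix. MD and WRB are deliberately excluded from the
 * verb and adverb groups; that choice is an assumption of this sketch.
 */
final class PennTagCategories {

    private PennTagCategories() {}

    static boolean isNoun(String pos)      { return hasPrefix(pos, "NN"); } // NN, NNS, NNP, NNPS
    static boolean isVerb(String pos)      { return hasPrefix(pos, "VB"); } // VB, VBD, VBG, VBN, VBP, VBZ
    static boolean isAdjective(String pos) { return hasPrefix(pos, "JJ"); } // JJ, JJR, JJS
    static boolean isAdverb(String pos)    { return hasPrefix(pos, "RB"); } // RB, RBR, RBS

    // Null-safe prefix test shared by all four checks.
    private static boolean hasPrefix(String pos, String prefix) {
        return pos != null && pos.startsWith(prefix);
    }

    public static void main(String[] args) {
        for (String pos : new String[] {"NNPS", "VBG", "JJR", "RBS", "DT"}) {
            System.out.println(pos + " noun=" + isNoun(pos) + " verb=" + isVerb(pos)
                + " adjective=" + isAdjective(pos) + " adverb=" + isAdverb(pos));
        }
    }
}
```

Note that the checks take a tag such as "NNS", not a word, which matches the `pos` parameter name in the method signatures on this page.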
| Field Summary |
|---|
| Fields inherited from class rita.support.me.RiObjectME |
|---|
| ERROR_MSG, LOAD_FROM_MODEL_DIR |
| Fields inherited from interface rita.support.remote.RemoteConstants |
|---|
| ARG_DELIM, ARR_DELIM, CHUNKER, DELIM, FS, LB, LP, MARKOV, PARSER, QQ, RB, RP, SPC, TAGGER, TYPE_DELIM |
| Fields inherited from interface processing.core.PConstants |
|---|
A, AB, ADD, AG, ALPHA, ALPHA_MASK, ALT, AMBIENT, AR, ARC, ARGB, ARROW, B, BACKSPACE, BASELINE, BEEN_LIT, BEVEL, BLEND, BLUE_MASK, BLUR, BOTTOM, BOX, BURN, CENTER, CENTER_DIAMETER, CENTER_RADIUS, CHATTER, CLOSE, CMYK, CODED, COMPLAINT, CONTROL, CORNER, CORNERS, CROSS, CUSTOM, DA, DARKEST, DB, DEG_TO_RAD, DELETE, DG, DIAMETER, DIFFERENCE, DILATE, DIRECTIONAL, DISABLE_ACCURATE_TEXTURES, DISABLE_DEPTH_SORT, DISABLE_DEPTH_TEST, DISABLE_OPENGL_2X_SMOOTH, DISABLE_OPENGL_ERROR_REPORT, DODGE, DOWN, DR, DXF, EB, EDGE, EG, ELLIPSE, ENABLE_ACCURATE_TEXTURES, ENABLE_DEPTH_SORT, ENABLE_DEPTH_TEST, ENABLE_NATIVE_FONTS, ENABLE_OPENGL_2X_SMOOTH, ENABLE_OPENGL_4X_SMOOTH, ENABLE_OPENGL_ERROR_REPORT, ENTER, EPSILON, ER, ERODE, ERROR_BACKGROUND_IMAGE_FORMAT, ERROR_BACKGROUND_IMAGE_SIZE, ERROR_PUSHMATRIX_OVERFLOW, ERROR_PUSHMATRIX_UNDERFLOW, ERROR_TEXTFONT_NULL_PFONT, ESC, EXCLUSION, G, GIF, GRAY, GREEN_MASK, HALF_PI, HAND, HARD_LIGHT, HINT_COUNT, HSB, IMAGE, INVERT, JAVA2D, JPEG, LEFT, LIGHTEST, LINE, LINES, LINUX, MACOSX, MAX_FLOAT, MAX_INT, MIN_FLOAT, MIN_INT, MITER, MODEL, MULTIPLY, NORMAL, NORMALIZED, NX, NY, NZ, OPAQUE, OPEN, OPENGL, ORTHOGRAPHIC, OTHER, OVERLAY, P2D, P3D, PATH, PDF, PERSPECTIVE, PI, platformNames, POINT, POINTS, POLYGON, POSTERIZE, PROBLEM, PROJECT, QUAD, QUAD_STRIP, QUADS, QUARTER_PI, R, RAD_TO_DEG, RADIUS, RECT, RED_MASK, REPLACE, RETURN, RGB, RIGHT, ROUND, SA, SB, SCREEN, SG, SHAPE, SHIFT, SHINE, SOFT_LIGHT, SPB, SPG, SPHERE, SPOT, SPR, SQUARE, SR, SUBTRACT, SW, TAB, TARGA, THIRD_PI, THRESHOLD, TIFF, TOP, TRIANGLE, TRIANGLE_FAN, TRIANGLE_STRIP, TRIANGLES, TWO_PI, TX, TY, TZ, U, UP, V, VERTEX_FIELD_COUNT, VW, VX, VY, VZ, WAIT, WHITESPACE, WINDOWS, X, Y, Z |
| Constructor Summary | |
|---|---|
| MaxEntTagger(processing.core.PApplet p) | |
| Method Summary | |
|---|---|
| static RiRemotable | createRemote(java.util.Map params) |
| void | destroy() |
| static MaxEntTagger | getInstance() |
| static MaxEntTagger | getInstance(processing.core.PApplet p) |
| boolean | isAdjective(java.lang.String pos): Returns true if word is an adjective. |
| boolean | isAdverb(java.lang.String pos): Returns true if word is an adverb. |
| boolean | isNoun(java.lang.String pos): Returns true if word is a noun. |
| boolean | isVerb(java.lang.String pos): Returns true if word is a verb. |
| static void | main(java.lang.String[] args) |
| java.util.List | tag(java.util.List tokens) |
| java.lang.String | tag(java.lang.String sentence) |
| java.lang.String[] | tag(java.lang.String[] tokens): Returns a String array of the most probable tags. |
| java.lang.String[] | tagFile(java.lang.String fileName): Loads a file, splits the input into sentences, and returns a String[] of the most probable tags. |
| java.lang.String | tagInline(java.lang.String toTag): Returns a String with pos-tags notated inline. |
| java.lang.String | tagInline(java.lang.String[] tokens): Returns a String with pos-tags notated inline. |
| Methods inherited from class rita.support.me.RiObjectME |
|---|
| getModelDir, setModelDir |
| Methods inherited from class rita.RiObject |
|---|
| dispose, getId, getPApplet, nextId |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public MaxEntTagger(processing.core.PApplet p)
| Method Detail |
|---|
public static MaxEntTagger getInstance()
public static MaxEntTagger getInstance(processing.core.PApplet p)
public static RiRemotable createRemote(java.util.Map params)
public java.util.List tag(java.util.List tokens)
public java.lang.String tag(java.lang.String sentence)
public java.lang.String[] tag(java.lang.String[] tokens)
Returns a String array of the most probable tags.
tag in interface RiTaggerIF
public java.lang.String tagInline(java.lang.String[] tokens)
Returns a String with pos-tags notated inline.
tagInline in interface RiTaggerIF
public java.lang.String tagInline(java.lang.String toTag)
Returns a String with pos-tags notated inline.
tagInline in interface RiTaggerIF
public void destroy()
destroy in class RiRemotable
public boolean isVerb(java.lang.String pos)
Returns true if word is a verb.
isVerb in interface RiTaggerIF
public boolean isNoun(java.lang.String pos)
Returns true if word is a noun.
isNoun in interface RiTaggerIF
public boolean isAdverb(java.lang.String pos)
Returns true if word is an adverb.
isAdverb in interface RiTaggerIF
public boolean isAdjective(java.lang.String pos)
Returns true if word is an adjective.
isAdjective in interface RiTaggerIF
public java.lang.String[] tagFile(java.lang.String fileName)
Loads a file, splits the input into sentences, and returns a String[] of the most probable tags.
tagFile in interface RiTaggerIF
public static void main(java.lang.String[] args)
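This page does not specify the exact format of the inline notation that `tagInline` returns. As an illustration only, the sketch below joins a token array and a parallel tag array (the shape that `tag(String[])` returns) using a `token/TAG` convention. Both the `/` separator and the `InlineTagFormat` class are assumptions made for this example, not RiTa code.

```java
/**
 * Hypothetical formatter (not RiTa code): joins tokens and their tags
 * as space-separated token/TAG pairs. The '/' separator is an assumption.
 */
final class InlineTagFormat {

    private InlineTagFormat() {}

    static String inline(String[] tokens, String[] tags) {
        // The two arrays are parallel: tags[i] is the tag for tokens[i].
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(tokens[i]).append('/').append(tags[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] tokens = {"The", "dog", "ran", "."};
        String[] tags   = {"DT", "NN", "VBD", "."};
        System.out.println(inline(tokens, tags)); // The/DT dog/NN ran/VBD ./.
    }
}
```

Keeping tokens and tags in parallel arrays makes the length check a cheap guard against a tokenizer and tagger that have drifted out of step.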