rita
Class RiGoogleSearch

java.lang.Object
  extended by rita.RiObject
      extended by rita.RiGoogleSearch
All Implemented Interfaces:
processing.core.PConstants, RiSearcherIF, RiConstants

public class RiGoogleSearch
extends RiObject
implements RiSearcherIF

A utility object for obtaining unigram, bigram, and weighted-bigram counts for words and phrases via the Google search engine.

      RiGoogleSearch gp = new RiGoogleSearch(this);
      float f = gp.getBigram("canid", "ski'd");


Field Summary
static boolean DBUG
           
static boolean DBUG_FETCH
           
 
Fields inherited from interface rita.support.RiConstants
BEHAVIOR_COMPLETED, BOUNDING_BOX_ALPHA, BRILL_POS_TAGGER, EASE_IN, EASE_IN_CUBIC, EASE_IN_EXPO, EASE_IN_OUT, EASE_IN_OUT_CUBIC, EASE_IN_OUT_EXPO, EASE_IN_OUT_QUARTIC, EASE_IN_OUT_SINE, EASE_IN_QUARTIC, EASE_IN_SINE, EASE_OUT, EASE_OUT_CUBIC, EASE_OUT_EXPO, EASE_OUT_QUARTIC, EASE_OUT_SINE, ESS, FADE_COLOR, FADE_IN, FADE_OUT, FADE_TO_TEXT, FIRST_PERSON, FUTURE_TENSE, ID, LERP, LINEAR, MAXENT_POS_TAGGER, MINIM, MOVE, MUTABLE, PAST_TENSE, PHONEME_BOUNDARY, PHONEMES, PLING_STEMMER, PLURAL, PORTER_STEMMER, POS, PRESENT_TENSE, SCALE_TO, SECOND_PERSON, SENTENCE_BOUNDARY, SINGULAR, SONIA, SPEECH_COMPLETED, STRESSES, SYLLABLE_BOUNDARY, SYLLABLES, TEXT, TEXT_ENTERED, THIRD_PERSON, TIMER, TIMER_COMPLETED, TIMER_TICK, TOKENS, UNKNOWN, WORD_BOUNDARY
 
Fields inherited from interface processing.core.PConstants
A, AB, ADD, AG, ALPHA, ALPHA_MASK, ALT, AMBIENT, AR, ARC, ARGB, ARROW, B, BACKSPACE, BASELINE, BEEN_LIT, BEVEL, BLEND, BLUE_MASK, BLUR, BOTTOM, BOX, BURN, CENTER, CENTER_DIAMETER, CENTER_RADIUS, CHATTER, CLOSE, CMYK, CODED, COMPLAINT, CONTROL, CORNER, CORNERS, CROSS, CUSTOM, DA, DARKEST, DB, DEG_TO_RAD, DELETE, DG, DIAMETER, DIFFERENCE, DILATE, DIRECTIONAL, DISABLE_ACCURATE_TEXTURES, DISABLE_DEPTH_SORT, DISABLE_DEPTH_TEST, DISABLE_OPENGL_2X_SMOOTH, DISABLE_OPENGL_ERROR_REPORT, DODGE, DOWN, DR, DXF, EB, EDGE, EG, ELLIPSE, ENABLE_ACCURATE_TEXTURES, ENABLE_DEPTH_SORT, ENABLE_DEPTH_TEST, ENABLE_NATIVE_FONTS, ENABLE_OPENGL_2X_SMOOTH, ENABLE_OPENGL_4X_SMOOTH, ENABLE_OPENGL_ERROR_REPORT, ENTER, EPSILON, ER, ERODE, ERROR_BACKGROUND_IMAGE_FORMAT, ERROR_BACKGROUND_IMAGE_SIZE, ERROR_PUSHMATRIX_OVERFLOW, ERROR_PUSHMATRIX_UNDERFLOW, ERROR_TEXTFONT_NULL_PFONT, ESC, EXCLUSION, G, GIF, GRAY, GREEN_MASK, HALF_PI, HAND, HARD_LIGHT, HINT_COUNT, HSB, IMAGE, INVERT, JAVA2D, JPEG, LEFT, LIGHTEST, LINE, LINES, LINUX, MACOSX, MAX_FLOAT, MAX_INT, MIN_FLOAT, MIN_INT, MITER, MODEL, MULTIPLY, NORMAL, NORMALIZED, NX, NY, NZ, OPAQUE, OPEN, OPENGL, ORTHOGRAPHIC, OTHER, OVERLAY, P2D, P3D, PATH, PDF, PERSPECTIVE, PI, platformNames, POINT, POINTS, POLYGON, POSTERIZE, PROBLEM, PROJECT, QUAD, QUAD_STRIP, QUADS, QUARTER_PI, R, RAD_TO_DEG, RADIUS, RECT, RED_MASK, REPLACE, RETURN, RGB, RIGHT, ROUND, SA, SB, SCREEN, SG, SHAPE, SHIFT, SHINE, SOFT_LIGHT, SPB, SPG, SPHERE, SPOT, SPR, SQUARE, SR, SUBTRACT, SW, TAB, TARGA, THIRD_PI, THRESHOLD, TIFF, TOP, TRIANGLE, TRIANGLE_FAN, TRIANGLE_STRIP, TRIANGLES, TWO_PI, TX, TY, TZ, U, UP, V, VERTEX_FIELD_COUNT, VW, VX, VY, VZ, WAIT, WHITESPACE, WINDOWS, X, Y, Z
 
Constructor Summary
RiGoogleSearch()
           
RiGoogleSearch(processing.core.PApplet pApplet)
           
RiGoogleSearch(java.lang.String googleCookie)
          Allows for a custom cookie String, generally of the format: "PREF=ID=ee8b4e3d4e15d9f5:TM=1219349742:LM=1219349742:S=MGXvStJPax5onGxv;"...
 
Method Summary
 float getBigram(java.lang.String word1, java.lang.String word2)
          Returns the bigram coherence for the word pair where coherence(w1, w2) = count(w1 + w2)/(count(w1) + count(w2)) [from Gervas]
 float getBigramAvg(java.util.List sentence)
          Returns the avg value of all bigram pairs in the sentence.
 float getBigramAvg(java.lang.String[] sentence)
          Returns the avg value of all bigram pairs in the sentence.
 float getBigramMin(java.util.List sentence)
          Returns the min value of all bigram pairs in the sentence.
 float getBigramMin(java.lang.String[] sentence)
          Returns the min value of all bigram pairs in the sentence.
 int getCallCount()
          Returns the number of live URL connections made by this object so far
 java.lang.String getCookie()
          Returns the cookie string used in the last sent query
 int getCount(java.lang.String query)
          Returns the number of hits via Google for the search query.
 float getTrigram(java.lang.String word1, java.lang.String word2, java.lang.String word3)
          Returns the trigram coherence for the word triple, where trigram-coherence(w1, w2, w3) = count(w1 + w2 + w3) / (getBigram(w1, w2) + getBigram(w2, w3))
 java.lang.String getUserAgent()
          Returns the current user-agent
 float getWeightedBigram(java.util.List sentence)
          Returns the weighted value of all bigram pairs in the sentence (see getWeightedBigram(java.lang.String[])).
 float getWeightedBigram(java.lang.String[] sentence)
          Returns the product of the avg value of all bigram pairs and the min bigram value in the sentence.
 float getWeightedUnigram(java.lang.String query)
          Returns the product of the hit count for the query and the number of words in it.
 float getWeightedUnigram(java.lang.String[] words)
          Returns the product of the hit count for the query and the number of words in it.
static boolean isCacheEnabled()
          Returns whether the cache is enabled
static void main(java.lang.String[] args)
           
static void setCacheEnabled(boolean enableCache)
          Sets whether the cache is enabled (default=true); when enabled, results for duplicate requests are returned immediately rather than re-contacting Google.
 void setCookie(java.lang.String googleCookie)
          Sets the cookie string for subsequent requests
 void setLocalCookiePath(java.lang.String path)
          Allows the google cookie to be automatically loaded from the file system.
 void setUserAgent(java.lang.String userAgent)
          Sets the user-agent for subsequent requests
 void useGoogleBooks(boolean b)
          If set to true, searches are restricted to Google Books.
 
Methods inherited from class rita.RiObject
dispose, getId, getPApplet, nextId
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DBUG

public static boolean DBUG

DBUG_FETCH

public static boolean DBUG_FETCH
Constructor Detail

RiGoogleSearch

public RiGoogleSearch()

RiGoogleSearch

public RiGoogleSearch(processing.core.PApplet pApplet)

RiGoogleSearch

public RiGoogleSearch(java.lang.String googleCookie)
Allows for a custom cookie String, generally of the format: "PREF=ID=ee8b4e3d4e15d9f5:TM=1219349742:LM=1219349742:S=MGXvStJPax5onGxv;"... Supplying a cookie can prevent some of the 403 errors returned for requests that carry no cookie.

Method Detail

setLocalCookiePath

public void setLocalCookiePath(java.lang.String path)
Allows the google cookie to be automatically loaded from the file system. As an example, the path for Safari is generally: '/Users/$userName/Library/Cookies/Cookies.plist'


getCallCount

public int getCallCount()
Returns the number of live URL connections made by this object so far

See Also:
setCacheEnabled(boolean)

getTrigram

public float getTrigram(java.lang.String word1,
                        java.lang.String word2,
                        java.lang.String word3)
Returns the trigram coherence for the word triple, where trigram-coherence(w1, w2, w3) = count(w1 + w2 + w3) / (getBigram(w1, w2) + getBigram(w2, w3))
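
The trigram formula can be sketched in isolation as follows. This is only an illustration under stated assumptions: the class TrigramCoherenceSketch, its method name, and the zero-denominator behavior are hypothetical and not part of RiTa, and the inputs are made-up values rather than live Google results.

```java
// Hypothetical helper illustrating the documented trigram formula; not part of RiTa.
final class TrigramCoherenceSketch {

    // trigram-coherence(w1, w2, w3) =
    //     count(w1 + w2 + w3) / (getBigram(w1, w2) + getBigram(w2, w3))
    static float trigram(long tripleCount, float bigram12, float bigram23) {
        float denom = bigram12 + bigram23;
        // Assumption: return 0 when both bigram values are 0, to avoid dividing by zero.
        if (denom == 0f) {
            return 0f;
        }
        return tripleCount / denom;
    }

    public static void main(String[] args) {
        // Made-up inputs, for illustration only.
        System.out.println(trigram(10, 0.25f, 0.25f));
    }
}
```

Note that, as documented, the numerator is a raw hit count while the denominator is a sum of bigram coherence values, so the result is not bounded by 1.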


getBigram

public float getBigram(java.lang.String word1,
                       java.lang.String word2)
Returns the bigram coherence for the word pair where coherence(w1, w2) = count(w1 + w2)/(count(w1) + count(w2)) [from Gervas]

Specified by:
getBigram in interface RiSearcherIF
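
As a rough illustration of the coherence formula above, the arithmetic can be written out on its own. The class BigramCoherenceSketch and its method are hypothetical (not part of RiTa), the counts are invented, and the zero-denominator handling is an assumption; the real method obtains its counts from Google.

```java
// Hypothetical helper illustrating the coherence formula; not part of RiTa.
final class BigramCoherenceSketch {

    // coherence(w1, w2) = count(w1 + w2) / (count(w1) + count(w2))
    static float coherence(long pairCount, long count1, long count2) {
        long denom = count1 + count2;
        // Assumption: return 0 when neither word has any hits, to avoid dividing by zero.
        if (denom == 0) {
            return 0f;
        }
        return (float) pairCount / denom;
    }

    public static void main(String[] args) {
        // Made-up hit counts, for illustration only: 50 / (400 + 600)
        System.out.println(coherence(50, 400, 600));
    }
}
```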

getWeightedUnigram

public float getWeightedUnigram(java.lang.String query)
Returns the product of the hit count for the query and the number of words in it.

Specified by:
getWeightedUnigram in interface RiSearcherIF

getWeightedUnigram

public float getWeightedUnigram(java.lang.String[] words)
Returns the product of the hit count for the query and the number of words in it.

Specified by:
getWeightedUnigram in interface RiSearcherIF
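
A minimal sketch of the weighting described above, assuming that words are delimited by whitespace (the documentation does not say how the library splits a query). The class WeightedUnigramSketch is hypothetical and takes the hit count as a parameter instead of querying Google.

```java
// Hypothetical helper illustrating count * number-of-words; not part of RiTa.
final class WeightedUnigramSketch {

    static float weightedUnigram(long count, String query) {
        String trimmed = query.trim();
        // Assumption: words are separated by runs of whitespace.
        int words = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        return (float) count * words;
    }

    public static void main(String[] args) {
        // Made-up hit count of 100 for a three-word query.
        System.out.println(weightedUnigram(100, "attained their momentum"));
    }
}
```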

getWeightedBigram

public float getWeightedBigram(java.lang.String[] sentence)
Returns the product of the avg value of all bigram pairs and the min bigram value in the sentence. Equivalent to (but more efficient than): getBigramAvg(s) * getBigramMin(s)

Specified by:
getWeightedBigram in interface RiSearcherIF
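
One way the efficiency gain could arise is by scoring each adjacent pair once and tracking the sum and the minimum in the same pass, instead of scoring every pair twice. The sketch below shows that idea only; WeightedBigramSketch and PairScorer are hypothetical names, the scorer stands in for getBigram, and the handling of sentences with fewer than two words is an assumption.

```java
// Hypothetical single-pass avg * min over adjacent word pairs; not part of RiTa.
final class WeightedBigramSketch {

    // Stand-in for a bigram scoring function such as getBigram(w1, w2).
    interface PairScorer {
        float score(String w1, String w2);
    }

    static float weightedBigram(String[] sentence, PairScorer scorer) {
        int pairs = sentence.length - 1;
        // Assumption: a sentence with fewer than two words has no bigrams and scores 0.
        if (pairs < 1) {
            return 0f;
        }
        float sum = 0f;
        float min = Float.MAX_VALUE;
        for (int i = 0; i < pairs; i++) {
            float s = scorer.score(sentence[i], sentence[i + 1]); // each pair scored once
            sum += s;
            min = Math.min(min, s);
        }
        return (sum / pairs) * min;
    }

    public static void main(String[] args) {
        // Toy scorer for illustration: half the length of the first word.
        String[] words = {"a", "bb", "ccc", "d"};
        System.out.println(weightedBigram(words, (w1, w2) -> w1.length() * 0.5f));
    }
}
```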

getBigramAvg

public float getBigramAvg(java.lang.String[] sentence)
Returns the avg value of all bigram pairs in the sentence.

Specified by:
getBigramAvg in interface RiSearcherIF

getBigramMin

public float getBigramMin(java.lang.String[] sentence)
Returns the min value of all bigram pairs in the sentence.

Specified by:
getBigramMin in interface RiSearcherIF

getWeightedBigram

public float getWeightedBigram(java.util.List sentence)
Returns the weighted value of all bigram pairs in the sentence (see getWeightedBigram(java.lang.String[])).


getBigramAvg

public float getBigramAvg(java.util.List sentence)
Returns the avg value of all bigram pairs in the sentence.


getBigramMin

public float getBigramMin(java.util.List sentence)
Returns the min value of all bigram pairs in the sentence.


getCount

public int getCount(java.lang.String query)
Returns the number of hits via Google for the search query. To obtain an exact match, place your query in quotes, e.g.
   int k = gp.getCount("\"attained their momentum\"");
 

Specified by:
getCount in interface RiSearcherIF
Parameters:
query - The string to be searched for.
Returns:
The number of hits Google returned for the search query.
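
To avoid hand-escaping the quotes for exact-match queries, a small helper can wrap a phrase before it is passed to getCount. The class ExactPhraseSketch and its method are hypothetical conveniences, not part of RiTa.

```java
// Hypothetical helper that wraps a phrase in double quotes for exact-match queries.
final class ExactPhraseSketch {

    static String exactPhrase(String phrase) {
        String trimmed = phrase.trim();
        // Leave the phrase alone if it is already wrapped in double quotes.
        if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
            return trimmed;
        }
        return "\"" + trimmed + "\"";
    }

    public static void main(String[] args) {
        System.out.println(exactPhrase("attained their momentum"));
    }
}
```

The result could then be passed along as gp.getCount(ExactPhraseSketch.exactPhrase("attained their momentum")).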

isCacheEnabled

public static boolean isCacheEnabled()
Returns whether the cache is enabled


setCacheEnabled

public static void setCacheEnabled(boolean enableCache)
Sets whether the cache is enabled (default=true); when enabled, results for duplicate requests are returned immediately rather than re-contacting Google.
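
The interaction between the cache and the call count can be pictured with a small memoizing wrapper. This is a sketch under assumptions, not the library's implementation: CountCacheSketch is a hypothetical class, the cache here is per-instance for simplicity (the real setting is static), and the fetcher function stands in for a live request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical memoizing wrapper; illustrates caching of duplicate requests only.
final class CountCacheSketch {
    private final Map<String, Integer> cache = new HashMap<>();
    private final ToIntFunction<String> fetcher; // stand-in for a live search request
    private boolean cacheEnabled = true;         // mirrors the documented default
    private int callCount = 0;                   // number of "live" fetches so far

    CountCacheSketch(ToIntFunction<String> fetcher) {
        this.fetcher = fetcher;
    }

    void setCacheEnabled(boolean enableCache) {
        cacheEnabled = enableCache;
    }

    int getCallCount() {
        return callCount;
    }

    int getCount(String query) {
        if (cacheEnabled) {
            Integer cached = cache.get(query);
            if (cached != null) {
                return cached; // duplicate request: answered without a live fetch
            }
        }
        callCount++;
        int result = fetcher.applyAsInt(query);
        if (cacheEnabled) {
            cache.put(query, result);
        }
        return result;
    }
}
```

With the cache enabled, repeating a query leaves the call count unchanged; with it disabled, every request counts as a live fetch.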


getUserAgent

public java.lang.String getUserAgent()
Returns the current user-agent

Specified by:
getUserAgent in interface RiSearcherIF

setUserAgent

public void setUserAgent(java.lang.String userAgent)
Sets the user-agent for subsequent requests

Specified by:
setUserAgent in interface RiSearcherIF

getCookie

public java.lang.String getCookie()
Returns the cookie string used in the last sent query


setCookie

public void setCookie(java.lang.String googleCookie)
Sets the cookie string for subsequent requests

Specified by:
setCookie in interface RiSearcherIF

useGoogleBooks

public void useGoogleBooks(boolean b)
If set to true, searches are restricted to Google Books.


main

public static void main(java.lang.String[] args)