
Context Free Grammars (CFGs)

So what is a 'context-free' grammar, or CFG? In essence, it's simply a set of rules (or 'productions') that tell us how a 'sentence' (or word, phrase, paragraph, poem, novel, etc.) can be formed. Given a sentence, a CFG can also tell us whether it is grammatical (that is, syntactically correct) or not. Most important for our purposes, if we have a valid CFG (we'll discuss what this means in a moment), we can use it to generate new 'sentences' according to its rules. Here's a simple context-free grammar for a small fragment of English:

s -> np vp
np -> det n
vp -> v
vp -> v np
det -> 'a'
det -> 'the'
n -> 'woman'
n -> 'man'
v -> 'shoots'

(It's important to note that while we can represent portions of natural languages (like English) using CFGs, such languages are generally too complex to capture fully this way. These 'natural' languages are often called 'context-sensitive'.)

So, what are the ingredients of the little grammar above? Well, first note that it contains three types of symbols:

  • First, there's the -> symbol, which is used to define the rules.
  • Next there are symbols like these: s, np, vp, det, n, v. These symbols are called non-terminal symbols; we'll soon learn why. Here, s is short for sentence, np is short for noun phrase, vp is short for verb phrase, and det is short for determiner. That is, each of these symbols is shorthand for a grammatical category, though they can stand for anything you want.
  • Finally there are the symbols in quotations: 'a', 'the', 'woman', 'man', and 'shoots'. A computer scientist might call these terminal symbols, and linguists might call them 'lexical items'; we'll generally just call these words or strings.

Now, the grammar above contains nine rules. A context-free rule (by definition) consists of a single non-terminal symbol, followed by ->, followed by a finite sequence made up of terminal (strings) and/or non-terminal symbols. All nine items listed above have this form, so they are all legitimate context-free rules. What do these rules mean? They tell us how different grammatical categories can be built up.

Read -> as 'can consist of', or 'can be built out of'. For example, the first rule tells us that a sentence (s) can consist of a noun phrase (np) followed by a verb phrase (vp). The third rule tells us that a verb phrase (vp) can consist simply of a verb (v), while the fourth rule tells us that there is another way to build a verb phrase (vp): a verb (v) followed by a noun phrase (np). The last five rules tell us that 'a' and 'the' are determiners, that 'woman' and 'man' are nouns, and that 'shoots' is a verb.

Here is the same grammar (with one change):

s -> np vp
np -> det n
vp -> v | v np
det -> 'a'
det -> 'the'
n -> 'woman'
n -> 'man'
v -> 'shoots'

Note that where before we had two rules that started with 'vp' on the left-hand side, we now have one. On the right-hand side of this one 'vp' rule, we find: v | v np, which simply means a verb (v) OR (shown by a '|' character) a verb (v) followed by a noun phrase (np). So we combined two rules into one using the OR operator. This grammar is more compact (8 rules vs. 9 above), but otherwise fully equivalent to the one above. Notice we can also do the same for the 'det' and 'n' rules:

s -> np vp
np -> det n
vp -> v | v np
det -> 'a' | 'the'
n -> 'woman' | 'man'
v -> 'shoots'
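To make the '|' notation concrete, here is a minimal sketch in Python of how such compact rules might be parsed into alternatives. The rule strings below mirror the grammar above, but this format is only an illustration; pText's actual file format may differ.

```python
# Parse compact CFG rules, splitting '|' into separate alternatives.
# Each alternative becomes a list of symbols (terminals keep their quotes).

def parse_rules(lines):
    grammar = {}
    for line in lines:
        lhs, rhs = line.split("->")
        alternatives = [alt.split() for alt in rhs.split("|")]
        grammar[lhs.strip()] = alternatives
    return grammar

rules = [
    "s -> np vp",
    "np -> det n",
    "vp -> v | v np",
    "det -> 'a' | 'the'",
    "n -> 'woman' | 'man'",
    "v -> 'shoots'",
]

grammar = parse_rules(rules)
print(grammar["vp"])   # [['v'], ['v', 'np']]
```

Note that the single 'vp' entry now holds two alternatives, exactly as the two separate rules did before.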

So, how can we use such a grammar to generate interesting text?

Well, we simply start from the top (or start symbol) and follow the rules, making substitutions as we go. Any time we come to an OR symbol, we randomly pick one of the choices (here I'll just take the first choice each time). So, for example, we might proceed as follows, starting from the symbol s:

                          s

                       np   vp 

                    det   n      v 

                  'a' 'woman' 'shoots'

Of course, we can repeat this process as many times as we want, making other 'random choices' and getting other variations.
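The substitution process above can be sketched in a few lines of Python. This is not pText's implementation, just a toy generator for the same grammar: any symbol with no rule of its own is treated as a word, and OR choices are made with a random pick.

```python
import random

# The toy grammar above, as a dict: each non-terminal maps to a
# list of alternatives, each alternative a list of symbols.
grammar = {
    "s":   [["np", "vp"]],
    "np":  [["det", "n"]],
    "vp":  [["v"], ["v", "np"]],
    "det": [["a"], ["the"]],
    "n":   [["woman"], ["man"]],
    "v":   [["shoots"]],
}

def generate(symbol):
    if symbol not in grammar:        # terminal: just a word
        return [symbol]
    expansion = random.choice(grammar[symbol])   # pick one alternative
    words = []
    for sym in expansion:
        words.extend(generate(sym))
    return words

print(" ".join(generate("s")))   # e.g. "a woman shoots the man"
```

Each call to generate("s") walks the tree top-down, so repeated calls yield the different variations described above.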

Now let's look at some examples in Processing using pText.

Ok, so how would we translate the grammar above into the pText/PGrammar format? Give it a try...

example grammar in pText format


Other Types of Grammars

When Noam Chomsky first formalized generative grammars in 1956, he classified them into types now known as the Chomsky hierarchy. The difference between these types is that they have increasingly strict production rules and can express fewer formal languages. Some important grammar types include 'context-sensitive' grammars, 'context-free' grammars (see above), and 'regular' grammars. The languages that can be described with such grammars are called context-sensitive, context-free, and regular languages, respectively. The latter two are the types of grammars most often used in natural language processing because, as opposed to the context-sensitive grammars needed for natural languages like English or Chinese, they can be efficiently implemented in software (see pText's PGrammar object).

Context-free vs. Context-sensitive

Context-free grammars are those in which the left-hand side of a production rule is a single non-terminal symbol. The following are context-free grammar rules:

    * (A → B); which means replace A with B
    * (A → B); (B → AB); which means replace A with B, then replace B with AB

Context-sensitive grammars are those in which production rules can depend on their neighbors. The following are context-sensitive rules:

    * (A <B> A → A); which means that if B is surrounded by two A's replace B with A
    * (A <B> B → B); which means that if B is preceded by A and followed by B leave B alone
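To see how such a rule might be applied, here is a small Python sketch. The reading of the '<...>' notation as left-context/target/right-context is my assumption based on the examples above; the function applies one rule everywhere it matches, checking the original neighbors.

```python
# Apply one context-sensitive rule of the form
#   left <target> right -> replacement
# over a list of symbols. Matches are checked against the
# original list, so all replacements happen "simultaneously".

def apply_rule(symbols, left, target, right, replacement):
    out = list(symbols)
    for i in range(1, len(symbols) - 1):
        if (symbols[i] == target
                and symbols[i - 1] == left
                and symbols[i + 1] == right):
            out[i] = replacement
    return out

# (A <B> A -> A): a B surrounded by two A's becomes an A
print(apply_rule(["A", "B", "A"], "A", "B", "A", "A"))  # ['A', 'A', 'A']
```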

Deterministic vs. Probabilistic

Deterministic grammars have one production rule for each symbol. Probabilistic (or stochastic) grammars may contain several production rules for a symbol, each of which may be selected according to a probability value. For instance, the rules

    * (A → B) with P = 0.75 or (A → AB) with P = 0.25

mean that each time A is obtained during the iterative process, a random choice must be made as to whether to replace A with B or with AB, based on the probability P, so that the first rule will be selected three times as frequently (on average) as the second.
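A weighted choice like this can be sketched in Python as follows; this is one common way to implement it (drawing a uniform random number and walking the cumulative probabilities), not necessarily how any particular grammar library does it.

```python
import random

# Alternatives for symbol A with their probabilities:
# (A -> B) with P = 0.75, (A -> AB) with P = 0.25
rules = [("B", 0.75), ("AB", 0.25)]

def choose(rules):
    r = random.random()          # uniform in [0, 1)
    total = 0.0
    for rhs, p in rules:
        total += p               # walk cumulative probability
        if r < total:
            return rhs
    return rules[-1][0]          # guard against float rounding

counts = {"B": 0, "AB": 0}
for _ in range(10000):
    counts[choose(rules)] += 1
# counts["B"] should be roughly three times counts["AB"]
```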


Instructor: Daniel C. Howe

E-Writing Resources

Page last modified on March 06, 2007, at 01:56 PM EST