Amazon

Saturday 18 June 2011

Parsing

  • CYK (Cocke-Younger-Kasami). An efficient O(n3) algorithm for parsing any CNF context-free grammar.
  • Earley's algorithm. A chart parser, O(n3) algorithm for parsing any context-free grammar.
  • Inside-outside. An O(n3) algorithm for re-estimating production probabilities in probabilistic context-free grammars.

LL Parsers

Parse a LL context-free grammar top-down from left to right.
Such as ANTLR that is LL(*).

LR Parsers

Bottom-up parsers for context-free grammars.
  • Dijkstra's shunting yard algorithm is commonly used to implement operator precedence parsers which convert from infix notation to Reverse Polish notation (RPN).
  • LALR (Look-ahead LR). With a one-token look-ahead. Yacc/Bison use LALR(1)
  • SLR (Simple LR) parser. An LR(0) modified to prevent shift-reduce and reduce-reduce conflits. Remains inferior to LR(1).
  • Canonical LR parser or LR(1) parser. Has a look-ahead of one token.
  • GLR. (Generalyzed LR parser) by Masaru Tomita. An extension of an LR to handle nondeterministic or ambiguous grammars. It is efficient to parse natural language.
  • Recursive Descent Parsers

    Top-down parsers built from a set of mutually-recursive procedures that represent the production rules of the grammar.
  • Packrat parser. A linear time parsing algorithm supporting context-free LL(k) grammars. Use backup and memoization (remembering its choices) to avoid non-termination.

Prediction (statistics)

  • Baum-Welch. Finds the unknown parameters of a Hidden Markov Model (HMM). It makes use of the forward-backward algorithm.
  • Viterbi. Calculates the Viterbi path, a sequence of states that is most luckily to appear in a sequence of event.