Texts

Saturday 18 June 2011

00:21 fds dsf

Aho-Corasick. Searches a text for many words at once by building a finite-state automaton (a trie with failure links) from the set of words.
Bitap (or shift-or, shift-and, Baeza-Yates-Gonnet). String searching algorithm based on bitwise operations; extended to fuzzy (approximate) matching by Udi Manber and Sun Wu.
Boyer-Moore string search. Compares the pattern against the text from right to left and uses precomputed shift tables to skip stretches of text that cannot contain a match.
Knuth-Morris-Pratt. Builds a table from the pattern before searching, so that after a mismatch the search resumes without re-reading text characters.
Rabin-Karp string search. Compares a rolling hash of each text window with the hash of the pattern; well suited to searching for multiple patterns at once.
Longest common subsequence problem. Of two sequences. Hirschberg's algorithm solves it in linear space.
Longest increasing subsequence problem. Of a single sequence. It also reduces to finding the longest path in a directed acyclic graph.
Shortest common supersequence. Of two sequences.
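Horspool. Simplification of the Boyer-Moore algorithm that keeps only the bad-character shift table. O(mn) in the worst case, for a pattern of length m and a text of length n.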
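
To make the "table" idea behind Knuth-Morris-Pratt concrete, here is a minimal Python sketch. The function names `prefix_table` and `kmp_search` are illustrative choices, not taken from any library, and the code favours readability over speed.

```python
def prefix_table(pattern):
    """table[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    table = [0] * len(pattern)
    k = 0  # length of the prefix matched so far
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter candidate prefix.
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    table = prefix_table(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        # The table tells us how much of the match survives a mismatch,
        # so the text index i never moves backwards.
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]  # keep going to find overlapping matches
    return matches
```

For example, `kmp_search("abababa", "aba")` finds the overlapping matches starting at positions 0, 2 and 4.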

Levenshtein distance (or edit distance). Minimum number of single-character operations (insertion, deletion, substitution) needed to transform one string into the other.
Soundex. Phonetic algorithm for indexing words by their sound (in English).
Metaphone. Indexing words by their sound (in English).
NYSIIS (New York State Identification and Intelligence System). Phonetic algorithm that improves on Soundex.
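
As an illustration of the Levenshtein distance, the following Python sketch uses the usual dynamic-programming approach, keeping only two rows of the table at a time. The function name `levenshtein` is an illustrative choice, and all three operations are assumed to cost 1.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b, with unit-cost
    insertion, deletion and substitution."""
    # previous[j] = distance between the first i - 1 characters of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]
```

For example, `levenshtein("kitten", "sitting")` is 3: two substitutions and one insertion.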

Latent Dirichlet Allocation (LDA). Generative probabilistic model that describes each document as a mixture of topics and each topic as a distribution over words. Used by search engines.
Latent Semantic Indexing (LSI). Associates texts with topics automatically by applying singular value decomposition to a term-document matrix, which groups words that commonly occur in the same contexts.
Stemming. A method of reducing words to their stem, or root form.
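
To give a feel for what stemming does, here is a deliberately naive suffix-stripping sketch in Python. It is not the Porter stemmer or any other published algorithm; the suffix list and the minimum stem length of 3 are arbitrary assumptions made for illustration.

```python
# Hypothetical suffix list, checked in this order; real stemmers use
# far richer rule sets.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip one common English suffix, if enough of the word remains."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Require at least 3 remaining characters so that short words
        # such as "sing" or "is" are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
```

With these rules, "searching", "searched" and "searches" all reduce to "search", which is the kind of grouping a search index relies on.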
