ACL RD-TEC 1.0 Summarization of W02-0603
Paper Title:
UNSUPERVISED DISCOVERY OF MORPHEMES
Authors: Mathias Creutz and Krista Lagus
Primarily assigned technology terms:
- algorithm
- analyzer
- batch learning
- boundary prediction
- coding
- computing
- data compression
- dynamic programming
- em algorithm
- evaluation procedure
- expectation-maximization
- finite-state transducer
- forward-backward algorithm
- greedy search
- greedy search algorithm
- induction
- language modeling
- learning
- learning algorithm
- many-to-one mapping
- matching
- maximum likelihood
- maximum-likelihood
- model optimization
- modeling
- morphological analysis
- morphology
- morphology discovery
- on-line learning
- one-to-one mapping
- online learning
- online search
- optimization
- parser
- processing
- reading
- recognition
- recursive mdl
- recursive segmentation
- restrictive segmentation
- retrieving
- search
- search algorithm
- searching
- segmentation
- segmentation algorithm
- segmentation method
- speech recognition
- splitting
- statistical language modeling
- string matching
- text segmentation
- transducer
- two-level morphology
- unsupervised learning
- unsupervised segmentation
- viterbi
- viterbi algorithm
- viterbi alignment
- word discovery
Other assigned terms:
- adjective
- adverb
- affix
- affixes
- alignment procedure
- allomorphy
- approach
- binary tree
- case
- characters
- chunk
- chunks
- codebook
- complex word
- compound words
- computational phonology
- conditional probability
- corpora
- correlation
- data set
- data sets
- data structure
- distance measure
- distribution
- english corpus
- entropy
- evaluation measures
- evaluation method
- fact
- hierarchical structure
- inflectional morphology
- information theory
- leaf
- lexicon
- likelihood
- linguistic
- linguistic measure
- linguistic theory
- mapping
- mappings
- meanings
- measure
- measures
- memory consumption
- method
- minimum description length
- model complexity
- model structure
- morph
- morpheme
- morpheme boundary
- morphemes
- morphological structure
- natural language
- natural language text
- nouns
- part-of-speech
- part-of-speech tags
- poisson distribution
- precision
- probabilistic model
- probabilities
- probability
- probability estimates
- procedure
- punctuation
- random order
- running time
- segments
- semantic
- semantic content
- semantic representation
- sentences
- source text
- sparse data
- stem
- stems
- suffix
- suffixes
- tag set
- tags
- term
- terms
- test data
- test set
- text
- theory
- tokens
- training
- training data
- training set
- tree
- tree structure
- untagged corpora
- utterance
- verb
- vocabulary
- word
- word boundaries
- word boundary
- word form
- word type
- word types
- words