ACL RD-TEC 1.0 Summarization of P96-1041
Paper Title:
AN EMPIRICAL STUDY OF SMOOTHING TECHNIQUES FOR LANGUAGE MODELING
Authors: Stanley F. Chen and Joshua Goodman
Primarily assigned technology terms:
- algorithm
- bigram training
- c++
- frequency estimation
- grouping
- interpolation technique
- jelinek-mercer smoothing
- language modeling
- likelihood estimate
- linear interpolation
- maximum likelihood
- modeling
- n-gram language modeling
- optimization
- parameter optimization
- parameter search
- parameter selection
- parameter setting
- parsing
- part-of-speech tagging
- partitioning
- plus-one smoothing
- recognition
- search
- search algorithm
- smoothing
- smoothing method
- smoothing technique
- smoothing techniques
- speech recognition
- stochastic parsing
- tagging
Other assigned terms:
- acoustic signal
- bayesian framework
- bigram
- bigram model
- brown corpus
- case
- chunks
- concept
- corpora
- data consortium
- data sets
- distribution
- entropy
- estimation
- good-turing estimation
- implementation
- interpolation
- knowledge
- language model
- large corpus
- large training
- likelihood
- linguistic
- linguistic data
- linguistic data consortium
- maximum likelihood estimate
- measure
- method
- methodology
- n-gram
- n-gram models
- n-grams
- parameter settings
- parameter values
- part-of-speech
- performance evaluation
- perplexity
- phrase
- phrase attachment
- prepositional phrase
- prepositional phrase attachment
- probabilities
- probability
- recursion
- segments
- sentence
- sentences
- set size
- signal
- standard deviation
- technique
- term
- terms
- test data
- text
- tokens
- training
- training data
- training set
- training set size
- treebank
- trigram
- trigram model
- uniform distribution
- unigram
- unigram model
- vocabulary
- word
- words