ACL RD-TEC 1.0 Summarization of P01-1068
Paper Title:
MULTI-CLASS COMPOSITE N-GRAM LANGUAGE MODEL FOR SPOKEN LANGUAGE PROCESSING USING MULTIPLE WORD CLUSTERS
Authors: Hirofumi Yamamoto and Shuntaro Isogai and Yoshinori Sagisaka
Primarily assigned technology terms:
- approximation
- automatic extraction
- back-off smoothing
- class assignment
- classification
- clustering
- continuous speech recognition
- database
- error reduction
- grouping
- language processing
- maximum entropy
- maximum entropy method
- processing
- recognition
- smoothing
- speech recognition
- spoken language processing
- statistical analysis
- word classification
- word clustering
- word prediction
- word-clustering
Other assigned terms:
- back-off parameter
- case
- cluster
- clusters
- community
- conditional word
- continuous speech
- data set
- data sparseness
- data sparseness problem
- entropy
- error rate
- estimation
- euclidean distance
- evaluation data
- fact
- french
- knowledge
- language database
- language model
- language models
- lexicon
- measure
- method
- model size
- n-gram
- n-gram language model
- n-grams
- part-of-speech
- parts-of-speech
- perplexity
- pos information
- a priori
- probabilities
- probability
- process
- sparse data
- sparseness problem
- spoken language
- statistical language model
- statistical models
- target word
- term
- text
- text corpus
- training
- training data
- training set
- transition probabilities
- transition probability
- vocabulary
- vocabulary size
- word
- word classes
- word connectivity
- word error rate
- word pair
- word sequence
- word sequences
- words