ACL RD-TEC 1.0 Summarization of W03-0416
Paper Title:
AN EFFICIENT CLUSTERING ALGORITHM FOR CLASS-BASED LANGUAGE MODELS
Authors: Takuya Matsuzaki and Yusuke Miyao and Jun'ichi Tsujii
Primarily assigned technology terms:
- agglomerative method
- algorithm
- bottom-up clustering
- class-based language modeling
- classification
- classification method
- clustering
- clustering algorithm
- cross validation
- dependency analysis
- disambiguation
- exhaustive search
- greedy algorithm
- hard clustering
- iterative clustering
- iterative-classification
- japanese dependency analysis
- k-means
- k-means style exchange
- language modeling
- language processing
- learning
- learning algorithm
- learning process
- likelihood estimation
- maximum likelihood
- maximum likelihood estimation
- model merging
- model selection
- modeling
- natural language processing
- nlp
- optimization
- optimization algorithm
- optimization method
- processing
- search
- sense disambiguation
- smoothing
- smoothing techniques
- soft clustering
- structural disambiguation
- top-down approach
- top-down clustering
- validation
- word sense disambiguation
Other assigned terms:
- 10-fold cross validation
- adjective
- approach
- array
- bayesian model
- bunsetsu
- bunsetsus
- case
- case frame
- class-based model
- clustering model
- clusters
- co-occurrence
- compound noun
- content words
- corpora
- data sets
- dependency relation
- dependency relations
- dependency structure
- disambiguation task
- estimation
- evaluation experiment
- evaluation task
- frame
- function words
- heuristic
- intention
- interpretation
- japanese corpus
- japanese dependency
- joint probability
- joint probability model
- knowledge
- language model
- language models
- language processing tasks
- large corpora
- lexical information
- lexical knowledge
- likelihood
- linguistic
- linguistic phenomena
- log-likelihood
- mdl principle
- measure
- method
- model-theoretic algorithm
- natural language
- natural language processing tasks
- nlp tasks
- nouns
- optimization problem
- part-of-speech
- precision
- probabilities
- probability
- probability model
- procedure
- process
- processing tasks
- relation
- sentence
- sparse data
- statistical language model
- statistical model
- structure of a sentence
- style
- term
- termination condition
- terms
- test data
- time complexity
- training
- training data
- training samples
- training set
- verb
- vocabulary
- word
- word classes
- word sense
- words