ACL RD-TEC 1.0 Summarization of W95-0109
Paper Title:
AUTOMATIC CONSTRUCTION OF A CHINESE ELECTRONIC DICTIONARY
Authors: Jing-Shin Chang and Yi-Chung Lin and Keh-Yih Su
Primarily assigned technology terms:
- algorithm
- automatic construction
- automatic segmentation
- automatic training
- chinese electronic dictionary construction system
- classification
- classification method
- classifier
- dictionary construction
- dictionary construction system
- electronic dictionary
- electronic dictionary construction system
- em algorithm
- error analysis
- extraction system
- identification
- learning
- learning approach
- lexical tagging
- lexicon acquisition
- measuring
- optimization
- part-of-speech tagging
- part-of-speech tagging system
- pos extraction
- pos tagging
- probabilistic segmentation
- processing
- ratio test
- reestimation
- scoring
- scoring function
- segmentation
- segmentation and pos tagging
- supervised learning
- supervised learning approach
- tagging
- tagging process
- tagging system
- training procedure
- training process
- two-class classification
- two-class classifier postfiltering
- unsupervised approach
- unsupervised learning
- unsupervised reestimation
- viterbi
- viterbi reestimation
- viterbi training
- vtw reestimation
- weighting
- word extraction
- word extraction system
- word identification
- word segmentation
- word segmentation and pos tagging
Other assigned terms:
- acquisition task
- approach
- bigram
- break
- case
- characters
- chinese characters
- chinese text
- chinese text corpus
- chinese words
- chunks
- classification model
- classification task
- compounds
- contextual information
- corpus size
- decision rule
- dictionaries
- dictionary
- dictionary entry
- distribution
- entropy
- estimation
- fact
- feature
- feature vector
- generation
- human intervention
- identification module
- information measure
- joint probability
- language model
- large corpus
- learning strategy
- lexicographer
- lexicon
- lexicon entries
- lexicon entry
- likelihood
- log-likelihood
- log-likelihood ratio
- measure
- measures
- method
- mutual information
- n-gram
- n-grams
- names
- part of speech
- part-of-speech
- parts of speech
- performance evaluation
- pos tag
- precision
- probabilistic approach
- probabilities
- probability
- procedure
- process
- proper names
- punctuation
- seed
- segmentation pattern
- sentence
- sentences
- small tagged seed corpus
- speech information
- speech tag
- statistics
- substring
- system performance
- tagged corpus
- tagged text
- tagging accuracy
- tags
- tagset
- technique
- terms
- text
- text corpus
- topology
- training
- transition probabilities
- trigram
- untagged corpus
- untagged text
- word
- word association
- word candidate
- word dictionary
- word lists
- word model
- words