ACL RD-TEC 1.0 Summarization of H05-1046
Paper Title:
DISAMBIGUATING TOPONYMS IN NEWS
Authors: Eric Garbin and Inderjeet Mani
Primarily assigned technology terms:
- automatic content extraction
- bootstrapping
- classification
- classifier
- computational linguistics
- cross-validation
- crossvalidation
- database
- disambiguation
- geographic information systems
- harvesting
- human language
- human language technology
- information systems
- internet
- language processing
- language technology
- learner
- learning
- learning approach
- learning approaches
- lexical lookup
- location normalization
- machine learner
- machine learning
- machine learning approach
- machine learning approaches
- matching
- name disambiguation
- natural language processing
- normalization
- pattern-matching
- processing
- reasoning
- spatial reasoning
- supervised machine learning
- tagger
- tagging
- ten-fold cross-validation
- unsupervised approach
- weka
- word-sense disambiguation
Other assigned terms:
- abbreviation
- abbreviations
- adjunct
- ambiguity
- annotated corpora
- annotated corpus
- annotation
- approach
- association for computational linguistics
- case
- class information
- cluster
- confusion matrix
- corpora
- data model
- data sparseness
- disambiguation task
- discourse
- discourse topic
- distribution
- document
- f-measure
- feature
- feature sets
- feature vector
- feature vectors
- gazetteer
- geographic information
- grounding
- heuristic
- heuristics
- information gain
- knowledge
- likelihood
- linguistics
- mapping
- measure
- method
- mutual information
- names
- natural language
- natural language texts
- news corpus
- noise
- person names
- personal pronouns
- pointwise mutual information
- precision
- prepositions
- pronouns
- semantic
- semantic similarity
- set size
- statistical approach
- tagged corpus
- technology
- term
- terms
- test data
- test set
- text
- tokens
- training
- training data
- training set
- training set size
- window size
- word
- word window
- words