ACL RD-TEC 1.0 Summarization of W02-0902
Paper Title:
LEARNING A TRANSLATION LEXICON FROM MONOLINGUAL CORPORA
Authors: Philipp Koehn and Kevin Knight
Primarily assigned technology terms:
- algorithm
- automatic construction
- bootstrap
- computing
- exhaustive search
- greedy search
- information retrieval
- internet
- learning
- levenshtein
- machine translation
- machine translation system
- machine translation systems
- matching
- parallelization
- search
- spelling
- tf/idf
- translation system
- translation systems
- vector comparison
Other assigned terms:
- aligned parallel corpus
- approach
- benchmark
- bilingual lexicon
- case
- co-occurrence
- comparable corpora
- concepts
- context information
- context similarity
- context vector
- context vectors
- context window
- context words
- corpora
- correlation
- dictionary
- edit distance
- experimental results
- fact
- foreign language
- german corpus
- implementation
- levenshtein distance
- lexical entries
- lexicon
- lexicon entries
- likelihood
- linear combination
- local context
- mapping
- mappings
- measure
- method
- monolingual corpora
- nouns
- pairs of words
- parallel corpus
- parallel text
- part of speech
- process
- rank order
- seed
- seed words
- sentence
- similarity matrix
- similarity measure
- similarity scores
- statistics
- string edit distance
- string similarity
- target language
- target word
- terms
- test corpus
- text
- tokens
- transformation
- transformation rules
- translation accuracy
- translation lexicon
- translations
- vowel
- word
- word frequencies
- word frequency
- word pair
- word similarity
- word window
- words
- wsj corpus