ACL RD-TEC 1.0 Summarization of W04-3208
Paper Title:
MINING VERY-NON-PARALLEL CORPORA: PARALLEL SENTENCE AND LEXICON EXTRACTION VIA BOOTSTRAPPING AND E
Authors: Pascale Fung and Percy Cheung
Primarily assigned technology terms:
- algorithm
- alignment learning
- bilingual lexicon extraction
- boosting
- bootstrapping
- bootstrapping method
- broadcasting
- classifier
- document matching
- em learning
- iterative bootstrapping
- learner
- learning
- learning methods
- lexical learning
- lexical matching
- lexicon extraction
- lexicon learning
- matching
- mining
- multilevel bootstrapping
- pair extraction
- parallel sentence extraction
- paraphrasing
- preprocessing
- sentence extraction
- transcription
- unsupervised method
- word alignment
Other assigned terms:
- approach
- bilingual corpora
- bilingual lexicon
- bilingual sentence
- case
- comparable corpora
- comparable document
- convergence
- corpora
- cosine similarity
- document
- document set
- english sentence
- estimation
- experimental results
- fact
- ibm model
- language pairs
- lexical information
- lexicon
- measures
- method
- model parameters
- monolingual corpora
- parallel corpus
- parallel sentence
- paraphrase
- paraphrases
- phrase
- precision
- process
- sentence
- sentence pair
- sentence similarity
- sentences
- similarity measures
- similarity scores
- tdt corpus
- translation candidates
- translations
- word
- words