ACL RD-TEC 1.0 Summarization of P04-3005
Paper Title:
CUSTOMIZING PARALLEL CORPORA AT THE DOCUMENT LEVEL
Authors: Monica Rogati and Yiming Yang
Primarily assigned technology terms:
- algorithm
- alignment algorithm
- approximation
- clir method
- corpus-based approach
- cross-lingual information retrieval
- cross-lingual retrieval
- database
- document selection
- information retrieval
- machine translation
- matching
- measuring
- mining
- parameter tuning
- query expansion
- query translation
- ranking
- retrieving
- statistical machine translation
- thresholding
- tuning
- weighting
Other assigned terms:
- approach
- case
- cluster
- corpora
- document
- document length
- domain corpus
- evaluation data
- evaluations
- genre
- ibm model
- mean average precision
- measure
- measures
- method
- mutual information
- noise
- opinions
- oracle
- parallel corpora
- parallel corpus
- parameter values
- pointwise mutual information
- precision
- probabilities
- process
- queries
- query
- query vector
- sentence
- sentences
- similarity metrics
- similarity score
- source language
- target language
- terms
- test collection
- test corpus
- test set
- training
- training corpora
- training data
- training documents
- training set
- translation model
- translation probabilities
- translation quality
- words