ACL RD-TEC 1.0 Summarization of W03-1508
Paper Title:
TRANSLITERATION OF PROPER NAMES IN CROSS-LINGUAL INFORMATION RETRIEVAL
Authors: Paola Virga and Sanjeev Khudanpur
Primarily assigned technology terms:
- algorithm
- automatic speech recognition
- automatic speech synthesis
- automatic system
- character encoding
- chinese-english translation
- combing unstructured text
- cross-lingual information retrieval
- cross-lingual retrieval
- data selection
- decoder
- decoding
- detection and tracking
- direct translation
- document retrieval
- document retrieval system
- editing
- encoding
- entity transliteration
- extrinsic evaluation
- finite state
- finite state transducers
- giza
- good-turing smoothing
- hidden markov
- hidden markov models
- indexing
- information retrieval
- information retrieval system
- intrinsic evaluation
- language model training
- learning
- learning algorithm
- linguistic processing
- machine translation
- model training
- mt system
- name transliteration
- named entity transliteration
- orthographic representation
- processing
- reading
- recognition
- recognition system
- reporting
- retrieval system
- retrieving
- smoothing
- speech processing
- speech recognition
- speech recognition system
- speech synthesis
- speech synthesis system
- speech-inspired translation
- spelling
- spoken document retrieval
- statistical machine translation
- statistical method
- statistical mt
- statistical translation
- syllabification
- synthesis
- synthesis system
- terminology
- text processing
- text retrieval
- text-to-speech
- text-to-speech system
- topic detection
- topic detection and tracking
- transcription
- transducers
- transformation-based learning
- translation model training
- translation system
- translingual retrieval
- transliteration
- transliteration process
- transliterator
Other assigned terms:
- approach
- arabic text
- back-transliteration
- break
- case
- character error rate
- character sequence
- characters
- chinese characters
- chinese text
- clusters
- document
- document collection
- document frequency
- document model
- domain-specific terminology
- edit distance
- english text
- english translation
- english translations
- error rate
- fact
- foreign language
- function words
- ibm model
- index
- intrinsic testing
- knowledge
- language model
- language models
- large corpus
- lexicon
- likelihood
- linguistic
- linguistic knowledge
- mapping
- maps
- markov models
- mean average precision
- measure
- method
- multi-lingual text
- n-grams
- name transliteration procedure
- named entity
- named entity corpus
- named-entity
- names
- noisy channel
- orthography
- parallel corpus
- phoneme
- phoneme sequence
- phonemes
- phonemic representation
- precision
- probability
- procedure
- process
- pronunciation
- proper names
- queries
- query
- recipe
- retrieval performance
- retrieval task
- seed
- sentence
- sentences
- source-channel model
- specialized terminology
- speech synthesis literature
- statistical approach
- statistical language model
- statistical model
- statistical significance
- statistical translation model
- statistics
- symbol
- symbols
- system description
- target language
- tdt corpus
- technical terms
- technique
- term
- terms
- test set
- text
- tokens
- tone
- toolkit
- topics
- training
- training corpus
- training data
- training set
- transformation
- transformation rules
- translation lexicon
- translation model
- translation models
- translation output
- translations
- transliteration performance
- transliteration procedure
- trigram
- trigram language model
- trigram model
- vocabulary
- word
- words