ACL RD-TEC 1.0 Summarization of N06-1062
Paper Title:
UNLIMITED VOCABULARY SPEECH RECOGNITION FOR AGGLUTINATIVE LANGUAGES
Authors: Mikko Kurimo, Antti Puurula, Ebru Arisoy, Vesa Siivola, Teemu Hirsimäki, Janne Pylkkönen, Tanel Alumäe, and Murat Saraclar
Primarily assigned technology terms:
- algorithm
- analyzer
- clustering
- coding
- continuous speech recognition
- database
- decision-tree
- decoder
- decoding
- hidden markov
- hidden markov models
- hmms
- information processing
- information retrieval
- information technology
- interpolated kneser-ney smoothing
- kneser-ney smoothing
- language model pruning
- language model training
- language modeling
- language modeling approach
- large vocabulary continuous speech recognition
- large vocabulary speech recognizer
- learning
- learning algorithm
- learning method
- learning methods
- lvcsr
- machine learning
- maximum likelihood
- mean subtraction
- model pruning
- model training
- modeling
- morpheme discovery
- morphological analyzer
- morphological analyzers
- n-gram estimation
- n-gram modeling
- n-gram training
- network construction
- optimization
- processing
- pruning
- recognition
- recognition systems
- recognizer
- sampling
- search
- smoothing
- speech and information retrieval
- speech recognition
- speech recognition systems
- speech recognizer
- splitting
- training algorithm
- unsupervised learning
- viterbi
- viterbi search
- word splitting
Other assigned terms:
- acoustic model
- acoustic models
- agglutinative language
- ambiguity
- approach
- break
- broadcast news
- case
- community
- compounding
- continuous speech
- conversational speech
- corpora
- corpus size
- data sparsity
- duration
- entropy
- error rate
- estimation
- fact
- feature
- foreign words
- free word order
- knowledge
- language model
- language models
- language portability
- large vocabulary speech
- lattice
- lattices
- lexica
- lexicon
- likelihood
- markov models
- measure
- memory consumption
- method
- minimum description length
- model complexity
- model size
- morph
- morpheme
- morphemes
- morphological rules
- n-gram
- n-gram language model
- n-gram model
- n-gram models
- n-grams
- names
- nist
- orthography
- phoneme
- phonemes
- portability
- prefixes and suffixes
- probabilities
- probability
- pronunciation
- recognition accuracy
- recognition errors
- recognition task
- sentence
- sentences
- speech data
- speech recognition accuracy
- speech recognition errors
- speech recognition task
- statistical language model
- statistics
- stems
- suffix
- suffixes
- symbol
- symbols
- target language
- technology
- television
- text
- text corpus
- text database
- toolkit
- training
- training corpus
- training data
- training material
- training set
- training text
- transcribed speech
- transformation
- triphone
- unigram
- vocabulary
- vocabulary growth
- word
- word error rate
- word fragments
- word order
- word sequence
- words