ACL RD-TEC 1.0 Summarization of W05-0706
Paper Title:
CHOOSING AN OPTIMAL ARCHITECTURE FOR SEGMENTATION AND POS-TAGGING OF MODERN HEBREW
Authors: Roy Bar-Haim and Khalil Sima'an and Yoad Winter
Primarily assigned technology terms:
- algorithm
- analyzer
- backoff smoothing
- baum-welch algorithm
- bootstrapping
- bootstrapping method
- chunking
- classification
- computational linguistics
- disambiguation
- error analysis
- estimation algorithm
- estimation method
- grouping
- hidden markov
- hidden markov models
- hmms
- learning
- linear interpolation
- markov model
- maximum-likelihood
- maximum-likelihood estimation
- measuring
- morphological analyzer
- morphological disambiguation
- morphological tagging
- parser
- phrase chunking
- pos tagger
- pos tagging
- pos-tagging
- processing
- segmentation
- segmentation and pos tagging
- segmentation system
- segmenter
- smoothing
- smoothing method
- spelling
- support vector machines
- syntactic parser
- tag assignment
- tag disambiguation
- tagger
- taggers
- tagging
- tagging system
- tokenization
- unsupervised algorithm
- unsupervised estimation
- vocalization
- word segmentation
- word segmentation and pos tagging
- word segmentation system
- word segmenter
- word tagging
- word tokenization
Other assigned terms:
- adjective
- ambiguity
- annotated corpora
- annotated corpus
- annotated training corpus
- annotation
- annotation scheme
- association for computational linguistics
- backoff
- baseline model
- boundary information
- case
- contextual information
- corpora
- data sparseness
- definiteness marker
- dictionary
- disambiguation model
- distribution
- estimation
- experimental setting
- fact
- feature
- hebrew text
- hebrew treebank
- hypothesis
- implementation
- interpolation
- joint probability
- knowledge
- language model
- language models
- lemma
- lexical evidence
- lexical model
- linguistic
- linguistics
- mapping
- markov models
- method
- modern hebrew
- modern standard arabic
- morpheme
- morpheme level
- morpheme-level model
- morphemes
- morphological features
- morphological structure
- ngram
- ngram language model
- nonterminal
- nonterminals
- nouns
- occurrence frequency
- part-of-speech
- penn treebank
- phrase
- pos tag
- precision
- prefixes and suffixes
- prepositions
- probabilistic framework
- probabilistic model
- probabilities
- probability
- probability distribution
- probability distributions
- probability estimates
- pronouns
- punctuation
- punctuation marks
- relative frequency
- seed
- segmentation accuracy
- segmentation ambiguity
- segmented corpus
- segments
- semitic languages
- sentence
- sentences
- standard arabic
- standard deviation
- statistics
- stem
- stems
- suffix
- suffixes
- support vector
- svms
- symbol
- symbols
- tag set
- tagged corpora
- tagged corpus
- tagging accuracy
- tags
- technical assistance
- term
- terminals
- terms
- test set
- text
- textual units
- training
- training corpus
- training data
- training set
- treebank
- unannotated corpora
- unannotated corpus
- untagged corpus
- verb
- vocabulary
- word
- word boundaries
- word boundary
- word level
- word-level model
- words
- world knowledge