ACL RD-TEC 1.0 Summarization of W05-0706

Paper Title:
CHOOSING AN OPTIMAL ARCHITECTURE FOR SEGMENTATION AND POS-TAGGING OF MODERN HEBREW

Authors: Roy Bar-Haim, Khalil Sima'an, and Yoad Winter

Other assigned terms:

  • adjective
  • ambiguity
  • annotated corpora
  • annotated corpus
  • annotated training corpus
  • annotation
  • annotation scheme
  • association for computational linguistics
  • backoff
  • baseline model
  • boundary information
  • case
  • contextual information
  • corpora
  • data sparseness
  • definiteness marker
  • dictionary
  • disambiguation model
  • distribution
  • estimation
  • experimental setting
  • fact
  • feature
  • hebrew text
  • hebrew treebank
  • hypothesis
  • implementation
  • interpolation
  • joint probability
  • knowledge
  • language model
  • language models
  • lemma
  • lexical evidence
  • lexical model
  • linguistic
  • linguistics
  • mapping
  • markov models
  • method
  • modern hebrew
  • modern standard arabic
  • morpheme
  • morpheme level
  • morpheme-level model
  • morphemes
  • morphological features
  • morphological structure
  • ngram
  • ngram language model
  • nonterminal
  • nonterminals
  • nouns
  • occurrence frequency
  • part-of-speech
  • penn treebank
  • phrase
  • pos tag
  • precision
  • prefixes and suffixes
  • prepositions
  • probabilistic framework
  • probabilistic model
  • probabilities
  • probability
  • probability distribution
  • probability distributions
  • probability estimates
  • pronouns
  • punctuation
  • punctuation marks
  • relative frequency
  • seed
  • segmentation accuracy
  • segmentation ambiguity
  • segmented corpus
  • segments
  • semitic languages
  • sentence
  • sentences
  • standard arabic
  • standard deviation
  • statistics
  • stem
  • stems
  • suffix
  • suffixes
  • support vector
  • svms
  • symbol
  • symbols
  • tag set
  • tagged corpora
  • tagged corpus
  • tagging accuracy
  • tags
  • technical assistance
  • term
  • terminals
  • terms
  • test set
  • text
  • textual units
  • training
  • training corpus
  • training data
  • training set
  • treebank
  • unannotated corpora
  • unannotated corpus
  • untagged corpus
  • verb
  • vocabulary
  • word
  • word boundaries
  • word boundary
  • word level
  • word-level model
  • words
  • world knowledge

Extracted Section Types:


This page last edited on 10 May 2017.
