ACL RD-TEC 1.0 Summarization of J97-2002

Paper Title:
ADAPTIVE MULTILINGUAL SENTENCE BOUNDARY DISAMBIGUATION

Authors: David D. Palmer and Marti A. Hearst

Primarily assigned technology terms:

Other assigned terms:

  • abbreviation
  • abbreviations
  • adjective
  • adverb
  • aligned corpus
  • ambiguity
  • ambiguous punctuation
  • approach
  • array
  • association for computational linguistics
  • automata
  • baseline system performance
  • binary feature
  • binary features
  • boundary marker
  • break
  • brown corpus
  • capitalization information
  • case
  • character sequence
  • characters
  • classification tree
  • comparative study
  • context size
  • context vectors
  • context words
  • corpora
  • declarative sentence
  • dictionary
  • disambiguation system
  • disambiguation task
  • discourse
  • distribution
  • document
  • ellipsis
  • english lexicon
  • error rate
  • estimation
  • fact
  • feature
  • feature vectors
  • french
  • frequency counts
  • frequency distribution
  • function words
  • genre
  • german text
  • grammar
  • grammar rules
  • grammatical structure
  • heuristic
  • heuristic rules
  • heuristics
  • imperative sentence
  • implementation
  • information content
  • input text
  • knowledge
  • language processing applications
  • language processing tasks
  • leaf
  • lexica
  • lexical research
  • lexicon
  • linguistics
  • linguistics research
  • lookahead
  • mapping
  • meaning
  • measure
  • method
  • modal verb
  • names
  • natural language
  • natural language processing applications
  • natural language processing tasks
  • natural languages
  • news corpus
  • nlp application
  • nlp tasks
  • noise
  • nouns
  • opinion
  • opinions
  • paragraph
  • paragraphs
  • parallel corpora
  • parallel texts
  • part of speech
  • part-of-speech
  • part-of-speech information
  • part-of-speech tags
  • parts of speech
  • past participle
  • preposition
  • prepositions
  • probabilities
  • probability
  • procedure
  • process
  • processing tasks
  • processing time
  • pronoun
  • proper names
  • proper noun
  • punctuation
  • punctuation mark
  • punctuation marks
  • regular expressions
  • representations
  • scalability
  • sentence
  • sentence boundaries
  • sentence boundary
  • sentence level
  • sentences
  • singular noun
  • source text
  • statistics
  • structure of the sentence
  • style
  • subtree
  • suffixes
  • system performance
  • tags
  • technique
  • terms
  • test data
  • test set
  • text
  • text collection
  • text corpus
  • text genre
  • textual unit
  • time expressions
  • tokens
  • training
  • training corpus
  • training data
  • training set
  • training text
  • training time
  • tree
  • trees
  • verb
  • wall street journal corpus
  • word
  • word category
  • word lists
  • word types
  • words
  • wsj corpus

Extracted Section Types:


This page last edited on 10 May 2017.
