ACL RD-TEC 1.0 Summarization of J97-2002
Paper Title:
ADAPTIVE MULTILINGUAL SENTENCE BOUNDARY DISAMBIGUATION
Authors: David D. Palmer and Marti A. Hearst
Primarily assigned technology terms:
- algorithm
- alignment algorithm
- artificial neural networks
- back-propagation
- back-propagation algorithm
- boundary disambiguation
- boundary disambiguation module
- boundary recognition
- boundary-determination
- capitalization
- character recognition
- classification
- computational linguistics
- cross-validation
- decision tree
- decision tree induction
- decision tree learning
- decision trees
- decomposition
- disambiguation
- discourse analysis
- expression approach
- extraction system
- finite automata
- hybrid disambiguation
- identification
- induction
- induction algorithm
- information extraction
- information extraction system
- language processing
- language processing systems
- learning
- learning algorithm
- learning algorithms
- learning method
- learning methods
- learning procedure
- lexical analysis
- lexical lookup
- machine learning
- machine learning algorithm
- machine-learning
- mainframe computer
- morphological analysis
- multilingual sentence alignment
- natural language processing
- natural language processing systems
- network training
- neural net
- neural network
- neural networks
- nlp
- nlp systems
- optical character recognition
- parameter estimation
- parsers
- parsing
- part-of-speech assignment
- part-of-speech tagger
- part-of-speech tagging
- pattern-recognition
- preprocessing
- processing
- pruning
- recognition
- recognition system
- recognizer
- regression
- regression trees
- regular expression
- segmentation
- sentence alignment
- sentence alignment program
- sentence analysis
- sentence boundary disambiguation
- sentence boundary recognition
- speech recognition
- tagger
- taggers
- tagging
- text preprocessing
- tokenization
- tokenizer
- training algorithm
- training procedure
- tree induction
- tree induction algorithm
- tree learning
- word category prediction
Other assigned terms:
- abbreviation
- abbreviations
- adjective
- adverb
- aligned corpus
- ambiguity
- ambiguous punctuation
- approach
- array
- association for computational linguistics
- automata
- baseline system performance
- binary feature
- binary features
- boundary marker
- break
- brown corpus
- capitalization information
- case
- character sequence
- characters
- classification tree
- comparative study
- context size
- context vectors
- context words
- corpora
- declarative sentence
- dictionary
- disambiguation system
- disambiguation task
- discourse
- distribution
- document
- ellipsis
- english lexicon
- error rate
- estimation
- fact
- feature
- feature vectors
- french
- frequency counts
- frequency distribution
- function words
- genre
- german text
- grammar
- grammar rules
- grammatical structure
- heuristic
- heuristic rules
- heuristics
- imperative sentence
- implementation
- information content
- input text
- knowledge
- language processing applications
- language processing tasks
- leaf
- lexica
- lexical research
- lexicon
- linguistics
- linguistics research
- lookahead
- mapping
- meaning
- measure
- method
- modal verb
- names
- natural language
- natural language processing applications
- natural language processing tasks
- natural languages
- news corpus
- nlp application
- nlp tasks
- noise
- nouns
- opinion
- opinions
- paragraph
- paragraphs
- parallel corpora
- parallel texts
- part of speech
- part-of-speech
- part-of-speech information
- part-of-speech tags
- parts of speech
- past participle
- preposition
- prepositions
- probabilities
- probability
- procedure
- process
- processing tasks
- processing time
- pronoun
- proper names
- proper noun
- punctuation
- punctuation mark
- punctuation marks
- regular expressions
- representations
- scalability
- sentence
- sentence boundaries
- sentence boundary
- sentence level
- sentences
- singular noun
- source text
- statistics
- structure of the sentence
- style
- subtree
- suffixes
- system performance
- tags
- technique
- terms
- test data
- test set
- text
- text collection
- text corpus
- text genre
- textual unit
- time expressions
- tokens
- training
- training corpus
- training data
- training set
- training text
- training time
- tree
- trees
- verb
- wall street journal corpus
- word
- word category
- word lists
- word types
- words
- wsj corpus