ACL RD-TEC 1.0 Summarization of C92-4173
Paper Title:
TOKENIZATION AS THE INITIAL PHASE IN NLP
Authors: Jonathan J. Webster and Chunyu Kit
Primarily assigned technology terms:
- automatic segmentation
- automatic word segmentation
- character coding
- chinese word segmentation
- coding
- computational processing
- computing
- corpus linguistics
- decomposition
- disambiguation
- grouping
- identification
- illustration
- information retrieval
- knowledge processing
- machine translation
- matching
- maximum match
- mechanical segmentation
- minimum match
- mt systems
- neural network
- nlp
- parsing
- pattern recognition
- preprocessing
- processing
- recognition
- search
- segmentation
- standardization
- string matching
- structural analysis
- syntactic analysis
- tagging
- tagging system
- terminology
- text processing
- tile
- token identification
- tokenization
- training process
- word identification
- word recognition
- word segmentation
Other assigned terms:
- ambiguity
- approach
- case
- characters
- chinese characters
- chinese word
- chinese words
- cluster
- clusters
- co-occurrence
- collocation
- collocation pattern
- composition
- compounds
- concept
- concepts
- dictionary
- disjunctive ambiguity
- english compound
- fact
- generation
- grammar
- idiom
- implementation
- input text
- knowledge
- knowledge base
- lexemes
- lexical entry
- lexical information
- lexical item
- lexical items
- lexical knowledge
- lexicographer
- lexicography
- lexicon
- linguistic
- linguistic unit
- linguistics
- linguists
- mapping
- meaning
- method
- modifier
- morpheme
- morphemes
- noun phrase
- phrase
- procedure
- process
- statistic
- structural model
- term
- terms
- text
- tokens
- training
- word
- word grammar
- words