ACL RD-TEC 1.0 Summarization of W98-1111
Paper Title:
LANGUAGE IDENTIFICATION WITH CONFIDENCE LIMITS
Primarily assigned technology terms:
- algorithm
- approximation
- character recognition
- classification
- classification algorithm
- coding
- computing
- discriminant analysis
- encoding
- genre identification
- identification
- language identification
- linear discriminant
- linear discriminant analysis
- measuring
- nlp
- optical character recognition
- part of speech tagging
- processing
- quantitative evaluation
- recognition
- scoring
- search
- speech tagging
- splitting
- statistical approaches
- statistical classification
- statistical classification algorithm
- statistical technique
- tagging
- token identification
Other assigned terms:
- approach
- binomial distribution
- break
- brown corpus
- case
- characters
- clusters
- coding scheme
- confidence measure
- confusion matrix
- convergence
- corpora
- croatian
- data set
- data sets
- density function
- distribution
- entropy
- events
- fact
- function words
- genre
- identification task
- implementation
- information content
- knowledge
- language model
- measure
- measures
- n-grams
- noisy input
- norwegian
- paragraph
- part of speech
- patent
- preprocessor
- priori
- probabilities
- probability
- probability density
- probability density function
- procedure
- process
- punctuation
- search space
- sentence
- serbian
- standard deviation
- statistical information
- statistical model
- statistics
- technique
- terms
- test data
- tokens
- training
- training data
- training set
- word
- words