ACL RD-TEC 1.0 Summarization of W04-1007
Paper Title:
A RHETORICAL STATUS CLASSIFIER FOR LEGAL TEXT SUMMARISATION
Authors: Ben Hachey and Claire Grover
Primarily assigned technology terms:
- active learning
- algorithm
- annotation system
- automatic summarisation
- bayes classifier
- bootstrapping
- boundary disambiguation
- boundary disambiguation module
- boundary identification
- categorisation
- chunking
- classification
- classifier
- classifiers
- clause identification
- co-training
- corpus annotation
- cross validation
- decision trees
- density estimation
- disambiguation
- encoding
- entity recognition
- entity recognition systems
- entity tagger
- entropy modelling
- feature encoding
- identification
- induction
- kernels
- learning
- learning method
- lemmatiser
- lexical look-up
- linguistic analysis
- linguistic processing
- machine learning
- macro-averaging
- maximum entropy
- maximum entropy classifiers
- maximum entropy model
- mistake-driven learning
- modelling
- named entity recognition
- named entity tagger
- nlp
- optimization
- optimization algorithm
- part-of-speech tagging
- processing
- recogniser
- recognition
- recognition systems
- reporting
- rhetorical role classification
- rhetorical status classifier
- role annotation
- role classification
- rule induction
- scoring
- segmentation
- sense disambiguation
- sentence boundary disambiguation
- sentence classifier
- sentence extraction
- sentence selection
- sequence modelling
- smoothing
- status classifier
- summarisation
- summarisation system
- support vector machines
- svm classifier
- tagger
- tagging
- task-based evaluation
- text categorisation
- text extraction
- text summarisation
- tokenisation
- transducer
- validation
- weka
- word sense disambiguation
Other assigned terms:
- 10-fold cross validation
- annotated corpus
- annotation
- annotation scheme
- annotator
- annotators
- approach
- argumentation
- automatic processing
- case
- chunk
- citation
- clause boundary
- clause structure
- clause-level annotation
- coherence
- communicative goals
- cue phrase
- cue phrase information
- cue phrases
- data set
- discourse
- discourse information
- distribution
- document
- document frequency
- english law
- entity recognition component
- entity subtype
- entity type
- entity types
- entropy
- estimation
- events
- experimental results
- f score
- f-score
- fact
- feature
- feature set
- feature sets
- feature type
- finite verb
- free-running text
- generalisation
- gold standard
- grammar
- head noun
- human annotators
- hypernym
- information sources
- inter-annotator agreement
- inverse document frequency
- kappa
- knowledge
- lemma
- lexicon
- linguistic
- linguistic annotation
- linguistic features
- linguistic information
- linguistic knowledge
- main verb
- manual annotation
- mark-up
- markup
- measure
- method
- methodology
- modality
- named entities
- named entity
- names
- natural language
- negation
- noun group
- noun groups
- opinion
- opinions
- paragraph
- paragraphs
- parameter settings
- part-of-speech
- phrase
- portability
- prepositional phrases
- process
- recognition component
- rhetorical status
- rule sets
- seed
- segments
- sentence
- sentence boundary
- sentences
- source text
- structured information
- subcategorisation
- substring
- support vector
- svms
- syntactic features
- system evaluation
- tags
- term
- term frequency
- terms
- text
- tf * idf
- tokens
- toolkit
- training
- trees
- uniform distribution
- user
- verb
- verb group
- verb groups
- word
- word features
- word sense
- wordnet
- words
- wrapper
- xml document
- xml format