Last Updated on September 30, 2022 by David Vause

Text Classification

supervised learning: machines learn from past instances.

training phase: information is gathered, and a model is built.

inference phase: the model is applied to unlabeled data.

labeled inputs: labels are known.

model is built: the classification model

unlabeled input -> classification model -> labeled outputs.

Learn a classification model on properties (“features”) and their importance (“weights”) from labeled instances.

  • X: the input, a set of attributes or features {x1, x2, …, xn}
  • y: a “class” label from the label set Y = {y1, y2, …, yk}

Apply the model to instances to predict the label.
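
One way to picture features and weights is a sketch in which each class holds a weight per feature and the predicted label is the class with the highest weighted sum. This assumes a simple linear scoring model, which the notes do not specify; the function name `predict`, the feature names, and the weight values are all hypothetical.

```python
def predict(weights, features):
    """Return the label whose weighted feature sum is highest.

    weights:  {label: {feature_name: weight}}  (hypothetical learned values)
    features: {feature_name: value} for one unlabeled instance
    """
    scores = {
        label: sum(w.get(name, 0.0) * value for name, value in features.items())
        for label, w in weights.items()
    }
    return max(scores, key=scores.get)


# Made-up weights standing in for what a training phase would produce.
weights = {
    "CS": {"python": 2.0, "code": 1.5},
    "zoology": {"python": 1.0, "snake": 2.5},
}

label = predict(weights, {"python": 1, "snake": 1})
```

Here the word "python" alone favors CS, but the additional "snake" feature tips the score toward zoology.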

Validation set: a held-out subset of the labeled training data that is used to test the model on.
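
A minimal sketch of holding out a validation set, using only the standard library. The helper name `train_validation_split`, the 20% fraction, and the toy data are assumptions for illustration, not something the notes prescribe.

```python
import random


def train_validation_split(instances, validation_fraction=0.2, seed=0):
    """Shuffle labeled instances and hold out a fraction for validation."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_validation = int(len(shuffled) * validation_fraction)
    # Everything after the held-out slice is used for training.
    return shuffled[n_validation:], shuffled[:n_validation]


# Toy labeled data: (text, label) pairs.
labeled = [("text %d" % i, i % 2) for i in range(10)]
train, validation = train_validation_split(labeled)
```

The model is built on `train` only; `validation` plays the role of unseen data whose labels happen to be known.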

Classification Paradigms.

  • binary classification: the number of possible classes is two. |Y| = 2
  • multi-class: number of classes is greater than two. |Y| > 2
  • multi-label classification: instances can have two or more labels.

Training phase:

  • What are the features and how do you represent them?
  • What is the classification model or algorithm?
  • What are the model parameters?

Inference phase:

  • What is the expected performance? What is a good measure?

Identifying Features from Text

Types of textual features:

  • words
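    • stop words: commonly occurring words (e.g., "the"), often removed
    • normalization: e.g., converting all words to lowercase
    • stemming/lemmatizing: reducing words to a root form, so plurals are treated the same as singulars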
  • case
    • White House vs. white house
  • parts of speech
    • whether vs. weather
  • grammatical structure, sentence parsing
  • semantics: one feature for a particular group of words
    • {buy, purchase}
    • honorifics, numbers, dates
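
The word-level steps above (normalization, stop-word removal, stemming) can be sketched as a small feature extractor. The stop-word list and the `crude_stem` rule are deliberately tiny, hypothetical stand-ins; a real pipeline would likely use a fuller stop list and a proper stemmer or lemmatizer.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in"}


def crude_stem(word):
    """Strip a trailing plural 's'; a stand-in for a real stemmer."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def extract_features(text):
    """Lowercase, tokenize, drop stop words, stem, and count the words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(t) for t in tokens if t not in STOP_WORDS)


features = extract_features("The cats chased a cat in the White House")
```

Note the trade-off: lowercasing merges "cats" and "cat" into one feature after stemming, but it also erases the distinction between "White House" and "white house", which is why case can be worth keeping as a feature in its own right.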

Naive Bayes Classifiers

  • prior probability:
    • Pr(y=entertainment), Pr(y=CS), Pr(y=zoology)
    • sum equals 1
  • update the likelihood of the class given new information.
  • posterior probability: Pr(y=entertainment|x=’Python’)

Bayes’ Rule

  • \(\text{posterior probability} = \frac{\text{prior probability} \times \text{likelihood}}{\text{evidence}}\)
  • \( Pr(y|X) = \frac {Pr(y) \times Pr(X|y)} {Pr(X)}\)
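
As a worked illustration of Bayes' rule for the 'Python' example, the sketch below computes the posterior for each class. All of the prior and likelihood values are made-up numbers chosen for illustration; they do not come from any data.

```python
# Hypothetical priors Pr(y); they must sum to 1.
priors = {"entertainment": 0.3, "CS": 0.5, "zoology": 0.2}

# Hypothetical likelihoods Pr(x='Python' | y) for each class.
likelihoods = {"entertainment": 0.01, "CS": 0.2, "zoology": 0.05}

# Evidence Pr(x='Python'): sum of prior * likelihood over all classes.
evidence = sum(priors[y] * likelihoods[y] for y in priors)

# Posterior Pr(y | x='Python') = Pr(y) * Pr(x | y) / Pr(x).
posteriors = {y: priors[y] * likelihoods[y] / evidence for y in priors}

# The predicted class is the one with the highest posterior.
predicted = max(posteriors, key=posteriors.get)
```

Because the evidence is the same for every class, it only rescales the scores so the posteriors sum to 1; comparing prior × likelihood across classes is enough to pick the most probable label.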
