Week 2: Introduction to Classification

Learning Objectives

Concepts

Without any programming, you should be able to:

  • Explain the difference between a regression problem and a classification problem, including differences in the data formatting, training process, loss functions, and evaluation of results.
  • Graphically describe what is happening when linear regression, LASSO regression, logistic regression, or ridge regression is trained on categorical data, and how classifications are chosen for the test data.
  • Graphically explain how the k-Nearest Neighbors algorithm is trained to perform classification and how the classifications are chosen for the test data.

Implementation

Using the Python programming language, you should be able to:

  • Convert a categorical data set into the form needed for classification, including converting categorical data given in words to numeric categories.
  • Implement linear regression and ridge regression for classification and improve on the accuracy of the initial training.
  • Implement k-Nearest Neighbors for classification and improve on the accuracy of the initial training.
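
As a rough illustration of the first and last implementation objectives, the sketch below converts word labels to numeric categories and then classifies a new point with k-Nearest Neighbors by majority vote. It uses only the Python standard library; the function names (`encode_labels`, `knn_predict`) and the toy "cat"/"dog" data are hypothetical and chosen for illustration, not taken from the course materials.

```python
from collections import Counter
import math


def encode_labels(labels):
    """Map each distinct word label to an integer code (alphabetical order)."""
    classes = sorted(set(labels))
    to_code = {name: code for code, name in enumerate(classes)}
    return [to_code[name] for name in labels], classes


def knn_predict(train_X, train_y, point, k=3):
    """Classify `point` by majority vote among its k nearest training points."""
    # Rank training samples by Euclidean distance to the query point.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], point))
    # Count the labels of the k closest samples and return the most common one.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per sample, labels given as words.
features = [(1.0, 1.2), (0.8, 0.9), (1.1, 1.0),
            (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

codes, classes = encode_labels(labels)   # codes are integers, classes are words
predicted = knn_predict(features, codes, (2.8, 3.1), k=3)
print(classes[predicted])                # translate the numeric code back to a word
```

Note that k-Nearest Neighbors has no fitting step beyond storing the training data; the choice of `k` is the main setting to vary when trying to improve accuracy.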