Week 1: Introduction to Regression

Learning Objectives

Concepts

Without any programming, you should be able to:

Define machine learning and explain the differences between supervised and unsupervised learning, and between regression and classification.
Describe what occurs in each of the six main steps of the machine learning workflow: data preprocessing, train-test split, training, testing performance, evaluating performance, and improving the model’s performance.
Explain the main components of every machine learning algorithm, including the input, the output, the trainable parameters, and the loss function.
Describe the following algorithms: linear regression, LASSO regression, logistic regression, and ridge regression. Be able to explain mathematically how each algorithm arrives at its trained parameters and how the loss functions differ.
Describe how the following techniques can be applied to each of the above algorithms to improve performance: feature engineering, use of a design matrix, and hyperparameter tuning.
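
As a reference point for the mathematical comparison named above, the four loss functions can be written side by side. This is a sketch in one common notation, not necessarily the notation or scaling used in the course: the design matrix X, target vector y, parameter vector \theta, regularization strength \alpha, and sigmoid \sigma are all assumed symbols, and some texts scale the squared-error term by 1/n or 1/(2n).

```latex
% Assumed notation: X is the n-by-p design matrix, y the targets,
% \theta the trainable parameters, \alpha >= 0 the regularization strength.
\begin{align*}
\text{Linear regression:}   \quad J(\theta) &= \lVert X\theta - y \rVert_2^2 \\
\text{Ridge regression:}    \quad J(\theta) &= \lVert X\theta - y \rVert_2^2 + \alpha \lVert \theta \rVert_2^2 \\
\text{LASSO regression:}    \quad J(\theta) &= \lVert X\theta - y \rVert_2^2 + \alpha \lVert \theta \rVert_1 \\
\text{Logistic regression:} \quad J(\theta) &= -\sum_{i=1}^{n} \Bigl[ y_i \log \sigma(x_i^\top \theta)
      + (1 - y_i) \log\bigl(1 - \sigma(x_i^\top \theta)\bigr) \Bigr],
      \qquad \sigma(z) = \frac{1}{1 + e^{-z}}
\end{align*}

% Closed-form minimizers exist for the first two
% (the first requires X^T X to be invertible):
\begin{align*}
\hat{\theta}_{\text{linear}} &= (X^\top X)^{-1} X^\top y \\
\hat{\theta}_{\text{ridge}}  &= (X^\top X + \alpha I)^{-1} X^\top y
\end{align*}
```

LASSO and logistic regression have no closed-form solution in general, so their parameters are found with iterative optimization. Here \alpha is a hyperparameter: it is chosen by tuning rather than learned from the training data.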

Using the Python programming language, you should be able to:

Use the Pandas library to import, analyze, and clean a data file.
Use the Scikit-Learn library to perform a train-test split on a given data set.
Use the Scikit-Learn library to implement the following regression algorithms: linear regression, LASSO regression, logistic regression, and ridge regression.
Write code that trains each of the above algorithms using the training set, tests its performance with the test set, evaluates its performance, and improves its performance.
Implement linear regression and ridge regression from scratch using only the NumPy library.
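
To make the last objective concrete, here is a minimal sketch of linear and ridge regression written with NumPy only. It is one possible approach, not the course's reference solution. The helper names (add_intercept, fit_linear, fit_ridge, predict, mse), the synthetic data, the 80/20 split, and the choice to leave the intercept unpenalized are all assumptions made for illustration.

```python
import numpy as np


def add_intercept(X):
    """Build a design matrix by prepending a column of ones to X."""
    return np.column_stack([np.ones(X.shape[0]), X])


def fit_linear(X, y):
    """Ordinary least squares: minimize ||A @ theta - y||^2."""
    A = add_intercept(X)
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta


def fit_ridge(X, y, alpha=1.0):
    """Ridge regression: solve (A^T A + alpha * P) theta = A^T y.

    P is the identity with a zero in the intercept position, so the
    intercept is not penalized (an assumption of this sketch).
    """
    A = add_intercept(X)
    penalty = alpha * np.eye(A.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)


def predict(X, theta):
    """Apply the trained parameters to new inputs."""
    return add_intercept(X) @ theta


def mse(y_true, y_pred):
    """Mean squared error, used here as the evaluation metric."""
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic data (hypothetical): intercept 1.0 and three slopes, plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.0, 2.0, -1.0, 0.5])
y = add_intercept(X) @ true_theta + rng.normal(scale=0.1, size=200)

# Manual 80/20 train-test split on shuffled row indices.
idx = rng.permutation(200)
train, test = idx[:160], idx[160:]

# Train on the training set only.
theta_ols = fit_linear(X[train], y[train])
theta_ridge = fit_ridge(X[train], y[train], alpha=10.0)

# Test on the held-out set.
test_mse_ols = mse(y[test], predict(X[test], theta_ols))
test_mse_ridge = mse(y[test], predict(X[test], theta_ridge))
print("OLS test MSE:  ", test_mse_ols)
print("Ridge test MSE:", test_mse_ridge)
```

Setting alpha=0.0 in fit_ridge reproduces the ordinary least squares solution on this data, which is a convenient sanity check. Increasing alpha shrinks the slope coefficients toward zero, and comparing test error across several alpha values is a simple form of hyperparameter tuning.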