MySQL Project

Orders

  1. Load the database using the following command. Inspect the different tables and columns.
connection = pymysql.connect(
host = "dsc140.cmgtlybtlnmh.us-east-2.rds.amazonaws.com",
user = "guest",
password = "UMUdsc140",
db="orders")
  1. Identify and print the names of all the tables in the database. Look at the columns within some of the tables and briefly describe what this database appears to be all about.
  2. Find the sale amount for every order. To do this you’ll need to find how many units were sold in each order and the price per unit for the corresponding item that was sold, then multiply the values together.
  3. Make a histogram of the sale amounts you found in #3.
  4. Generate a pandas dataframe that lists the customerID for all customers served by the employee with EmployeeID “8”. Display the dataframe.
  5. Generate a pandas dataframe that lists the Customer’s Name and their corresponding country of residence, again only for customers served by the employee with EmployeeID “8”. Display the dataframe.

Auto Project

  1. Load the database using the following command. Inspect the different tables and columns.
connection = pymysql.connect(
host = "dsc140.cmgtlybtlnmh.us-east-2.rds.amazonaws.com",
user = "guest",
password = "UMUdsc140",
db="autos")
  1. Load the mpg table as a dataframe
  2. Create a plot of model year vs. mpg. Include the correlation score, slope and y-intercept of the line of best fit and add the line of best fit to the plot. Do the same for weight vs mpg and model year vs weight.
  3. Create a box and whisker plot of mpg separated by country codes.
  4. Use K-Means to determine the optimal number of groupings for the cars. This will require you to look up how to do K-Means in Python.
  5. Use k-nearest neighbors to try to classify the car data by the clusters returned in step 5.