Python Projects Part 2

For this project you will be working with the Titanic data set linked from the homework description for the second Python homework (titanic_train.csv). This file was pulled from this website. There, you’ll see an explanation of the columns and of the entries in some of the columns.

Make sure you can open the file as a dataframe, as in the cell below, and check out the columns.
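As a rough sketch of that first check, the snippet below reads the CSV with pandas and prints the column names, the first rows, and the per-column types and missing-value counts. The helper name `load_titanic` is hypothetical, and the path assumes titanic_train.csv sits in the same folder as your notebook; adjust it if your copy lives elsewhere.

```python
from pathlib import Path

import pandas as pd


def load_titanic(path="titanic_train.csv"):
    """Read the Titanic CSV into a DataFrame (hypothetical helper)."""
    return pd.read_csv(path)


# Assumed location of the file; change this if yours is somewhere else.
csv_path = Path("titanic_train.csv")

if csv_path.exists():
    df = load_titanic(csv_path)
    print(df.columns.tolist())  # column names
    print(df.head())            # first five rows
    df.info()                   # dtypes and non-null counts per column
else:
    print(f"{csv_path} not found; adjust csv_path to point at your copy.")
```

Looking at the output of `df.info()` early is worthwhile, since columns with many missing values will affect which questions you can answer cleanly.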

Then, in your teams, decide on some interesting and practical questions that might be answered by analyzing this data. With your group, answer the question(s) with:

  1. Useful visualizations of the data.
  2. Relevant statistics or models (correlation coefficients, chi-squared tests, linear models, etc.).
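To make the two items above concrete, here is a minimal sketch of one visualization and one test for a question such as "is survival associated with sex?". It is only one possible approach. The column names `Sex` and `Survived` are assumptions (check them against your own dataframe), the helper names are hypothetical, and the small table is made-up toy data used only to show the calls, not real Titanic results.

```python
import matplotlib.pyplot as plt
import pandas as pd
from scipy.stats import chi2_contingency


def survival_by_group(df, group_col, outcome_col="Survived"):
    """Cross-tabulate two categorical columns and run a chi-squared test."""
    table = pd.crosstab(df[group_col], df[outcome_col])
    chi2, p_value, dof, _expected = chi2_contingency(table)
    return table, chi2, p_value, dof


def plot_survival_rate(df, group_col, outcome_col="Survived"):
    """Bar chart of the mean outcome per group; returns the Axes."""
    rates = df.groupby(group_col)[outcome_col].mean()
    _fig, ax = plt.subplots()
    rates.plot(kind="bar", ax=ax)
    ax.set_xlabel(group_col)
    ax.set_ylabel(f"Mean of {outcome_col}")
    ax.set_title(f"{outcome_col} rate by {group_col}")
    return ax


# Toy data, invented for illustration only. It is far too small for the
# chi-squared approximation to be trustworthy; use your real dataframe.
toy = pd.DataFrame({
    "Sex": ["male", "female", "male", "female",
            "male", "female", "male", "female"],
    "Survived": [0, 1, 0, 1, 1, 1, 0, 0],
})

table, chi2, p_value, dof = survival_by_group(toy, "Sex")
print(table)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}, dof = {dof}")

ax = plot_survival_rate(toy, "Sex")
```

In a notebook the figure is displayed automatically; in a script, call `plt.show()` afterwards. With the real data you would pass your own dataframe in place of `toy`, and the same two helpers work for any other categorical column you want to compare against the outcome.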

Note that for full credit on this assignment, your code and report must be more thorough than what you submitted for the previous Python project. Make use of the new analysis skills you have gained.

It is also permissible and expected that you’ll come up with some questions that you’re unsure how to answer. Make a note of those questions!

Time permitting, we’ll close class on the final project day with a brief discussion/presentation by each group on their findings.

Don’t be afraid to look things up! The scipy and matplotlib documentation is a Google search away. Try to look up the answer before asking.