- In Excel you should be able to:
- Open an Excel file which is given to you.
- Open data given to you as a comma-separated values (CSV) file in Excel and separate the columns.
- Change or add text to cells.
- Create the following plots and interpret the results. Be able to add x-axis labels, y-axis labels, titles, and legends where appropriate. For histograms be able to set the bin width to a specific number.
- Box and whisker plot
- Histogram
- Bar Plot
- Pie Chart
- Scatter Plot
- Line Plot
- Add a line of best fit to a scatter plot or line plot, display the equation of the line on the chart, and show the \(R^2\) value on the chart. Be able to interpret the slope, y-intercept, and \(R^2\) value.
- Perform basic mathematical operations with numbers or data in Excel cells.
- Use built-in functions to calculate the following values and be able to interpret the results:
- Mean/Average
- Median
- Mode
- Minimum
- Maximum
- Standard Deviation
- Correlation Score
- Slope
- Y-Intercept
- Quartiles (1st, 2nd, and 3rd)
- Calculate a slope and y-intercept and use these to predict new values in the data set.
- Create a PivotTable and change which fields appear in the rows, in the columns, and as the values in the table. Be able to change how the values are summarized (e.g., count, average, minimum, maximum).
- From a PivotTable create a PivotChart that is either a scatter plot or a pie chart.
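
For the prediction item above, the calculation can be written out as a short worked example. The symbols \(m\) (slope), \(b\) (y-intercept), and \(\hat{y}\) (predicted value), as well as the numbers used, are assumptions chosen for illustration and do not come from any course data set.

\[
\hat{y} = m x + b
\]

\[
m = 2.5,\quad b = 10,\quad x = 4 \quad\Rightarrow\quad \hat{y} = 2.5 \cdot 4 + 10 = 20
\]

In Excel, the built-in `SLOPE` and `INTERCEPT` functions both take the y-values first and the x-values second, so with a hypothetical layout of x in `A2:A11` and y in `B2:B11`, the formulas would be `=SLOPE(B2:B11, A2:A11)` and `=INTERCEPT(B2:B11, A2:A11)`.
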
- In Python you should be able to complete the following tasks without being given a data file:
- Define a variable and determine what type of data it is.
- Print the value stored in a variable.
- Perform basic mathematical operations including addition, subtraction, multiplication, division, and exponentiation.
- Use conditional statements such as `if`, `elif`, and `else`.
- Create a list and access elements of the list with indices or an index slice. Use the `append` method to add data to the list, the `pop` method to remove data from the list, and the `len` function to determine the length of a list. Be able to change the value stored at a particular index.
- Create a for loop which can either perform calculations or iterate through a list.
- Define a string and use string methods (`lower`, `upper`, etc.) to produce modified versions of the string. Use indices and index slicing to access portions of the string.
- In Python you should be able to complete the following tasks with a data file:
- Use appropriate import statements to gain access to the following libraries:
- Pandas
- Matplotlib
- Numpy
- Scipy
- Statsmodels
- Import a data file into Python using Pandas and store it as a Pandas Dataframe.
- Using a Pandas Dataframe, complete the following tasks:
- Print the Dataframe.
- Print the first five rows of the Dataframe.
- Print the last five rows of the Dataframe.
- Print the names of every column in the Dataframe (without showing any of the data).
- Access a specific row of the Dataframe.
- Access a specific column of the Dataframe.
- Add a column to the Dataframe.
- Remove a column or row from the Dataframe.
- Print a statistical summary of the Dataframe.
- Print the type of data stored in each column of the Dataframe and the number of non-null values in each column of the Dataframe.
- Fill null values with a given character.
- Remove null values from the Dataframe.
- Create a mask that leaves only certain data remaining in the Dataframe.
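
The Dataframe tasks above can be sketched in one pass. This example assumes a tiny made-up CSV held in a string (`csv_text`) so that it runs without a file; with a real data file, the same `pd.read_csv` call would take the file name instead. The column names and the fill character `"?"` are hypothetical.

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for a real data file
csv_text = """name,age,city
Ana,23,Austin
Ben,,Boston
Cal,31,
Dee,27,Denver
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df)              # the whole Dataframe
print(df.head())       # first five rows (fewer if the data is shorter)
print(df.tail())       # last five rows
print(df.columns)      # column names only
print(df.iloc[0])      # a specific row, selected by position
print(df["age"])       # a specific column
print(df.describe())   # statistical summary of numeric columns
df.info()              # data type and non-null count for each column

# Add a column, then remove a column or a row (drop returns a new Dataframe)
df["age_next_year"] = df["age"] + 1
no_city = df.drop(columns=["city"])
no_first_row = df.drop(index=0)

# Fill null values in one column with a given character, or remove rows with nulls
filled = df.fillna({"city": "?"})
dropped = df.dropna()

# A mask is a column of True/False values used to keep only matching rows
mask = df["age"] > 25
older = df[mask]
print(older)
```

Rows with a missing `age` compare as `False` in the mask, so they are left out of `older`.
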
- Print and interpret the following statistics or values of a column in a Pandas Dataframe:
- Mean/Average
- Median
- Mode
- Minimum
- Maximum
- Standard Deviation
- Quartiles (1st, 2nd, and 3rd)
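
Each statistic in the list above corresponds to a method on a Pandas column. The sketch below assumes a hypothetical column named `score` with made-up values.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"score": [2, 4, 4, 5, 7, 8]})
col = df["score"]

mean = col.mean()
median = col.median()
mode = col.mode()        # returns a Series, because a column can have several modes
minimum = col.min()
maximum = col.max()
std = col.std()          # sample standard deviation (divides by n - 1)
quartiles = col.quantile([0.25, 0.5, 0.75])   # 1st, 2nd, and 3rd quartiles

print(mean, median, mode.tolist(), minimum, maximum, std)
print(quartiles)
```

The 2nd quartile is the same value as the median, which gives a quick consistency check when interpreting the output.
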
- Use the following structures to group data in a Dataframe:
- Pivot Table
- Be able to change the way values are displayed in the table (counts, averages, minimums, maximums).
- Be able to change the data displayed on the rows, the columns, and in the table.
- Groupby
- Use and interpret the following structures on a Pandas Dataframe:
- value_counts
- Contingency Table
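
The grouping and counting structures above can be compared side by side on one small Dataframe. This is a sketch under assumptions: the columns `dept`, `level`, and `salary` and their values are invented, and the contingency table is built with `pd.crosstab`, which is one common way to produce it in Pandas.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "dept": ["A", "A", "B", "B", "B"],
    "level": ["jr", "sr", "jr", "jr", "sr"],
    "salary": [50, 80, 55, 65, 90],
})

# Pivot table: index sets the rows, columns sets the columns,
# values sets the data in the table, aggfunc sets how it is summarized
pivot_mean = pd.pivot_table(df, index="dept", columns="level",
                            values="salary", aggfunc="mean")
pivot_count = pd.pivot_table(df, index="dept", columns="level",
                             values="salary", aggfunc="count")

# Groupby: one summary value per group
group_max = df.groupby("dept")["salary"].max()

# value_counts: how many times each value appears in a column
counts = df["dept"].value_counts()

# Contingency table: counts for each combination of two columns
table = pd.crosstab(df["dept"], df["level"])

print(pivot_mean)
print(pivot_count)
print(group_max)
print(counts)
print(table)
```

Changing `aggfunc` to `"min"` or `"max"` changes the summary shown in the pivot table, and swapping the `index` and `columns` arguments changes what appears on the rows and columns.
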
- Be able to create the following graphs using both Matplotlib and Pandas. Note that some of these graphs may require the use of a pivot table, groupby, or value_counts statement before graphing. Be able to add x-labels, y-labels, titles, legends, and error bars to the plots where appropriate.
- Box and whisker plot
- Histogram
- Bar Plot
- Pie Chart
- Scatter Plot
- Line Plot
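
A few of the plot types above are sketched here, two drawn directly with Matplotlib and two drawn through Pandas. The data, column names, and titles are hypothetical, and the `matplotlib.use("Agg")` line is an assumption that makes the script render off-screen; leave it out to see the figures in a window or notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen so the script runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "group": ["A", "A", "B", "B", "C", "C"],
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2.0, 4.1, 5.9, 8.2, 9.8, 12.1],
})

# Matplotlib: scatter plot with axis labels, title, and legend
fig1, ax1 = plt.subplots()
ax1.scatter(df["x"], df["y"], label="measurements")
ax1.set_xlabel("x")
ax1.set_ylabel("y")
ax1.set_title("y versus x")
ax1.legend()

# Matplotlib: histogram with a chosen number of bins
fig2, ax2 = plt.subplots()
bin_counts, bin_edges, _ = ax2.hist(df["y"], bins=3)
ax2.set_xlabel("y")
ax2.set_ylabel("count")

# Pandas: bar plot of group means (groupby first), with error bars
means = df.groupby("group")["y"].mean()
stds = df.groupby("group")["y"].std()
fig3, ax3 = plt.subplots()
means.plot(kind="bar", yerr=stds, ax=ax3, title="Mean y by group")
ax3.set_xlabel("group")
ax3.set_ylabel("mean y")

# Pandas: pie chart of how often each group appears (value_counts first)
fig4, ax4 = plt.subplots()
df["group"].value_counts().plot(kind="pie", ax=ax4, title="Share of rows by group")
ax4.set_ylabel("")

fig1.savefig("scatter.png")  # hypothetical output file name
```

Passing `ax=` to the Pandas `plot` call keeps each graph on its own figure; without it, Pandas may draw onto whichever axes were used most recently.
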
- Using either Scipy or Statsmodels, be able to calculate a line of best fit for two columns of data. Be able to print the slope, y-intercept, and correlation coefficient from the fit to the console (or read the correlation coefficient from a correlation table) and interpret the values. Be able to add the line of best fit to a plot with the data.
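
One way to approach the line-of-best-fit item is with `scipy.stats.linregress`, sketched below. The columns `hours` and `score` and their values are made up, and the `matplotlib.use("Agg")` line is an assumption for off-screen rendering that can be removed when working interactively. Statsmodels offers an alternative route that is not shown here.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen so the script runs anywhere
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats

# Hypothetical data for illustration only
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 54, 61, 63, 70],
})

# Fit a straight line: score = slope * hours + intercept
fit = stats.linregress(df["hours"], df["score"])
print("slope:", fit.slope)
print("y-intercept:", fit.intercept)
print("correlation coefficient:", fit.rvalue)

# The same correlation coefficient can be read from a correlation table
corr_table = df.corr()
print(corr_table)

# Use the fit to predict a value for a new input (6 hours)
predicted_new = fit.slope * 6 + fit.intercept

# Plot the data with the line of best fit on top
fig, ax = plt.subplots()
ax.scatter(df["hours"], df["score"], label="data")
ax.plot(df["hours"], fit.slope * df["hours"] + fit.intercept,
        color="red", label="line of best fit")
ax.set_xlabel("hours")
ax.set_ylabel("score")
ax.set_title("Score versus hours")
ax.legend()
```

For this made-up data the slope works out to 4.5, meaning each additional hour is associated with a 4.5-point higher score, and the y-intercept of 46.5 is the predicted score at zero hours.
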