Syllabus

DSC 140: Fundamentals of Data Science

Instructor Information

  • Dr. Julie Butler
  • Email: butlerju@mountunion.edu
  • Cell Phone: 864-993-7133
  • Office: Bracy 107
  • Office Hours: Monday 12:30 pm - 2:00 pm, Wednesday 2:30 pm - 4:30 pm, Thursday 3:30 pm - 5:00 pm, Friday 12:30 pm - 2:00 pm, and by appointment

Course Description

This course provides an introduction to data science, including importing and exporting data, data cleaning and preparation, visualization and statistical analysis, and basic programming skills in programming languages commonly used in the data science field. It will also explore strengths and limitations of modern data analysis, and some of the ethical issues that emerge from the use of data science. Students will gain a general understanding of data science terms, approaches to problems, and strategies for effectively using data science. 4 Semester Hours.

Data is ubiquitous, and the ability to turn data into meaningful information that leads to actionable insights has become an in-demand skillset across a wide range of fields. Some examples include business (e.g., marketing and financial forecasting), insurance (e.g., risk assessment), bioinformatics (e.g., genetic sequencing), politics (e.g., voter data), epidemiology (e.g., assessing and predicting the spread of COVID-19) and physics (e.g., analysis of astrophysical data relating to gravitational waves). Indeed, the field is so in-demand that data science has been described as the “best job” in the U.S. based on salary, the number of job openings, and strong job satisfaction. Even if you are uninterested in working as a data scientist per se, having a working knowledge of the field will likely be useful in your career – and help you to stand out on the job market.

This course is designed to introduce you to the field of data science and analytics. As you might expect, this field pulls heavily from computer science (suppose your employer has a huge database of customer records: how do you interface with it?) and statistics (once you can interface with the database, how do you learn anything from it?), and data science can be viewed as a sort of applied hybrid of both fields.

In this course, we will cover several popular tools used by data scientists, with an emphasis on both how to use a tool or analytical technique and why (or when) to use it. Our primary tool at the beginning will be Microsoft Excel, a very popular program that makes it easy to see your data. From there we will discuss two free and open-source programming languages, Python and R, that are in widespread use in data science. We will see that these languages offer great power and flexibility in analyzing data, for instance in the context of machine learning. We will close with a discussion of MySQL, a popular database system for working with very large data sets; we will see how to pull data from a MySQL database into Python and R. The emphasis throughout will be on acquiring practical and hands-on expertise in the field. Hopefully you agree that this course supports the Mission of the University of Mount Union: The mission of the University of Mount Union is to prepare students for fulfilling lives, meaningful work, and responsible citizenship!

Learning Goals

DSC 140S is a Social Sciences Foundations (“S”) course in the University’s Integrative Core (IC). The IC has the following “essential learning outcomes”:

  • Critical Thinking: The ability to read, interpret, and analyze information from multiple disciplinary areas and/or perspectives.
  • Written Communication: Demonstrate effective and rhetorically aware writing in appropriate genres for multiple disciplines, for a variety of audiences.
  • Oral Communication: Demonstrate effective and rhetorically aware speaking in appropriate genres for multiple disciplines, for a variety of audiences.
  • Reflective Learning: Demonstrates an awareness of self in relation to the university mission.
  • Complex Problem Solving: Collaborates effectively with peers and others to solve disciplinary and interdisciplinary problems.

In addition, the Social Sciences Foundations courses carry the following common learning outcomes:

  1. Demonstrate effective use of skills in accessing and evaluating information in a social science discipline;
  2. Demonstrate understanding of and ability to use concepts, theories, and methodologies in a social science discipline;
  3. Demonstrate understanding of the ways in which social scientists in a given discipline may view the world similarly to and differently from individuals in other disciplines;
  4. Demonstrate applications of concepts, theories, and methodologies in a social science discipline in a variety of personal and cultural contexts;
  5. Demonstrate awareness of ways in which knowledge and appreciation of one’s own and other cultures, as embedded in a global context, are integrated into study in a given social science discipline.

As a Foundations course, DSC 140S places special emphasis on writing and speaking: to be an effective data scientist, you must be able to communicate the fruits of your efforts to colleagues, supervisors (who may know little about the field), and general audiences (who usually know little about it)! In this course you will have multiple opportunities to write, receive feedback, and revise your writing; moreover, you will give two separate oral presentations to the instructor and your peers.

The specific learning outcomes for DSC 140S are listed below:

  1. Demonstrate the ability to recognize categories of data (e.g. continuous vs. categorical) and identify analytical questions relevant to a particular set of data
  2. Demonstrate basic proficiency in programming languages for data science
  3. Demonstrate the ability to import and export data in common programming languages for data science
  4. Demonstrate the ability to clean data
  5. Demonstrate an understanding of, and ability to apply, common statistical tests to describe relationships between variables
  6. Demonstrate the ability to generate and customize informative visuals to describe, explore, and communicate insights from data
  7. Demonstrate the ability to identify and apply machine learning algorithms in the context of a given analytical problem
  8. Demonstrate appreciation of ethical issues related to data science
  9. Communicate the results of data analysis orally, visually, and in writing to a variety of audiences

Specific learning goals will also be given for each unit to refine the goals listed above.

Textbooks and Other Resources

The textbook for this course is A Hands-On Introduction to Data Science by Chirag Shah, which is available at the bookstore. Other readings will be provided throughout the course by the instructor. All software used in this course is either open-source (free for users) or provided through your Mount Union account, including Microsoft Excel, Python, R, and MySQL.

Grading

Your final grade in the course is made up of the following components:

  • Homeworks (6 in total): 25%
  • Exit Tickets: 5%
  • Ethics Report: 10%
  • Project Report #1: 10%
  • Project Report #2: 10%
  • Final Project: 20%
  • Midterm: 10%
  • Final Exam: 10%
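
If you want to estimate where you stand, the weights above can be combined into a weighted average. The Python sketch below is only an illustration, not an official grade calculator: it assumes each component has already been averaged into a single score on a 0-100 scale, and the function name and example scores are made up.

```python
# Weights copied from the grading list above (they sum to 100%).
WEIGHTS = {
    "Homeworks": 0.25,
    "Exit Tickets": 0.05,
    "Ethics Report": 0.10,
    "Project Report #1": 0.10,
    "Project Report #2": 0.10,
    "Final Project": 0.20,
    "Midterm": 0.10,
    "Final Exam": 0.10,
}


def final_percentage(scores):
    """Return the weighted course percentage.

    `scores` maps each component name to a score on a 0-100 scale
    (an assumption; the syllabus does not say how scores are recorded).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"Missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, invented purely for illustration.
example_scores = {
    "Homeworks": 90,
    "Exit Tickets": 100,
    "Ethics Report": 85,
    "Project Report #1": 80,
    "Project Report #2": 88,
    "Final Project": 92,
    "Midterm": 78,
    "Final Exam": 84,
}
example_final = final_percentage(example_scores)
print(f"Estimated final percentage: {example_final:.1f}")
```

Because homeworks and the final project together carry 45% of the grade, a change in those scores moves the estimate far more than a change in the exit tickets.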

Percentage grades can be converted to letter grades using the following scale:

  • A: 100-93
  • A-: 92-90
  • B+: 89-87
  • B: 86-84
  • B-: 83-80
  • C+: 79-77
  • C: 76-74
  • C-: 73-70
  • D+: 69-67
  • D: 66-64
  • D-: 63-60
  • F: 59 and below
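
The scale above can be written as a simple lookup. This Python sketch is an illustration only: the scale lists whole numbers, so the sketch assumes the lower number of each range is a cutoff and that fractional percentages are not rounded up (for example, 92.9 is treated as an A-). The syllabus does not state how fractions are handled, and the function name is made up.

```python
# Lower bound of each letter grade, taken from the scale above,
# listed from highest to lowest.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (84, "B"), (80, "B-"),
    (77, "C+"), (74, "C"), (70, "C-"),
    (67, "D+"), (64, "D"), (60, "D-"),
]


def letter_grade(percentage):
    """Return the letter grade for a percentage.

    Assumption: no rounding is applied, so a grade must reach the
    lower bound of a range to earn that letter.
    """
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    return "F"


print(letter_grade(88))
```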

Homework

Six homeworks will be provided throughout the semester in order to reinforce the skills learned in lecture. The homeworks will contain a mix of implementation questions (writing code to solve problems) and explanation questions (explaining in words how you solved a problem). Implementation questions will be submitted in the form of an Excel workbook (for the first homework) or a Jupyter notebook (for the remaining homeworks). Explanation questions should be submitted as a PDF file.

If you use any resources outside of those provided in the course to complete the homeworks (including AI chat bots) you must provide a citation. Note that you are allowed to use any outside source you desire to complete the homeworks (including AI chat bots) BUT you are not allowed to directly copy any material from these sources. All homeworks may be subject to an oral examination upon submission with points deducted for an inability to explain the submission.

Homework Submission and Late Policy

Homeworks should be submitted to the appropriate dropbox on D2L by the posted due date. Late assignments are accepted but subject to a penalty: homeworks which are less than 24 hours late will receive a 10% deduction, homeworks which are more than 24 hours late but less than 48 hours late will receive a 20% deduction, and homeworks which are more than 48 hours late but less than 72 hours late will receive a 30% deduction. Solutions to the homeworks will be posted 72 hours after the posted due date, and late submissions are not accepted after the solutions are posted.
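
The late policy can be sketched as a small Python function. This is an illustration under stated assumptions, not the official calculation: it assumes the deduction is taken as a percentage of the score you earned, that a submission exactly 24 or 48 hours late falls into the higher penalty tier, and that a submission 72 or more hours late is not accepted. The policy above does not spell out these edge cases, and the function name is made up.

```python
def late_score(raw_score, hours_late):
    """Return the score after the late deduction, or None if the
    submission is too late to be accepted.

    Assumptions (not stated in the policy): the deduction is a
    percentage of the earned score, and boundary times (exactly
    24, 48, or 72 hours) fall into the later tier.
    """
    if hours_late <= 0:
        return raw_score      # on time: no deduction
    if hours_late >= 72:
        return None           # solutions are posted: not accepted
    if hours_late < 24:
        deduction = 0.10
    elif hours_late < 48:
        deduction = 0.20
    else:
        deduction = 0.30
    return raw_score * (1 - deduction)


# Hypothetical example: a homework that earned 90 points, 30 hours late.
print(late_score(90, 30))
```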

Exit Tickets

An exit ticket will be given to you at the start of each class period and is due by the end of the class period. The exit ticket will contain 2-4 questions which cover the main points of the day’s lecture or activity. The questions should be easy to answer if you are paying attention to the lecture and following along with the examples. These exit tickets serve three purposes. First, they show that you attended the lecture, as I will be using the exit tickets to take attendance. Second, they help me ensure that the main points I am trying to convey during the lecture or activity are clearly coming across. Finally, on the back of the exit tickets, you are encouraged to write any questions or comments you have on the lecture and the material covered. This means that I can get daily feedback on the course which I can use to adjust what and how I teach to best suit the needs of the class. It also allows me to go over any common questions during the next class period.

Ethics Report

Ethics is a large part of data science, from ethical ways to collect data sets to the ways data can be used. The ethics report, your first large project of the semester, will have you explore a topic related to ethics in data science through both a written report and a recorded oral presentation. More details of the ethics report can be found here.

If you use any resources outside of those provided in the course to complete the report (including AI chat bots) you must provide a citation. Note that you are allowed to use any outside source you desire to complete the report (including AI chat bots) BUT you are not allowed to directly copy any material from these sources. Your work may be subject to an oral examination upon submission with points deducted for an inability to explain the submission.

Project Reports

The project reports are a chance for you to explore a larger data science project, using the tools you learn in class and by completing the in-class projects and homework assignments. The project reports also give you an opportunity to practice your written and oral communication skills (the first project report will include a written report and the second project report will include a recorded oral report). More information about these projects can be found here.

If you use any resources outside of those provided in the course to complete the report (including AI chat bots) you must provide a citation. Note that you are allowed to use any outside source you desire to complete the report (including AI chat bots) BUT you are not allowed to directly copy any material from these sources. Your work may be subject to an oral examination upon submission with points deducted for an inability to explain the submission.

Final Project

The final project is a chance to explore a topic that is of interest to you through the lens of data science. You will choose your own data set for this project and analyze it using the tools developed throughout the course. The final project will include an in-person oral presentation during the last week of the semester and a written report, due by midnight on the last day of classes. More details on the final project can be found here.

If you use any resources outside of those provided in the course to complete the report (including AI chat bots) you must provide a citation. Note that you are allowed to use any outside source you desire to complete the report (including AI chat bots) BUT you are not allowed to directly copy any material from these sources. Your work may be subject to an oral examination upon submission with points deducted for an inability to explain the submission.

Course Policies

Student Expectations

All students are expected to come to class ready to learn and help contribute to an environment that allows other students to learn. This means arriving on time and participating in lectures, not creating distractions for other students, and being courteous to students and the professor. It is expected that you complete all graded assignments and submit them on D2L by the posted deadline unless you are using an extension as detailed in the late policy.

Attendance Policy

Attendance at lectures is not required. If you choose not to attend a lecture you are still responsible for the material and assignments covered during that class. However, please note that attendance is a component of your participation grade as you do need to be present to contribute to class discussion and in-class coding assignments. If you experience an extended absence due to illness or family emergency, please email me, and we can work out a solution.

Accessibility

The University of Mount Union values disability as an important aspect of diversity and is committed to providing equitable access to learning opportunities for all students. Student Accessibility Services (SAS) is the campus office that collaborates with students with disabilities to provide and/or arrange reasonable accommodations based on appropriate documentation, the nature of the request, and feasibility. If you have, or think you have, a temporary or permanent disability and/or a medical diagnosis in any area, such as physical or mental health, attention, learning, chronic health, or sensory, please contact SAS. The SAS office will confidentially discuss your needs, review your documentation, and determine your eligibility for reasonable accommodations. Accommodations are not retroactive, and the instructor is not obligated to provide accommodations if a student does not request accommodation or provide documentation. Students should contact SAS to request accommodations and discuss them with their instructor as early as possible in the semester. You may contact the SAS office by phone at (330) 823-7372; or via e-mail at studentaccessibility@mountunion.edu.

Academic Honesty

All work you submit with your name on it is expected to be original work. You can consult any outside source, including the internet and AI chats, for help on homeworks, but you are not allowed to copy any solutions you find there directly. Additionally, you should be able to thoroughly explain how you arrived at your answer for all work you turn in; all assignments are subject to possible verbal discussion which can contribute to your grade. If you work closely with other classmates on an assignment, please indicate that the solution results from collaboration and list the names of all students who contributed (this is allowed and encouraged). If it can be proven that you used Chegg, ChatGPT, or another person to solve your homework without citations (i.e., you are copying your solutions directly from these sources or others), you will receive a zero for the assignment and be reported for academic dishonesty.

Technology in the Classroom

All electronic devices are allowed in the classroom, provided that you do not use them to distract other students. You are required to bring a laptop which can access the internet to every class. All devices should be muted and notifications silenced for the class duration. If a device distracts other students, you will be asked to put the device away or leave the classroom.

Communications with the Professor

The best way to ask a question about an assignment is to email me during business hours or text me outside of business hours.

Group Work Policy

All in-class assignments and homework assignments can be completed with other classmates. Each student needs to turn in their own assignment with the names of all collaborators on the assignment. Turning in an assignment that was completed as a group effort with only your name on it is considered cheating (see the Academic Honesty section above).

Integrative Core Office

The Integrative Core Office (IC Office) helps with FYS courses, Foundations (HANS) courses, the second-year Raider Foundations Portfolio (RFP), Explorations (VG) courses, and IC Capstone courses. The IC Office has an open-door policy that does not require appointments for individual consultations, but both in-person and virtual appointments are available. Stop in KHIC 233, call 330.829.8229, or email icore@mountunion.edu.