Data Science
Projected Time
Prerequisites
No Prerequisites.
Motivation
Data science is a highly sought-after career. Salaries are high, the work is interesting, and the title carries significant prestige.
A data scientist will:
Objectives
Participants will be able to:
- Create a Jupyter Notebook to begin data analysis
- Perform exploratory data analysis (EDA)
- Understand the purpose and methods of cleaning data
- Understand the methods of analyzing a dataset
Specific Things to Learn
- Accessing Jupyter Notebooks
- Importing libraries such as pandas and NumPy into Jupyter Notebooks
- Techniques for exploratory data analysis (EDA)
- Identifying missing or erroneous data for possible cleaning
- Using pandas and NumPy to analyze a dataset
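The items above can be sketched in a single notebook cell. This is a minimal illustration, not a prescribed workflow: the DataFrame, its column names (`age`, `city`, `income`), and the rule that a negative income is erroneous are all made up for the example, and dropping incomplete rows is only one of several possible cleaning choices.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column names and values are invented for illustration.
df = pd.DataFrame(
    {
        "age": [25, 32, np.nan, 41, 29],
        "city": ["Austin", "Boston", "Austin", None, "Denver"],
        # -1.0 stands in for an erroneous entry (assumed rule: income cannot be negative).
        "income": [52000.0, 64000.0, 58000.0, -1.0, 61000.0],
    }
)

# Exploratory data analysis: size, column types, and summary statistics.
print(df.shape)
print(df.dtypes)
print(df.describe())

# Identify missing data: number of missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Identify erroneous data: flag rows that break the assumed rule.
invalid_income = df["income"] < 0
print(df[invalid_income])

# One possible cleaning approach: treat invalid incomes as missing,
# then drop every row that has any missing value.
cleaned = df.copy()
cleaned.loc[invalid_income, "income"] = np.nan
cleaned = cleaned.dropna()

# Analyze the cleaned data with pandas and NumPy.
mean_income = cleaned["income"].mean()
median_age = float(np.median(cleaned["age"].to_numpy()))
print(mean_income, median_age)
```

In a real project the data would usually be loaded from a file rather than typed in, and whether to drop, fill, or correct problem values depends on the dataset and the question being asked.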
Materials
Lesson
Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from both structured and unstructured data. In this respect it is similar to data mining.
Life Cycle of Data Science
- Tools such as pandas, NumPy, Hadoop, and Spark are an important part of the data science toolbox. It is up to the data scientist to decide which tool suits a given situation, and how to use it correctly, in order to solve open-ended analytical problems.
Common Mistakes / Misconceptions
- Access to More Data Translates to Higher Accuracy
- Data Science and Business Intelligence Are the Same
- You Must Have Access to Lots of Data
Guided Practice
Independent Practice
Check for Understanding
Form small groups and discuss:
Supplemental Materials