Accredited Professional Development Course

Statistics for Data Science

Offered Live Online Only

This course introduces the basic statistical principles often used by data scientists and applied statisticians, including:

  • Common statistical issues and how to avoid fallacies.
  • High-level overview of probability and common statistical estimates.
  • Advanced topics like multiple hypothesis testing, independence, sample size and power calculations, and bootstrapping.
  • The statistical programming language R, one of the most popular languages for data science.

Course designed by Greg Ryslik, Vice President of Data Science, Mindstrong Health

Who the course is designed for:

You are a numbers person and math lover who wants to hone your statistics skills and apply them to data science projects. You want to prioritize accuracy and avoid fallacies while building towards data science success.

Outcomes

  • An understanding of basic statistical hypothesis testing and confidence intervals.
  • The ability to model data using well-known statistical distributions, as well as handle both continuous and categorical data.
  • The ability to perform linear regression and adjust for multiple hypothesis tests.
  • An understanding of how to calculate the number of samples needed to achieve required sensitivity and specificity.
  • An understanding of bootstrapping and Monte Carlo simulation.

What you'll receive upon completion:

  • Certificate of completion
  • Certificate link and instructions on how to add to your LinkedIn profile
  • 3.3 Continuing Education Units

Dates & Instructors

Check back soon for our next scheduled course.

Prerequisites

This course is open to beginners, but students should have some coding experience (preferably in Python or R, though neither is required) and a basic understanding of calculus, linear algebra, and probability. A brief review will be provided, but prior experience would be very helpful.

Before the first day of class, students should familiarize themselves with Chapters 1-6 of CK-12 Foundation’s Basic Probability and Statistics – A Short Course. Each chapter should take between 1 and 2 hours to work through.

Considering our immersive data science bootcamp?

Professional development alumni can apply the amount of tuition paid for one part-time course towards enrollment in an upcoming bootcamp upon admittance.

Course Structure & Syllabus

Class 1
Basic Probability, Expected Value, Variance, Point Estimates, Introduction to R

Review basic probability, including how to compute properties of a random variable such as the expected value and variance. Clearly define what a point estimate is and how it differs from a statistical estimate. Examine how to compute these properties in R.
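As a rough illustration of the expected value and variance calculations mentioned above, here is a minimal sketch. The course itself works in R; Python is used here only for illustration, and the fair six-sided die is a hypothetical example rather than course material.

```python
# Hypothetical example: a fair six-sided die (not taken from the course).
outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [1 / 6] * 6


def expected_value(values, probs):
    """E[X] = sum over x of x * P(X = x) for a discrete random variable."""
    return sum(x * p for x, p in zip(values, probs))


def variance(values, probs):
    """Var(X) = E[(X - E[X])^2]."""
    mu = expected_value(values, probs)
    return sum(p * (x - mu) ** 2 for x, p in zip(values, probs))


mean = expected_value(outcomes, probabilities)
var = variance(outcomes, probabilities)
print(f"E[X] = {mean:.4f}, Var(X) = {var:.4f}")
```

For the die, the expected value is 3.5 and the variance is 35/12.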

Class 2
Further Probability, Central Limit Theorem, Law of Large Numbers, Hypothesis Testing

Calculate probabilities for the binomial and normal distributions. Explore the central limit theorem and the law of large numbers to understand how to calculate probabilities of events involving averages. This leads into basic hypothesis testing and how to interpret test results.

Class 3
P-Values, Multiple Comparisons, Bonferroni Adjustment

Explore the formal definition of a confidence interval as well as its interpretation. Discuss the issue of multiple comparisons and provide an example of a false positive. Explain the use of a Bonferroni Adjustment as well as the False Discovery Rate.
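To make the multiple-comparisons issue concrete, the sketch below applies a Bonferroni adjustment to a set of p-values. The course itself works in R; Python is used here only for illustration, and the p-values and the helper name `bonferroni` are made up for this example.

```python
def bonferroni(p_values, alpha=0.05):
    """Return, for each p-value, whether it is rejected at the
    Bonferroni-adjusted threshold alpha / m, where m is the number of tests."""
    m = len(p_values)
    threshold = alpha / m
    return [p <= threshold for p in p_values]


# Hypothetical p-values from five separate tests.
p_values = [0.001, 0.012, 0.030, 0.049, 0.200]

unadjusted = [p <= 0.05 for p in p_values]
rejected = bonferroni(p_values, alpha=0.05)

print("Unadjusted rejections:", sum(unadjusted))
print("Bonferroni rejections:", sum(rejected))
```

With five tests the adjusted threshold is 0.05 / 5 = 0.01, so only the first p-value remains significant, whereas four of the five fall below the unadjusted 0.05 cutoff.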

Class 4
Introduction to Regression, Prediction, Hypothesis Testing for Regression

Given a set of continuous outcomes and predictor variables, create a linear regression model using R. Explain how to use that model to generate predictions for new observations and how to test whether any of the coefficients are statistically significant.
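The fit-then-predict workflow can be sketched for the single-predictor case using the closed-form least-squares estimates. The course itself works in R; Python is used here only for illustration, and the data and the function name `fit_simple_regression` are hypothetical.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    slope = S_xy / S_xx, intercept = mean(y) - slope * mean(x).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope = fit_simple_regression(x, y)
prediction = intercept + slope * 6  # predict the outcome for a new observation x = 6
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}, prediction = {prediction:.2f}")
```

Testing whether the slope is statistically significant additionally requires its standard error, which this sketch leaves out.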

Class 5
Model Selection for Regression, Backwards/Forwards, R^2 and other selection criteria

Look at how to select models using a variety of selection criteria such as R^2 and adjusted R^2. Look at backwards, forwards, and best subset regression. Briefly cover logistic regression and how and why it’s used.

Class 6
Categorical Data, 2x2 tables, Simpson’s Paradox

Introduce the odds ratio for a 2x2 table as well as a statistical test for independence, then introduce the 2x2xk table with an example of Simpson’s paradox.
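The odds ratio for a 2x2 table can be computed directly from the four cell counts. The course itself works in R; Python is used here only for illustration, and the counts and row and column labels below are invented.

```python
def odds_ratio(table):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]: (a * d) / (b * c).

    Rows are groups (e.g. exposed / unexposed); columns are outcomes
    (e.g. event / no event). Assumes b and c are non-zero.
    """
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Hypothetical counts: rows = exposed, unexposed; columns = event, no event.
table = [[20, 80],
         [10, 90]]

ratio = odds_ratio(table)
print(f"Odds ratio = {ratio:.2f}")
```

Here the odds of the event are 20/80 in the first row and 10/90 in the second, giving an odds ratio of 2.25. A value of 1 would indicate no association between row and column.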

Class 7
Independence, MxN tables and trend, Fisher’s Permutation Test

Go over further examples of independence, along with the introduction of larger tables. Trends and advanced categorical analysis will be covered. Go into Fisher’s exact permutation test to explore what hypothesis testing can be done on small sample sets.

Class 8
Correlation & Causation

Provide several examples of how to calculate correlation for both continuous and categorical variables. Show how to calculate confidence intervals to determine whether the correlation is significant. Explore the "correlation implies causation" fallacy and provide some counterexamples.

Class 9
A/B testing, Hypothesis Testing proportions, More General Hypothesis

Provide several examples of hypothesis testing as it relates to data science and web design. Cover hypothesis testing and confidence intervals for proportions and variance.

Class 10
Sample Size & Power Calculation / Method of Moments Estimation

Work through several examples on how to calculate the required sample size given a specific level of false positives and a pre-specified power level. Go into more detail on why it’s only possible to reject or fail to reject a null hypothesis (and not to accept a null hypothesis). Next, switch gears and cover Method of Moments, compare it to MLE, and take a look at a few examples.
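One common textbook form of the sample size calculation is sketched below: a two-sided, two-sample z-test for a difference in means with known, equal standard deviation, where n per group = 2 (z_{1-α/2} + z_{power})² σ² / δ². This particular setup is an assumption chosen for illustration and may not match the examples used in class. The course itself works in R; Python is used here only for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Sample size per group for a two-sided two-sample z-test.

    delta: smallest difference in means worth detecting
    sigma: common standard deviation (assumed known)
    alpha: false positive rate
    power: probability of detecting a true difference of size delta
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * sigma ** 2 / delta ** 2
    return math.ceil(n)


# Hypothetical inputs: detect a half-standard-deviation difference.
n = sample_size_per_group(delta=0.5, sigma=1.0, alpha=0.05, power=0.80)
print("Required sample size per group:", n)
```

Because the required n scales with 1/δ², halving the detectable difference roughly quadruples the required sample size.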

Class 11
Bootstrapping, the Information Matrix & Variance Bound

Discuss some options one can use if dealing with small amounts of data, specifically the bootstrap method. Touch upon the information matrix and how to calculate a theoretical lower bound on the variance of any statistic of interest.
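A percentile bootstrap confidence interval is one simple variant of the bootstrap method; whether the class uses this variant is not stated, so treat it as an assumption. The course itself works in R; Python is used here only for illustration, and the data and the helper name `bootstrap_ci` are hypothetical.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Resamples the data with replacement, recomputes the statistic each
    time, and returns the empirical quantiles of those estimates.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(data)
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical small sample, made up for illustration.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]

lower, upper = bootstrap_ci(data)
print(f"Sample mean = {statistics.mean(data):.2f}")
print(f"95% bootstrap CI = ({lower:.2f}, {upper:.2f})")
```

The same function works for other statistics, for example by passing `stat=statistics.median`.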

Class 12
Expectation-Maximization Algorithm, Bias/Variance Trade Off

Explore details of the expectation-maximization algorithm and how it’s used for estimation in the presence of latent variables. Work through an analytical example, then implement it in R. Cover the bias/variance tradeoff when modeling and the pitfalls of overfitting.

Live Online Interactive Learning

Learn from world-class data science practitioners.

Our Live Online instructors bring deep industry experience from a broad range of industries and companies including Viacom, Spotify, and Capital One Labs. You’ll have an Instructor and Assistant Instructor to support you throughout your learning process.

Interact with instructors and classmates in real-time.

This course is truly live, which means you can interact with the instructors and your fellow students in real-time. Stay engaged by asking questions and participating in polls and conversations, and join your course Slack channel for additional support, communication, and collaboration.

Learn online without sacrificing the value of live instruction.

The world is your classroom. Log in from wherever you are and gain access to live, interactive data science instruction that will push your career further in the right direction. In case you have to miss a class, you can access all recordings 24/7 to stay caught up and refer back.

Earn CEUs for accredited courses.

Not only will you walk away with new data science skills and knowledge, you’ll also earn up to 3.3 Continuing Education Units (CEUs). Our courses are accredited by ACCET, which requires that we maintain high standards in areas such as quality of instruction and positive student feedback.

Register for an on-demand sample class

Our 1-hour on-demand sample class is a great way to preview what the Live Online experience is like for the Statistics for Data Science professional development course.

Nathan Grossman, instructor of the Statistics for Data Science course, will cover a few introductory Python programming topics and mathematical principles.

  1. Brief Introduction to Probability
  2. Law of Large Numbers
  3. Introduction to Random Variables
  4. Followed by Q&A
