# Making Statistics ~~fun~~ *tolerable*

# Chris is a Stanford-educated tutor with over 10 years of experience tutoring Statistics to students of all abilities, from students struggling to get from a C to a B, to go-getters trying to move an A- up to an A, to struggling students just hoping to pass. In that time he got a lot of experience learning how to explain this stuff in a way that actually makes sense to non-math people. Through his videos he has helped countless students, and he can do the same for you.

# Statistics

# Stats Basics:

#### This chapter covers lots of vocab you'll see early in your stats class: sample vs population vs individual; descriptive vs inferential statistics; parameter vs statistic; quantitative vs qualitative vs categorical data; continuous vs discrete; the normal distribution; outliers; reliability vs validity; variability vs dispersion; statistical significance & confidence interval; correlation vs causation; census vs sample.

#### This chapter covers the main types of variables you'll see in stats: explanatory variables, response variables, lurking variables, and confounding variables.

#### Observational study, designed experiment, prospective (longitudinal or cohort) vs cross-sectional vs retrospective studies, double-blind, placebo-controlled study.

#### Lots of sampling methods: simple random sampling, convenience sampling, sampling with and without replacement, stratified vs cluster sampling, and systematic sampling!

#### This chapter covers the "levels of measurement" -- ratio, interval, ordinal and nominal. They don't make a whole lot of sense when you first learn them, but after this video you'll hopefully see that if you understand the ratio level first, the others fall into place.

#### In statistics, you're taking a sample in order to find something out about the population. These videos cover the various ways that either a sample is not representative of the population, or the sample itself *is* representative yet the data you get from the sample isn't accurate to the sample (thus not to the population either). Random sampling error, Nonrandom sampling error (non-random), Nonsampling error (non-sampling).

#### The empirical rule and Chebyshev's theorem are just a couple of little rules of thumb which tell you some vague things about a distribution. You'll never see these again after the first test!

#### The videos in this chapter introduce the basics of what distribution graphs are and what you can figure out from them (hint: area=probability), then apply that knowledge to a bunch of situations, including bi-modal distributions, uniform distributions, kurtosis, skewness, etc.

# Graphing & Charting Data:

#### These are those tables where the data are sorted into ranges and then tallied. While these most definitely *look* like tables, some teachers call them distributions. These videos cover standard frequency tables, as well as relative and cumulative versions, and vocab such as upper and lower class limits, class midpoints, class boundaries, and class widths.

#### When a book or teacher refers to a frequency distribution, usually they're referring to the graphical representation of that distribution, which may have a "bell" or "normal" shape. Technically, that graph is a histogram, so this chapter covers how to produce those histograms, both by hand and on your calculator.

#### No matter which calculator you use, these videos are going to save your bacon, and/or set you off on the right foot. Especially awesome is the discussion of how to set your window, because in no other area of math are the TI calculators so ridiculous: "Why the heck is my calculator displaying a bar graph where each bar has a width of 1.66666666666663?" You'll never know why, but with these videos it won't matter, because I'll show you how to set up your window to never confuse you again (and do other stats stuff good too).

#### Normal quantile graphs really only have one use: they tell you if your data set is normally distributed or not. This chapter covers how to make them on your calculator, how to interpret them, and also some patterns you can learn to recognize that indicate in what way your data isn't normal.

#### Scatterplots usually come up again in later chapters of statistics class, when it's time to analyze them for correlation constants, curve fits, etc. In these videos we'll just introduce what a scatter plot is, and show you how to graph one on your calculator.

#### This chapter covers a bunch of the quick plots that some classes cover but which you'll never see much after the first chapter of the book. Many of them are ones you've seen since grade school under other names, others will just have you scratching your head as to why they even exist.

# Numerical Data Analysis:

#### This chapter covers all the different numbers that can be used to describe the "center" of the data. Mean (average), median and mode are ones you've seen since middle school, then to that mix we add midrange and a few others and show you how to do them the easy way on your calculator.

#### This chapter covers a couple of vocab words that you'll be expected to know for your statistics class. Will they be on the test? Depends on your teacher, though almost everyone is supposed to know that standard deviation is biased, whatever that means.

#### This chapter covers all those numbers that can describe how spread out or narrow a distribution is: standard deviation, variance and range. It also covers a few of the less concrete ways to estimate those measures when data is incomplete, such as using the range rule of thumb to estimate standard deviation. And with each statistic covered, I show you how to find it on your calculator, and how the numbers are different for samples vs populations.

#### This chapter covers everything you'll need to calculate, understand, and answer questions about z-score, a.k.a. "standard score", a.k.a. "z-value".

#### Percentiles are kind of funky because you've been hearing about them for most of your academic career, yet it turns out calculating them is a bit more difficult. For one thing, there's more than one definition of how to calculate them, and not everyone agrees! And going backwards -- being given a percentile and then having to calculate the cutoff score -- has even more disagreement. As always, ask your teacher, but this chapter covers the most common method I've come across.

#### This chapter covers all you could ever want to know about boxplots, aka box-and-whisker diagrams: How to use IQR to identify outliers and make them part of the "modified" portion of your modified boxplot; How to do it all on your calculator; how to get your roommates to respect your boundaries (you wish).
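Here's a rough sketch of the IQR outlier check, using the common "1.5 x IQR" fences. The data set is made up for illustration, and Python's `statistics.quantiles` uses one particular quartile definition, so your book or calculator may give slightly different values for Q1 and Q3.

```python
import statistics

# Hypothetical data set, made up for illustration only.
data = [2, 4, 5, 6, 7, 8, 9, 40]

# Quartile conventions differ between books, calculators, and software;
# this is just the default one used by statistics.quantiles.
q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Common 1.5 x IQR fences: anything outside them gets flagged as an outlier.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("Fences:", lower_fence, "to", upper_fence)
print("Outliers:", outliers)
```

In a modified boxplot, the flagged values are drawn as separate points and the whiskers stop at the most extreme values that are still inside the fences.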

# Probability:

#### This chapter covers so much probability vocab! Besides everything on the list, we've got exciting synonyms as well, like: mutually exclusive events, false positives, independent events and dependent events.

#### This chapter covers most of the typical ways to estimate probability for standard situations. Also covered are some of the basic rules that make problems easier, like the fact that all the probabilities in a given situation have to add up to one, and how that helps you calculate them.

#### This chapter is short, but it covers a very important concept for business: expected value. How do you make a decision between two different paths that have different probabilities of success but different payoffs? How do you place a value on website visitors if most of them never buy anything? And how is this different from weighted average?
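To make the website-visitor question concrete, here's a minimal sketch. The 2% purchase rate and the $50 order value are hypothetical numbers I made up for illustration; the only real idea is "multiply each payoff by its probability, then add."

```python
# Hypothetical numbers, made up for illustration:
# 2% of visitors buy, and each purchase is worth $50.
outcomes = [
    (0.02, 50.0),  # (probability, payoff) for a visitor who buys
    (0.98, 0.0),   # (probability, payoff) for a visitor who buys nothing
]

# The probabilities of all possible outcomes have to add up to 1.
total_probability = sum(p for p, _ in outcomes)

# Expected value = sum of (probability x payoff) over every outcome.
expected_value = sum(p * payoff for p, payoff in outcomes)

print(f"Expected value per visitor: ${expected_value:.2f}")
```

Under those assumed numbers, each visitor is worth about a dollar on average, even though most of them never buy anything.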

#### These videos cover contingency tables, a.k.a. two-way tables, where things like survey results are broken out into a grid with totals at the bottom of each column and end of each row. I'll show you how to make them, how to fill them out, how to calculate proportions and probabilities from them, and how to fill in the missing spots on the table. These are so useful, you'll find yourself using them even when you don't have to!

#### This chapter explains how to use the addition rule to calculate the probability of an "or" compound event. Sometimes it's obvious, such as "Calculate the probability of rolling a 3 OR a 6", other times there's an "or" in disguise, such as "Calculate the probability of rolling an even number", which is really "2 or 4 or 6."
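Both dice examples can be checked with a few lines of Python. This is just a sketch for one fair six-sided die, where the outcomes being added are mutually exclusive (you can't roll a 3 and a 6 at the same time), so the simple form of the addition rule applies. The helper name `probability` is my own.

```python
from fractions import Fraction

die = range(1, 7)  # faces of one fair six-sided die

def probability(event):
    """Fraction of the equally likely faces for which event(face) is true."""
    favorable = sum(1 for face in die if event(face))
    return Fraction(favorable, len(die))

# "3 OR 6": the two outcomes can't both happen, so add P(3) + P(6).
p_3_or_6 = probability(lambda f: f == 3) + probability(lambda f: f == 6)

# "Even" is really "2 or 4 or 6" in disguise.
p_even = sum(probability(lambda f, k=k: f == k) for k in (2, 4, 6))

print(p_3_or_6)  # 1/3
print(p_even)    # 1/2
```

When the events can overlap, you also have to subtract the probability of the overlap so it isn't counted twice.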

#### This chapter covers the multiplication rule for probability in "and" problems, such as the probability of rolling two dice and getting a 5 "and" 6, or flipping two coins and getting a head "and" a tails. (As the name implies, multiplication is involved.) These videos also cover hybrid problems where you need both the addition rule and multiplication rule for different steps, and the 5% "approximation" rule (a.k.a. fudge factor) for treating dependent events as independent.

#### This chapter covers problems where you're supposed to find the probability of one thing "given that" something else has already happened. For example, in a political survey, you may be given survey results in a contingency table and asked, "What is the proportion of respondents who intend to vote for Candidate A given that they said they are Republican?"

#### Make $_nC_r$ and $_nP_r$ pay for what they've done by mastering them and using them to execute on your upcoming test. Also in this chapter: brush up for this common SAT question.

# Discrete Probability Distributions:

#### Once you've already learned about basic probability, this chapter takes things in a more Stats-heavy direction by introducing the vocab of probability "distributions", and covers the 3 rules that every probability distribution must follow.

#### These videos cover formulas to find the various parameters of a discrete probability distribution when you're given a contingency table of outcomes and probabilities. If you're working with a specific type of distribution -- Binomial, Poisson, etc -- check out those chapters instead.

#### These videos cover everything you'll need to know to do binomial probability distribution problems -- stuff like "If you flip a coin 7 times, what are the chances of getting exactly 5 heads?" The hardest part of these problems is knowing which problems to use the binomial formula on, and then knowing what numbers to plug in where. These formulas all have both p's and q's in them.
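As a quick sketch, here's the coin-flip question worked with the standard binomial formula, assuming a fair coin (p = 0.5). The function name `binomial_pmf` is my own label.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, success chance p)."""
    q = 1 - p  # chance of failure on a single trial
    return comb(n, k) * p**k * q**(n - k)

# 7 flips of a fair coin, exactly 5 heads.
p_five_heads = binomial_pmf(k=5, n=7, p=0.5)
print(p_five_heads)  # 0.1640625
```

So the chances are about 16.4%: there are 21 ways to pick which 5 of the 7 flips are heads, and each specific sequence of 7 flips has probability 1/128.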

#### These videos cover everything you'll need to know to do Poisson probability distribution problems -- problems that deal with events in a time period, like how many customers walk into a Starbucks between 10:00 and 11:00 each morning. Also covered is how to tell these apart from binomial problems.

#### This topic isn't covered by many Stats profs, but if yours covers it, you'll want to see these videos. I cover how to do them, as well as how to spot them on your test.

# Continuous Probability Distributions:

#### This chapter introduces the basics of continuous probability distributions, also known as density curves, and explains the difference between them and the discrete distributions you've been using up to this point in Stats class.

#### These videos cover all those crazy word problems where they give you the mean and standard deviation of a population, then ask you what percentage of that population is shorter or taller or heavier or lighter than some cutoff value. That's what the normal distribution is all about. And of course we'll also cover how to do all this on your calculator.

#### This chapter covers a type of distribution that many classes don't cover -- exponential distributions -- which tell you the chances of an event happening in the immediate future. For example, if someone walks into Starbucks every 2 minutes on average, what are the chances of someone walking in within the next 1 minute?
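Here's a minimal sketch of the Starbucks example, assuming arrivals follow an exponential distribution with an average gap of 2 minutes (so the rate is 0.5 arrivals per minute). The variable names are my own.

```python
from math import exp

mean_gap_minutes = 2.0       # one arrival every 2 minutes on average
rate = 1 / mean_gap_minutes  # 0.5 arrivals per minute
t = 1.0                      # length of the window we care about, in minutes

# Exponential CDF: P(next arrival happens within t) = 1 - e^(-rate * t)
p_within_t = 1 - exp(-rate * t)
print(round(p_within_t, 4))  # 0.3935
```

So there's roughly a 39% chance that someone walks in during the next minute.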

#### Normal quantile graphs really only have one use: they tell you if your data set is normally distributed or not. This chapter covers how to make them on your calculator, how to interpret them, and also some patterns you can learn to recognize that indicate in what way your data isn't normal.

# Sampling Distributions & Confidence Intervals:

#### In statistics, you're taking a sample in order to find something out about the population. These videos cover the various ways that either a sample is not representative of the population, or the sample itself *is* representative yet the data you get from the sample isn't accurate to the sample (thus not to the population either). Random sampling error, Nonrandom sampling error (non-random), Nonsampling error (non-sampling).

#### These videos cover sampling distributions of the mean and proportion, also known as the "distribution of the sample mean" and "distribution of the sample proportion." Lots of examples and vocab, including p-hat (phat), so that you'll be ready for the central limit theorem.

#### This chapter tells you what the central limit theorem says, shows you how to draw those little distributions with all the x-bars and p-hats piled up into normal distributions, and explains the rules for when you're allowed to use this thing. The problems you can do with this thing come in the next chapter.

#### In the previous chapter we took a look at the Central Limit Theorem from a more theoretical viewpoint, looking at distributions of p-hats and x-bars. Now we'll get into doing lots of examples of the two types of word problems that use the CLT: they give you a population's mean and standard deviation or proportion, then ask you the probability that a random sample would have a certain average or proportion.

#### Not all classes cover this topic. The basic idea is that some binomial distribution problems -- for example, finding the probability that if you flip a coin 8 times you'll get 5 or more heads -- get really time-consuming for larger numbers of flips (trials). It's the "or more" that gets you. Using z-values makes this type of problem a lot faster and easier!
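To see the idea in action, here's a sketch comparing the exact binomial answer for "5 or more heads in 8 flips" with the z-value shortcut, assuming a fair coin and the usual continuity correction (starting the area at 4.5 instead of 5). With only 8 flips, the sample-size conditions many classes require for this shortcut may not be met, so treat it purely as an illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 8, 0.5  # 8 flips of a fair coin

# Exact answer: add up the binomial probabilities for 5, 6, 7 and 8 heads.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(5, n + 1))

# Normal approximation: use the binomial's mean and standard deviation,
# with a continuity correction ("5 or more" becomes "4.5 or more").
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = 1 - NormalDist(mu, sigma).cdf(4.5)

print(round(exact, 4))   # 0.3633
print(round(approx, 4))  # close to the exact value
```

With 8 flips the exact sum is only four terms, so the shortcut doesn't save much. The payoff comes with hundreds of trials, where the "or more" sum gets long.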

#### This chapter introduces the concept of confidence intervals, along with the most important concepts you're supposed to understand for multiple choice type questions. Also very importantly, it explains the definition of confidence intervals -- and what they are NOT -- because for some reason every Stats teacher and book seems to make a really big deal about the exact words you use to describe what a confidence interval tells you. Sticklers!

#### The next few chapters each cover a specific type of confidence interval problem. This chapter covers the first type that most books cover, the ones where they tell you the mean of a sample but sigma FOR THE ENTIRE POPULATION and ask you to calculate a confidence interval to estimate the mean of the population. How, you may ask, would you know the standard deviation of an entire population but not the mean? Exactly. These are just a non-realistic type of problems that books use to introduce the concept of confidence intervals.
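Here's a rough sketch of this type of problem, using the standard formula "sample mean plus or minus z* times sigma over the square root of n." All of the numbers (mean of 100, sigma of 15, n of 36, 95% confidence) are hypothetical values I made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers, made up for illustration only.
sample_mean = 100.0  # mean of the sample
sigma = 15.0         # standard deviation of the ENTIRE population (given)
n = 36               # sample size
confidence = 0.95    # confidence level

# Critical z-value that leaves (1 - confidence) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error and the interval around the sample mean.
margin = z_star * sigma / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(f"z* = {z_star:.3f}, margin of error = {margin:.2f}")
print(f"Confidence interval: {lower:.2f} to {upper:.2f}")
```

With these assumed numbers the margin of error comes out to about 4.9, so the interval runs from roughly 95.1 to 104.9.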

#### This chapter introduces "Student's t-Distribution", which is kind of like the normal distribution, except it's got t-values instead of z-values. You'll be using t-values tons from here on out, for lots of different types of Student t-Tests, where you use t instead of z for smaller sample sizes and when you don't know the population standard deviation (i.e. real world applications).

#### This chapter covers the type of confidence interval problem where they give you the stats of the sample -- mean and standard deviation -- but you do not know sigma (standard deviation of the population), so you have to use the Student t-distribution to calculate your margin of error and confidence interval.

#### These videos cover problems where you are given a proportion or percentage of a sample which has some quality (p), and you are asked to estimate a confidence interval for the population proportion with that same characteristic.

#### This chapter covers the basics of the Chi-Square Distribution ($\chi^2$). Chi-Square gets a lot of play throughout stats, so this chapter just covers the basics of the distribution, its properties, the $\chi^2$ table and what happens for larger degrees of freedom.

#### This chapter covers the type of confidence interval problem where they give you some info about a sample, like its standard deviation, and then you use the Chi-Square Distribution and a couple of formulas to spit out a confidence interval estimate of the standard deviation of the population the sample was taken from.

# Hypothesis Testing:

#### Hypothesis testing is super-confusing for every student, right up until the day that you "get it", at which point it becomes a simple matter of plug-and-chug. This chapter is one you MUST WATCH if you are doing hypothesis testing, because its only purpose is to get you to that magic "I get it" moment sooner rather than later. If you're confused by hypothesis testing, forget everything you heard in class and just watch these videos in order. You'll be glad you did!

#### "Testing claims" is just another way of saying "hypothesis testing". This chapter contains three videos, one each for the three different ways that your teacher may want you to know how to do these problems: P-values, critical z-values, or plugging-and-chugging on your calculator.

#### This chapter contains videos covering each of the three different ways that your teacher may want you to know how to test hypotheses involving sample means: P-values, critical t-values (Student t-Test), or plugging-and-chugging on your calculator. A video also covers how to get a critical t-value from the t-table when you're not allowed to use a calculator.

#### This chapter explains the least-common type of hypothesis test: comparing the standard deviation of a sample to the standard deviation of the population using a Chi-Square test. Since you've already done so many hypothesis test problems about proportions and means, this is a relatively plug-and-chug affair.

#### Testing hypotheses comparing the means of two samples (usually with a Student t-Test). Sounds simple enough, but this topic takes up a lot of space in your book because there are different formulas for each specific situation: two samples where the corresponding population standard deviations are known; two samples where the population standard deviations are unknown but assumed equal; population standard deviations unknown and assumed unequal; and paired data (dependent samples). Lots of stuff to cover!

#### These videos cover the very specific type of question where you are testing proportions of samples (such as percentage who like pizza) to see if one sample has a significantly different proportion than the other. As usual with proportions, you'll be seeing a lot of p's, q's, and n's!

#### At this point we've been at this hypothesis testing thing for a while, so you know what's coming: more formulas, more plug-and-chug, yet another table of critical values, and some predictable null and alternative hypotheses!

#### For a while now we've been doing hypothesis tests, usually comparing just two samples. The only thing that's new with ANOVA, or Analysis of Variance, is that it allows you to compare the means of as many samples as you want. And you want to compare lots of samples, right?

#### Goodness-of-fit problems are all about asking yourself the question: "Does the data in this frequency table match a pattern I'm expecting?" It's yet another type of hypothesis test. In the "one-way" versions of these problems, there's basically just one column of numbers. In the "two-way" versions you're dealing with a contingency table ("two-way" table) with two or more columns of data, plus a bunch of rows, so things get trickier, but the key is to try not to think too much and instead put your faith in the plug-and-chug.

# Correlation & Regression:

#### These videos cover the Pearson Coefficient of Linear Correlation, or r-value, which basically tells you if your scatter plot (or plots) of data pairs lies along a line or not. Also covered are hypothesis testing of linear correlation, and calculating r on your calculator.

#### This 9-video section covers everything you could ever need to know about linear regression analysis... In "non-math terms", what is a statistical Model? What the heck is a linear regression? What is the "method of least squares"? Using your calculator to do linear regressions, graph the equations, and make predictions for specific values of x. How to calculate standard error on your calculator. Prediction intervals using linear regression. What is the difference between "prediction" and "extrapolation", and why is my teacher always accusing people of extrapolating? What are residuals, a.k.a. "unexplained variation", and what's that crazy diagram with the lines and triangles that my teacher keeps drawing? What is the coefficient of determination? So much to cover, let's get to it!