0:09

However, we need something that's a bit easier to work with for numeric outcomes of experiments, where here I'm using the word experiment in a broad sense. Densities and mass functions for random variables are the best starting point for this, and they are all we will need. Probably the most famous example of a density is the so-called bell curve. In this class you'll actually learn what it means to say that the data follow a bell curve.

0:34

You'll actually learn what the bell curve represents. Among other things, you'll know that when you talk about probabilities associated with the bell curve or the normal distribution, you're talking about population quantities. You're not making statements about what occurs in the data. Where we're going with this is that we're going to collect data that will be used to estimate properties of the population. That's where we'd like to head. But before we start working with the data, we need to develop our intuition for how the population quantities work.

1:09

A random variable is the numeric outcome of an experiment. The random variables that we will study come in two kinds: discrete and continuous. Discrete random variables are ones whose values you can count, like the number of web hits, or the different outcomes that a die can take. They can even represent things that aren't numeric, like hair color, if we assign a numeric value: one for blonde, two for brown, three for black, and so on.

1:35

Continuous random variables can take any value in a continuum. The way we work with discrete random variables is to assign a probability to every value they can take. The way we work with continuous random variables is to assign probabilities to ranges of values they can take.

1:58

So the biggest ones for building up our intuition in this class will be the flip of a coin, where we'll say heads or tails, or 0 or 1. This is a discrete random variable, because it can only take two levels. The outcome of the roll of a die is another discrete random variable, because it can only take one of six possible values. These are somewhat silly random variables, because their probability mechanics are so conceptually simple. Some more complex random variables are given below. For example, the website traffic on a given day, the number of web hits, would be a count random variable. We'll likely treat that as discrete; however, we don't really have an upper bound on it, so it's an interesting kind of discrete random variable. We'll likely use the Poisson distribution to model it.
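As an illustrative sketch (this code is not from the lecture, and the rate of 5 hits per day is an assumed example), the Poisson mass function assigns a probability to every count 0, 1, 2, and so on, with no upper bound:

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson random variable with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# With an assumed average of 5 hits per day, the probability of exactly 3 hits:
p3 = poisson_pmf(3, 5.0)

# Although the support is unbounded, the probabilities still sum to 1;
# summing over 0..100 captures essentially all of the mass here.
total = sum(poisson_pmf(k, 5.0) for k in range(101))
```

Even though there is no largest possible count, the probabilities over all counts still add up to one, which is what makes this a legitimate discrete distribution.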

3:00

The hypertension status of a subject randomly drawn from a population could be another random variable. Here we might give a person a one if they have or were diagnosed with hypertension, and a zero otherwise. This random variable would likely be modeled as discrete.

3:18

The number of people who click on an ad is another discrete random variable, but an unbounded one. Still, we would assign a probability to zero clicks, one click, two clicks, three clicks, and so on.

3:38

When we talked about discrete random variables, we said that the way we're going to work with probability for them is to assign a probability to every value they can take. So why don't we just call that a function? We'll call it the probability mass function (PMF). This is simply the function that takes any value that a discrete random variable can take and assigns the probability that it takes that specific value. So a PMF for a die roll would assign one sixth to the value one, one sixth to the value two, one sixth to the value three, and so on.

A PMF must satisfy certain rules in order to obey the basic rules of probability that we outlined at the beginning of the class. First, it must always be larger than or equal to zero, because we've already seen that a probability has to be a number between zero and one, inclusive. Second, the probabilities of all the values the random variable can take have to add up to one. For example, if I add up the probability that a die takes the value one, plus the probability that it takes the value two, and so on through six, that has to add up to one. Otherwise the probability of something happening, namely that the die takes one of its possible values, would not be one, which would violate one of our basic tenets of probability. So a PMF has to satisfy just these two rules. We won't worry too much about these rules. Instead, we will work with probability mass functions that are particularly useful, like the binomial, the canonical one for flipping a coin, and the Poisson, the canonical one for modeling counts.
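As a small illustrative check (not part of the lecture), the two PMF rules are easy to verify for a fair die:

```python
# PMF of a fair six-sided die: each face gets probability 1/6.
die_pmf = {face: 1 / 6 for face in range(1, 7)}

# Rule 1: every probability is greater than or equal to zero.
all_nonnegative = all(p >= 0 for p in die_pmf.values())

# Rule 2: the probabilities over all possible values sum to one
# (up to floating-point rounding).
total = sum(die_pmf.values())
```

Any function of the possible values that passes these two checks is a valid probability mass function.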

5:30

Here we're using the notation where an uppercase letter represents a potentially unrealized value of the random variable. So it makes sense to talk about the probability that capital X equals 0 and the probability that capital X equals 1, whereas a lowercase x is just a placeholder that we're going to plug a specific number into. So in the PMF down here, we have p(x) equals one half to the x, times one half to the one minus x. If we plug in x equal to 0 we get one half, and if we plug in x equal to 1 we get one half. This merely says that the probability that the random variable takes the value 0 is one half, and the probability that it takes the value 1 is also one half. This is for a fair coin.

But what if we have an unfair coin? Let theta be the probability of a head and 1 minus theta be the probability of a tail, where theta is some number between 0 and 1. Then we could write our probability mass function like this: p(x) is theta to the x, times 1 minus theta to the 1 minus x. Notice that if I plug in a lowercase x of 1 I get theta, and if I plug in a lowercase x of 0 I get 1 minus theta. Thus for this population distribution, the probability that the random variable takes the value 0 is 1 minus theta, and the probability that it takes the value 1 is theta. This is incredibly useful, for example, for modeling the prevalence of something. If we wanted to model the prevalence of hypertension, we might assume that the sample we're getting is not unlike flips of a biased coin with success probability theta.
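This Bernoulli mass function is only a couple of lines of code. The sketch below is illustrative rather than from the lecture, and the value theta = 0.3 is an assumed example:

```python
def bernoulli_pmf(x: int, theta: float) -> float:
    """P(X = x) for a coin with head probability theta; x is 0 or 1."""
    return theta**x * (1 - theta) ** (1 - x)

# A fair coin (theta = 0.5) gives one half for both outcomes.
p_fair_tail = bernoulli_pmf(0, 0.5)
p_fair_head = bernoulli_pmf(1, 0.5)

# An assumed biased coin with theta = 0.3: plugging in x = 1 gives theta,
# and plugging in x = 0 gives 1 - theta.
p_head = bernoulli_pmf(1, 0.3)
p_tail = bernoulli_pmf(0, 0.3)
```

Plugging in x of 1 or 0 picks out theta or 1 minus theta, exactly as the algebra in the lecture shows.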
