Chapter 6: Probability

Concept of probability forms the foundation of inferential statistics

describes the likelihood of a given event

Basic example: coin-flip

each act of flipping the coin is a trial

each unique outcome (heads or tails) is an event

So, probability is describing likelihood of getting heads or tails on each flip

with fair coin, we have two outcomes that are equally likely to occur

In formal terms, the probability of a given event (A) is defined as the number of observations favoring event A divided by the total number of possible observations

p(A) = (number of observations favoring/satisfying event A) / (total number of possible observations)

Coin example: there are two possible observations (heads or tails). If event A is getting tails, then probability is ½ or .50
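
As a small illustration of the definition, the sketch below counts the observations favoring an event and divides by the total number of possible observations. The function name `probability` is an assumption made for this example, and the calculation assumes every possible observation is equally likely (as with a fair coin).

```python
from fractions import Fraction

def probability(favorable, possible):
    """p(A) = observations favoring A / total possible observations.

    Assumes every possible observation is equally likely.
    """
    favorable = set(favorable)
    possible = set(possible)
    # Only count favorable observations that are actually possible.
    return Fraction(len(favorable & possible), len(possible))

coin = {"heads", "tails"}           # the two possible observations
p_tails = probability({"tails"}, coin)

print(p_tails, float(p_tails))      # 1/2 0.5
```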

It is also possible to determine the probability of events when measuring more than one variable

Summarize the findings in a contingency table (counts for every combination of the variables' categories)

Another useful way to analyze data is in terms of Conditional Probability

likelihood of one event (A) given some other event (B)

p(A|B) = (number of observations favoring both event A and event B) / (number of observations favoring event B)

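This type of probability analysis is useful for examining the independence of events

is the probability of event A higher (or lower) when event B has occurred? If the probability of A given B equals the overall probability of A, the events are independent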
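
A conditional probability can be read straight off a contingency table. The sketch below uses invented counts (100 hypothetical students, cross-classified by whether they studied and whether they passed); the data, the labels, and the function name `conditional` are all assumptions for illustration, not results from the chapter.

```python
# Hypothetical contingency table (invented counts), keyed by (studied?, result)
table = {
    ("studied", "passed"): 40,
    ("studied", "failed"): 10,
    ("did not study", "passed"): 20,
    ("did not study", "failed"): 30,
}

def conditional(table, a, b):
    """Probability of result a given condition b:
    observations favoring both a and b / observations favoring b."""
    both = table[(b, a)]
    b_total = sum(n for (row, _), n in table.items() if row == b)
    return both / b_total

total = sum(table.values())

# Overall probability of passing, ignoring whether the student studied.
p_pass = sum(n for (_, col), n in table.items() if col == "passed") / total

# Probability of passing among only those who studied.
p_pass_given_studied = conditional(table, "passed", "studied")

print(p_pass)                # 0.6
print(p_pass_given_studied)  # 0.8
```

In these made-up data, passing is more likely once we know the student studied (.80 versus .60 overall), so the two events would not be independent.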

Another useful way to analyze data is in terms of Joint Probability

likelihood of two events occurring

p(A,B) = (number of observations favoring both event A and event B) / (total number of observations)
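
The joint probability uses the same kind of table, but divides by the total number of observations rather than by one row. This is a minimal sketch with invented counts (100 hypothetical students); the data and the function name `joint` are assumptions for illustration only.

```python
from math import isclose

# Hypothetical contingency table (invented counts), keyed by (studied?, result)
counts = {
    ("studied", "passed"): 40,
    ("studied", "failed"): 10,
    ("did not study", "passed"): 20,
    ("did not study", "failed"): 30,
}

def joint(counts, row, col):
    """Joint probability: observations favoring both events / total observations."""
    return counts[(row, col)] / sum(counts.values())

p_studied_and_passed = joint(counts, "studied", "passed")
print(p_studied_and_passed)  # 0.4

# Every observation falls in exactly one cell, so the joint
# probabilities of all cells add up to 1.
all_cells = sum(joint(counts, row, col) for (row, col) in counts)
print(isclose(all_cells, 1.0))  # True
```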

What does any of this have to do with Statistics? Remember that we often cannot measure the entire population we are interested in studying, and instead have to take a random sample of people from that population to investigate. This relates to the concept of sampling with or without replacement. Be sure to read this part of the chapter in your text!
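
To make the difference between the two kinds of sampling concrete, here is a rough sketch using Python's standard `random` module. The population of 20 numbered people and the sample size of 5 are assumptions chosen for illustration.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 21))   # hypothetical population: people labeled 1..20

# Without replacement: once a person is drawn they are not returned,
# so nobody can appear in the sample twice.
without_replacement = rng.sample(population, k=5)

# With replacement: each person is returned before the next draw,
# so the same person may be drawn more than once.
with_replacement = rng.choices(population, k=5)

print(without_replacement)
print(with_replacement)
```

With replacement, the probability of drawing any particular person stays at 1/20 on every draw; without replacement, it changes as the pool of remaining people shrinks.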

We are often interested in answering a specific question or testing a hypothesis.

For example, we might test someone’s understanding of some course material based on the number of questions they can get right on a test.
For each question there are two possible outcomes or events (getting the question right or wrong).
We can call getting a question right a "success," with probability p, and getting a question wrong a "failure," with probability q.

If our test has 20 questions, there are 20 trials on which a person can either have a success or a failure. We can calculate the mean score predicted for a person who is just guessing on the questions:

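μ = np

where μ = the mean (expected) number of successes, n = number of trials, and p = probability of success

So, in our example, assuming a guess is right half the time (p = .50, as on a true/false test), a person who is just guessing should expect to get 20 × .50 = 10 questions right.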
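
The expected score can be checked with a short simulation. This is only a sketch: it assumes a guess is right with probability .50 (as on a true/false test), and the function names and the choice of 10,000 simulated test-takers are made up for illustration.

```python
import random

def expected_correct(n, p):
    """Mean number of successes in n trials with success probability p."""
    return n * p

def simulate_score(n, p, rng):
    """Score of one person guessing on n questions."""
    return sum(rng.random() < p for _ in range(n))

n, p = 20, 0.5                    # 20 questions, assumed chance of a right guess
print(expected_correct(n, p))     # 10.0

rng = random.Random(1)            # fixed seed so the sketch is repeatable
scores = [simulate_score(n, p, rng) for _ in range(10_000)]
average = sum(scores) / len(scores)
print(average)                    # should land close to 10
```

Any single guesser may score above or below 10; it is the average over many guessers that settles near the mean.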

This would relate to the concept of hypothesis testing.

Before giving a person the test, we could assume (hypothesize) that they do not understand the course material. Thus, we could predict an expected result for the study.

In this example, we could predict that the person, who is assumed to not understand the material, would perform at chance level. This would be a NULL Hypothesis.

If our observations (the person’s score on the test) were so discrepant from the predicted result that the difference is unlikely to be due to chance then we would reject the NULL Hypothesis in favor of the competing ALTERNATIVE Hypothesis (that the person does understand the material).

The problem is how to distinguish chance from non-chance results, under the assumption that the null hypothesis is true.

This is based on a probability value known as the Alpha Level, which defines what counts as a "non-chance" event. The alpha level is typically set to .05.
So, if we find that the probability of a result at least as extreme as ours, assuming the null hypothesis is true, is less than .05, then we treat it as a non-chance event and reject the null hypothesis in favor of the alternative hypothesis.
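
The decision rule can be sketched for the 20-question test. Several things here are assumptions added for illustration: the observed score of 15 is invented, a guess is assumed to be right with probability .50, and only high scores are counted as evidence against the null hypothesis (a one-tailed comparison). The function name `prob_at_least` is also made up.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more successes in n trials,
    when each trial succeeds with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

ALPHA = 0.05          # conventional alpha level
n = 20                # number of questions (trials)
p_chance = 0.5        # assumed probability of a right guess under the null hypothesis
observed = 15         # hypothetical test score

# Probability of scoring 15 or higher if the person is only guessing.
p_value = prob_at_least(observed, n, p_chance)
reject_null = p_value < ALPHA

print(round(p_value, 4))  # 0.0207
print(reject_null)        # True
```

Under these assumptions a score of 15 would be unlikely enough to reject the null hypothesis, while a score of 14 (probability about .058 of doing at least that well by guessing) would not.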