8. Probability & Statistics

Lesson

To predict the future, we sometimes need to estimate a probability by running experiments or by looking at data that has already been collected. This is called experimental probability, because we estimate the probability of each outcome from past events.

Imagine we have a "loaded" die, where a weight is placed inside the die opposite the face that the cheater wants to come up the most (in this case, the $6$).

If the die is made like this, the probability of each outcome is no longer equal, and we cannot say that the probability of rolling any particular face is $\frac{1}{6}$.

Instead, we will need to roll the die many times, record our results, and use these results to predict the future. Here are the results of an experiment where the die was rolled $200$ times:

Result | Number of rolls |
---|---|
$1$ | $11$ |
$2$ | $19$ |
$3$ | $18$ |
$4$ | $18$ |
$5$ | $20$ |
$6$ | $114$ |

We can now try to predict the future using this experimental data, and the following formula:

$\text{Experimental probability of event}=\frac{\text{Number of times event occurred in experiments}}{\text{Total number of experiments}}$

Here is the table again, with the experimental probability of each face listed as a percentage:

Result | Number of rolls | Experimental probability |
---|---|---|
$1$ | $11$ | $5.5\%$ |
$2$ | $19$ | $9.5\%$ |
$3$ | $18$ | $9\%$ |
$4$ | $18$ | $9\%$ |
$5$ | $20$ | $10\%$ |
$6$ | $114$ | $57\%$ |

A fair die has about a $17\%$ chance of rolling a $6$, but this die rolls a $6$ more than half the time!
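The percentages in the table can be reproduced with a short Python sketch. The names `roll_counts` and `experimental_probabilities` are illustrative choices rather than part of the lesson; the counts are copied from the table of $200$ rolls.

```python
from fractions import Fraction

# Roll counts copied from the lesson's table (face -> number of rolls).
roll_counts = {1: 11, 2: 19, 3: 18, 4: 18, 5: 20, 6: 114}


def experimental_probabilities(counts):
    """Return {outcome: Fraction} using occurrences / total experiments."""
    total = sum(counts.values())
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


probabilities = experimental_probabilities(roll_counts)

for face, p in probabilities.items():
    # Multiplying by 100 converts the fraction to a percentage for display.
    print(f"{face}: {float(p) * 100:.1f}%")
```

Using `Fraction` keeps the values exact, so the probabilities add up to exactly $1$, as they should when every roll is counted once.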

Sometimes our "experiments" involve looking at historical data instead. For example, we can't run hundreds of Eurovision Song Contests to test out who would win, so instead we look at past performance when trying to predict the future. The following table shows the winner of the Eurovision Song Contest from 1999 to 2018:

Year | Winning country | Year | Winning country |
---|---|---|---|
1999 | Sweden | 2009 | Norway |
2000 | Denmark | 2010 | Germany |
2001 | Estonia | 2011 | Azerbaijan |
2002 | Latvia | 2012 | Sweden |
2003 | Turkey | 2013 | Denmark |
2004 | Ukraine | 2014 | Austria |
2005 | Greece | 2015 | Sweden |
2006 | Finland | 2016 | Ukraine |
2007 | Serbia | 2017 | Portugal |
2008 | Russia | 2018 | Israel |

What is the experimental probability that Sweden will win the next Eurovision Song Contest?

We think of each contest as an "experiment", and there are $20$ in total. The winning country is the event, and we can tell that $3$ of the contests were won by Sweden. So using the same formula as above,

$\text{Experimental probability of event}=\frac{\text{Number of times event occurred in experiments}}{\text{Total number of experiments}}$

the experimental probability is $\frac{3}{20}$, which is $15\%$.
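The same count can be checked in Python by treating each contest as one experiment. This is only a sketch: the list `winners` is typed in from the table (1999 to 2018, in year order), and the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Winners from 1999 to 2018, copied from the table in year order.
winners = [
    "Sweden", "Denmark", "Estonia", "Latvia", "Turkey",
    "Ukraine", "Greece", "Finland", "Serbia", "Russia",
    "Norway", "Germany", "Azerbaijan", "Sweden", "Denmark",
    "Austria", "Sweden", "Ukraine", "Portugal", "Israel",
]

# Count how many times each country won.
win_counts = Counter(winners)

# Experimental probability = wins by Sweden / total number of contests.
sweden_probability = Fraction(win_counts["Sweden"], len(winners))

print(sweden_probability)         # 3/20
print(float(sweden_probability))  # 0.15
```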

How many of the next $50$ contests can Sweden expect to win?

Just like in the last chapter, we can calculate this by multiplying the experimental probability of an event by the number of trials. In this case Sweden can expect to win

$\frac{3}{20}\times 50=\frac{150}{20}=7.5$ contests.

This rounds to $8$ contests out of the next $50$.
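As a quick check of this arithmetic, here is a minimal Python sketch. The helpers `expected_count` and `round_half_up` are hypothetical names introduced for illustration; the second one rounds a half upwards, which is the rounding used above.

```python
import math
from fractions import Fraction


def expected_count(probability, trials):
    """Expected number of occurrences = probability * number of trials."""
    return probability * trials


def round_half_up(value):
    """Round to the nearest whole number, with halves rounding up."""
    return math.floor(value + Fraction(1, 2))


# Sweden's experimental probability (3 wins in 20 contests) over 50 contests.
expected_wins = expected_count(Fraction(3, 20), 50)

print(float(expected_wins))          # 7.5
print(round_half_up(expected_wins))  # 8
```

The expected value $7.5$ is a long-run average, so it does not have to be a whole number even though the actual number of wins must be.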

Summary

$\text{Experimental probability of event}=\frac{\text{Number of times event occurred in experiments}}{\text{Total number of experiments}}$

You may also see the term relative frequency, which means the same thing as experimental probability.
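To see how a relative frequency behaves as the number of experiments grows, the loaded die can be simulated. This is a sketch under a loud assumption: the die's true weights are not known, so the code simply reuses the counts from the $200$-roll table as weights. The function name and the seed are arbitrary illustrative choices.

```python
import random

faces = [1, 2, 3, 4, 5, 6]
# ASSUMPTION: the counts from the 200-roll experiment stand in for the die's
# true weights, which the lesson does not give.
weights = [11, 19, 18, 18, 20, 114]


def relative_frequency_of_six(num_rolls, rng):
    """Simulate num_rolls rolls and return the fraction that came up 6."""
    rolls = rng.choices(faces, weights=weights, k=num_rolls)
    return rolls.count(6) / num_rolls


rng = random.Random(0)  # fixed seed so repeated runs give the same rolls
for n in (20, 200, 2000, 20000):
    print(n, round(relative_frequency_of_six(n, rng), 3))
```

With only a few rolls the relative frequency can land some distance from $0.57$; with many rolls it tends to settle close to it, which is why larger experiments give more trustworthy estimates.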

A retail store served $773$ customers in October, and there were $44$ complaints during that month.

What is the experimental probability that a customer complains?

Give your answer as a percentage, rounded to the nearest whole percent.

An insurance company found that in the past year, of the $2558$ claims made, $1493$ were from drivers under the age of 25.

Give your answers to the following questions as percentages, rounded to the nearest whole percent.

What is the experimental probability that a claim is filed by someone under the age of 25?

What is the experimental probability that a claim is filed by someone 25 or older?

The experimental probability that a commuter uses public transport is $50\%$.

Out of $500$ commuters, how many would you expect to use public transport?

Approximate the probability of a chance event by collecting data on the chance process that produces it and observing its long-run relative frequency, and predict the approximate relative frequency given the probability.