Darts: An Introduction to Probability with Calculus

Darts and Averages: An Introduction to the Fundamental Theorem of Calculus through Probability

We begin by throwing some darts at a unit circle dartboard.

We'll keep track of a random variable R, which measures the distance from the center to the point where the dart lands.

The Key Question: What do you think the average value of R will be?

For now let's return to the simple experiment.

With the darts falling at random anywhere in the circle it should seem reasonable that:

The probability that the dart falls into any particular region inside the circle is the ratio of the area A of that region to the area of the unit circle, i.e., `A/pi`. For example, for a concentric circle of radius 1/2, the ratio of its area to the area of the unit circle is `{pi/4} / pi = 1/4`.
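
The area argument can be checked by simulation. The following is a minimal sketch, not part of the original experiment: the names `throw_dart` and `fraction_within`, the seed, and the number of throws are all assumptions chosen for illustration. Darts are drawn uniformly from the unit circle by rejection sampling, and the fraction landing within radius 1/2 should come out near 1/4.

```python
import random

def throw_dart(rng):
    """Return a point chosen uniformly at random in the unit circle.

    Rejection sampling: pick points in the enclosing square and keep
    the first one that lies inside the circle.
    """
    while True:
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            return x, y

def fraction_within(radius, throws=100_000, seed=0):
    """Estimate the probability that a dart lands within `radius` of the center."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    hits = 0
    for _ in range(throws):
        x, y = throw_dart(rng)
        if x * x + y * y <= radius * radius:
            hits += 1
    return hits / throws

# The area ratio predicts a value close to 1/4.
print(fraction_within(0.5))
```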

We generalize and define the probability distribution function F for the random variable R by

F(A) = probability that R `<=` A

In the case of the dart variable R,

`F(A) = {(0, "when " A <= 0), (A^2, "when " 0 < A < 1), (1, "when " A >= 1):}`

The probability that a dart falls in any particular band (called an annulus) formed by concentric circles is also easy to calculate from the areas.

The probability that `A < R <= B` is just `F(B) - F(A)`.

With this analysis it should be clear that the probability that R = A is zero since the circle of radius A is a region in the plane with area zero.

This result says that the probability of the dart landing exactly on a circle of a given radius is zero. In that sense, every specified number A from 0 to 1 is equally likely (with probability zero) to occur as the exact value of R in an experiment.

Yet the formula above also suggests that the probability that the value of R will lie between `1/8` and `1/4` is not as large as the probability that R will lie between `3/4` and `7/8`.
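
The two band probabilities can be computed exactly from F. The following is a small sketch using exact fractions; the function names `F` and `prob_between` are chosen here for illustration.

```python
from fractions import Fraction

def F(a):
    """Distribution function of the dart distance R: probability that R <= a."""
    if a <= 0:
        return Fraction(0)
    if a >= 1:
        return Fraction(1)
    return Fraction(a) ** 2

def prob_between(a, b):
    """Probability that a < R <= b, computed as F(b) - F(a)."""
    return F(b) - F(a)

inner = prob_between(Fraction(1, 8), Fraction(1, 4))
outer = prob_between(Fraction(3, 4), Fraction(7, 8))
print(inner, outer)  # 3/64 13/64
```

Both bands have width 1/8, yet the outer band is more than four times as likely as the inner one.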

This leads to the concepts of average probability density and point probability density.

The average probability density for an interval [A,B] is the ratio of the probability that R will fall in a certain interval [A,B] to the length of that interval, B-A. That is,

`barF(A,B) = {F(B)-F(A)}/ {B-A}`.

Comparing the average densities of two intervals of equal length, `[1/4,3/8]` and `[3/4,7/8]`, illustrates why larger values of R are more likely.

For the interval `[1/4,3/8]` the average density is

`barF(1/4,3/8) = {9/64 - 4/64}/{3/8 - 1/4} = 5/8`;

while for the interval `[3/4,7/8]` the average density is

`barF(3/4,7/8) = {49/64 - 36/64}/{7/8 - 3/4} = 13/8`.
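
These two values can be reproduced with exact arithmetic. This is a sketch only; `average_density` is a name chosen here for the quantity written `barF(A,B)` above.

```python
from fractions import Fraction

def F(a):
    """Distribution function of the dart distance R."""
    if a <= 0:
        return Fraction(0)
    if a >= 1:
        return Fraction(1)
    return Fraction(a) ** 2

def average_density(a, b):
    """Average probability density on [a, b]: (F(b) - F(a)) / (b - a)."""
    return (F(b) - F(a)) / (b - a)

print(average_density(Fraction(1, 4), Fraction(3, 8)))  # 5/8
print(average_density(Fraction(3, 4), Fraction(7, 8)))  # 13/8
```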

The point probability density of the random variable R at the point A, dF(A), is the limit as `B rarr A` of the average probability densities for intervals with endpoints A and B. So

`dF(A) = lim_{B rarr A} {F(B) - F(A)}/{B - A} = F'(A)`, the derivative of the function F at A.

Thus in the case of the darts,

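dF(A) = 2A for `0 <= A < 1`,
and dF(A) = 0 for `A < 0` and for `A > 1`. At A = 1 the average densities approach 2 from the left and 0 from the right, so dF(1) does not exist.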
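
The limit can be watched numerically. The following sketch, with the point A = 0.5 and the step sizes chosen arbitrarily for illustration, shrinks the interval [A, A + h] and prints the average density, which should approach 2A = 1.

```python
def F(a):
    """Distribution function of the dart distance R (floating point)."""
    if a <= 0:
        return 0.0
    if a >= 1:
        return 1.0
    return a * a

def average_density(a, b):
    """Average probability density on the interval with endpoints a and b."""
    return (F(b) - F(a)) / (b - a)

A = 0.5  # arbitrary sample point inside (0, 1)
for h in (0.1, 0.01, 0.001, 0.0001):
    # As h shrinks, the average density approaches the point density 2 * A.
    print(h, average_density(A, A + h))
```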

This is the key relation between the distribution function F and the probability density function of a random variable.

REMARKS on the DENSITY FUNCTION.

`dF(A) >= 0` for every A at which it exists, since F never decreases.

DENSITY AND NET CHANGE:

If G is any function with `G'(A) = dF(A)` for all A,
then the probability that `A < R <= B` is just `G(B) - G(A) = F(B) - F(A)`.

So to find the probability that a random variable is between A and B we need only find the net change from A to B in any function that has the density function as its derivative.
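
As a quick illustration, take a hypothetical function `G(x) = x^2 + 7`, which has the dart density 2x as its derivative on the interval from 0 to 1 but is not the distribution function itself. The sketch below checks that its net change still gives the same probability as F; the constant 7 is an arbitrary choice.

```python
from fractions import Fraction

def F(x):
    """Dart distribution function, valid here for 0 <= x <= 1."""
    return x * x

def G(x):
    """A hypothetical function with G'(x) = 2x; the constant 7 is arbitrary."""
    return x * x + 7

a, b = Fraction(1, 4), Fraction(3, 4)
# The added constant cancels in the net change, so both differences agree.
print(G(b) - G(a), F(b) - F(a))  # 1/2 1/2
```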

Let's return to the key question of finding the average or what is called the MEAN of the random variable:

First let's note that when `B~~ A, F(B) - F(A)~~dF(A)*(B-A)`. This is just the "differential estimate" applied to the distribution function F at A.

Let's cut the interval from 0 to 1 into N pieces of equal length. For example, if N = 5 we would have five intervals, each of length 1/5.

Now when N is large the length of these intervals will get small and there won't be much variation of the value of R in that interval.

To estimate the average from the theory, it would seem sensible to choose one number to represent the numbers from each interval; call it "`r_k`". Now estimate the probability that a dart would fall in that interval; call that "`p_k`". Then multiply each representative number by its probability and add the products to find an estimate for the average, i.e.,

Average value of R ` ~~ sum_{k=1}^{k=N} r_k*p_k`.

If we choose the left-hand endpoint of each interval we would find an underestimate. Can you see why?

For example when N = 5 we would find

underest `= r_1*p_1 + r_2*p_2 + r_3*p_3 + r_4*p_4 + r_5*p_5`

`= 0*1/25 + 1/5*3/25 + 2/5*5/25 + 3/5*7/25 + 4/5*9/25`
`= {3 + 10 + 21 + 36}/125 = 70/125 = 14/25 = 0.56`
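
The same left-endpoint estimate can be computed for any N. This is a minimal sketch with exact fractions; the function name `left_endpoint_estimate` is chosen here for illustration.

```python
from fractions import Fraction

def F(a):
    """Distribution function of the dart distance R."""
    if a <= 0:
        return Fraction(0)
    if a >= 1:
        return Fraction(1)
    return Fraction(a) ** 2

def left_endpoint_estimate(n):
    """Sum of r_k * p_k over n equal intervals, with r_k the left endpoint."""
    total = Fraction(0)
    for k in range(1, n + 1):
        left = Fraction(k - 1, n)   # representative value r_k
        right = Fraction(k, n)
        p_k = F(right) - F(left)    # probability that R lands in the k-th interval
        total += left * p_k
    return total

print(left_endpoint_estimate(5))  # 14/25
for n in (10, 100, 1000):
    # The underestimate grows as the intervals get shorter.
    print(n, float(left_endpoint_estimate(n)))
```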

THE MEAN MEETS EULER SUMS AND ESTIMATES FOR NET CHANGE:

Now remember that `p_k = F(A_k) - F(A_{k-1}) ~~ dF(A_{k-1}) *1/N`, where `A_k = k/N` is the right-hand endpoint of the k-th interval.
So to estimate the average value of R theoretically we could use `r_k = A_{k-1}`:

`sum_{k=1}^{k=N} r_k*p_k ~~ sum_{k=1}^{k=N} A_{k-1} *dF(A_{k-1})*1/N`.

Aha! This last expression is precisely an Euler Sum that estimates the net change from 0 to 1 in a function S where

`S'(x) = x * dF(x) = x * 2x = 2 x^2`

Thus the MEAN of the random variable R must be `S(1) - S(0)` where

`S(x) = 2{x^3}/3`,
and so the MEAN of R is `2/3`!
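
Both routes to the mean can be checked numerically. The sketch below is an illustration under stated assumptions: `euler_sum` adds up `2 x^2 * 1/N` at the left endpoints, and `simulated_mean` averages the distances of simulated darts, drawn by rejection sampling with an arbitrary fixed seed. Both should land near 2/3.

```python
import math
import random

def euler_sum(n):
    """Left-endpoint Euler Sum for S'(x) = 2 x^2 on [0, 1] with n steps."""
    step = 1.0 / n
    total = 0.0
    for k in range(1, n + 1):
        x = (k - 1) * step          # left endpoint A_{k-1}
        total += 2.0 * x * x * step
    return total

def simulated_mean(throws=100_000, seed=0):
    """Average distance from the center over simulated darts (seed is arbitrary)."""
    rng = random.Random(seed)
    total = 0.0
    count = 0
    while count < throws:
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:    # keep only points inside the unit circle
            total += math.hypot(x, y)
            count += 1
    return total / throws

print(euler_sum(10_000))   # close to 2/3
print(simulated_mean())    # close to 2/3
```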

Final Comments: For any random variable, X, where X has values between A and B, F is used to denote the distribution function and f is used to denote the density function.

When F and f are continuous functions, the mean of the random variable is
`S(B)-S(A)` where S is any function that has `S'(x) = x f(x) `.
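
The general statement can be sketched the same way for any density. In the code below, `mean_from_density` is a name chosen for illustration; it estimates `S(B)-S(A)` with a left-endpoint Euler Sum of `x f(x)`. The dart density is the example from the text, while the constant density on the interval from 2 to 5 is a hypothetical second example that is not in the original.

```python
def mean_from_density(f, a, b, n=10_000):
    """Left-endpoint Euler Sum estimating the net change of S, where S'(x) = x * f(x)."""
    step = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + k * step
        total += x * f(x) * step
    return total

def dart_density(x):
    """Density of the dart distance R: 2x between 0 and 1, 0 elsewhere."""
    return 2.0 * x if 0.0 <= x < 1.0 else 0.0

def constant_density(x):
    """Hypothetical density: constant 1/3 between 2 and 5, 0 elsewhere."""
    return 1.0 / 3.0 if 2.0 <= x <= 5.0 else 0.0

print(mean_from_density(dart_density, 0.0, 1.0))      # close to 2/3
print(mean_from_density(constant_density, 2.0, 5.0))  # close to 3.5
```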

The connection between (Euler) sums and differential equations will be discussed further in the calculus course and is sometimes described as the Fundamental Theorem of Calculus.