We begin by throwing some darts
at a unit circle dartboard.
We'll keep track of a random variable R, which measures the
distance the dart lands from the center.
The Key Question:
What do you think the average value of R will be?
For now let's return to the simple experiment.
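With the darts falling at random anywhere in the circle, it should seem reasonable that the probability that a dart lands within distance A of the center (for A between 0 and 1) is the ratio of the area of the circle of radius A to the area of the whole dartboard: `(pi A^2)/(pi * 1^2) = A^2`.
We generalize and define the probability distribution function F for the random variable R by letting `F(A)` be the probability that `R <= A`.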
In the case of the dart variable R,
`F(A) = 0` when `A <= 0`,
`F(A) = A^2` when `0 < A < 1`,
`F(A) = 1` when `A >= 1`.
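The distribution function can be compared with a simulated version of the experiment. The sketch below assumes that "falling at random anywhere in the circle" means uniformly with respect to area, and it uses rejection sampling from the enclosing square, which is one convenient choice rather than anything the text prescribes. The names `throw_dart` and `empirical_F`, the seed, and the number of darts are illustrative.

```python
import math
import random


def throw_dart(rng):
    """Return the distance R from the center for one dart.

    Assumption: the dart lands uniformly (by area) in the unit circle.
    Points are drawn in the square [-1, 1] x [-1, 1] and redrawn
    until one falls inside the circle.
    """
    while True:
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        r = math.hypot(x, y)
        if r <= 1.0:
            return r


def F(a):
    """Distribution function of R for the dart experiment."""
    if a <= 0:
        return 0.0
    if a >= 1:
        return 1.0
    return a * a


def empirical_F(samples, a):
    """Fraction of the simulated darts with R <= a."""
    return sum(1 for r in samples if r <= a) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [throw_dart(rng) for _ in range(100_000)]

for a in (0.25, 0.5, 0.75):
    print(a, empirical_F(samples, a), F(a))
```

With this many darts the simulated fractions should land close to `A^2`, though any single run will differ slightly from the theoretical values.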
The probability that a dart falls in any particular band (called an annulus) formed by concentric circles is also easy to calculate from the areas.
The probability that `A < R <= B` is just `F(B) - F(A)`.
With this analysis it should be clear that the probability that R = A is zero since the circle of radius A is a region in the plane with area zero.
This result can be interpreted as saying that the likelihood of the dart landing exactly on a circle of a given radius is zero. So in an experiment, every specified number A from 0 to 1 has the same probability, namely zero, of occurring as the value of R.
Yet the formula above also suggests that the probability that the value of R will lie between `1/8` and `1/4` is not as large as the probability that R will lie between `3/4` and `7/8`.
This leads to the concepts of average probability density and point probability density.
The average probability density for an interval [A,B] is the ratio of the probability that R will fall in [A,B] to the length of that interval, B-A. That is, the average density is `(F(B) - F(A))/(B - A)`.
Comparing the average densities for the intervals `[1/4,3/8]` and `[3/4,7/8]`, which have the same length, illustrates why larger values of R are more likely.
For the interval `[1/4,3/8]` the average density is `(F(3/8) - F(1/4))/(3/8 - 1/4) = (9/64 - 4/64)/(1/8) = 5/8`,
while for the interval `[3/4,7/8]` the average density is `(F(7/8) - F(3/4))/(7/8 - 3/4) = (49/64 - 36/64)/(1/8) = 13/8`.
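The band probabilities and average densities for the dart variable can be checked with exact rational arithmetic. This is a minimal sketch that uses the distribution function `F(A) = A^2` from the text; the helper names `band_probability` and `average_density` are illustrative.

```python
from fractions import Fraction


def F(a):
    """Distribution function of the dart variable R, in exact fractions."""
    a = Fraction(a)
    if a <= 0:
        return Fraction(0)
    if a >= 1:
        return Fraction(1)
    return a * a


def band_probability(a, b):
    """Probability that a < R <= b, computed as F(b) - F(a)."""
    return F(b) - F(a)


def average_density(a, b):
    """Band probability divided by the length of the interval [a, b]."""
    return band_probability(a, b) / (Fraction(b) - Fraction(a))


# Bands of equal width 1/8 near the center and near the edge.
print(band_probability(Fraction(1, 8), Fraction(1, 4)))  # 3/64
print(band_probability(Fraction(3, 4), Fraction(7, 8)))  # 13/64

# Average densities for the two intervals compared in the text.
print(average_density(Fraction(1, 4), Fraction(3, 8)))   # 5/8
print(average_density(Fraction(3, 4), Fraction(7, 8)))   # 13/8
```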
The point probability density of the random variable
R at the point A, dF(A), is the limit as `B rarr A` of the average probability
densities for intervals with endpoints A and B. So
`dF(A) = lim_{B rarr A} (F(B) - F(A))/(B - A) = F'(A)`.
This is the key relation between the distribution function
F and the probability density function of a random variable.
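For the dart variable the limit can be watched numerically. When A and B both lie between 0 and 1, the average density is `(B^2 - A^2)/(B - A) = A + B`, which approaches `2A` as `B rarr A`. The sketch below evaluates the average density over shrinking intervals that start at `A = 1/2`; the starting point and the interval widths are arbitrary illustrative choices.

```python
from fractions import Fraction


def F(a):
    """Distribution function of the dart variable R, in exact fractions."""
    a = Fraction(a)
    if a <= 0:
        return Fraction(0)
    if a >= 1:
        return Fraction(1)
    return a * a


def average_density(a, b):
    """Average probability density of R over the interval [a, b]."""
    return (F(b) - F(a)) / (Fraction(b) - Fraction(a))


A = Fraction(1, 2)
for n in range(1, 6):
    h = Fraction(1, 10**n)  # interval width shrinking toward 0
    print(h, average_density(A, A + h))
```

Each printed density exceeds `2A = 1` by exactly the interval width, so the values settle toward 1 as the width shrinks.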
REMARKS on the DENSITY FUNCTION.
So to find the probability
that a random variable
is between A and B we need only find the net change from A to B in any
function that has the density function as its derivative.
Let's return to the key question of finding the average or what is called the MEAN of the random variable:
First let's note that when
`B~~ A, F(B) - F(A)~~dF(A)*(B-A)`. This is just the "differential
estimate" applied to the distribution function F at A.
Let's cut the interval from 0 to 1 into N pieces of equal length. For example, if N = 5 we would have the intervals `[0,1/5]`, `[1/5,2/5]`, `[2/5,3/5]`, `[3/5,4/5]` and `[4/5,1]`, each with length 1/5.
Now when N is large the length of these intervals will get small and there won't be much variation of the value of R in that interval.
To estimate the average from the theory, it would seem
sensible
to choose one number to represent the numbers from each interval - call it
"`r_k` ". Now estimate the probability that a dart would fall in that interval,
call that "`p_k`". Then, multiply the representative number by that probability
and finally add those numbers up to find an estimate for the average, i.e., `sum_{k=1}^{k=N} r_k*p_k`.
If we choose the left-hand endpoint of each interval we would find an underestimate. Can you see why?
For example when N = 5 we would find
underest `= r_1*p_1 + r_2*p_2 + r_3*p_3 + r_4*p_4 + r_5*p_5 = 0*1/25 + 1/5*3/25 + 2/5*5/25 + 3/5*7/25 + 4/5*9/25 = 70/125 = 14/25`.
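The left-endpoint estimate can be computed for any N with exact fractions. This is a small sketch under the text's setup: the k-th interval runs from `(k-1)/N` to `k/N`, its probability is the difference of the squares of its endpoints, and the representative value is the left endpoint. The function name `left_endpoint_estimate` is illustrative.

```python
from fractions import Fraction


def left_endpoint_estimate(n):
    """Sum of r_k * p_k over n equal intervals, with r_k the left endpoint."""
    total = Fraction(0)
    for k in range(1, n + 1):
        left = Fraction(k - 1, n)
        right = Fraction(k, n)
        p_k = right**2 - left**2  # probability that R lands in the k-th interval
        total += left * p_k
    return total


for n in (5, 50, 500):
    estimate = left_endpoint_estimate(n)
    print(n, estimate, float(estimate))
```

For N = 5 this reproduces the hand computation, and the estimate increases as N grows, consistent with each one being an underestimate.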
THE MEAN MEETS THE Euler Sums and estimates for Net Change:
Now remember that `p_k = F(A_k) - F(A_{k-1}) ~~ dF(A_{k-1}) *1/N`, where `A_k = k/N` is the right-hand endpoint of the k-th interval.
So to estimate the average value of R theoretically we could use `r_k = A_{k-1}`:
`sum_{k=1}^{k=N} r_k*p_k ~~ sum_{k=1}^{k=N} A_{k-1} *dF(A_{k-1})*1/N`.
Aha! This last expression is precisely an Euler Sum that estimates the net change from 0 to 1 in a function S where
`S'(x) = x * dF(x) = x * 2x = 2 x^2`
Thus the MEAN of the random variable R must be `S(1) - S(0)` where
`S(x) = (2 x^3)/3`,
and so the MEAN of R is `2/3`!
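As a check on this result, the sketch below evaluates the Euler sum `sum A_{k-1} * dF(A_{k-1}) * 1/N` for several values of N, compares it with `S(1) - S(0)`, and also averages R over simulated darts. The simulation assumes darts land uniformly by area and uses rejection sampling from the enclosing square; that method, the seed, the number of trials, and the names `euler_sum` and `simulated_mean` are illustrative choices, not part of the original argument.

```python
import math
import random
from fractions import Fraction


def point_density(x):
    """dF(x) = 2x for the dart variable when 0 < x < 1."""
    return 2 * x


def euler_sum(n):
    """Left-endpoint Euler sum for S'(x) = x * dF(x) on [0, 1] with n steps."""
    total = Fraction(0)
    for k in range(1, n + 1):
        a_prev = Fraction(k - 1, n)
        total += a_prev * point_density(a_prev) * Fraction(1, n)
    return total


def S(x):
    """A function whose derivative is 2 x^2."""
    return Fraction(2, 3) * x**3


def simulated_mean(trials, seed=0):
    """Average of R over simulated darts (assumed uniform by area)."""
    rng = random.Random(seed)
    total = 0.0
    count = 0
    while count < trials:
        r = math.hypot(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        if r <= 1.0:  # keep only darts that hit the board
            total += r
            count += 1
    return total / trials


exact = S(Fraction(1)) - S(Fraction(0))
for n in (5, 50, 500):
    print(n, float(euler_sum(n)), float(exact))
print("simulated mean:", simulated_mean(100_000))
```

The Euler sums climb toward `2/3` as N grows, and the simulated average should fall near the same value, up to sampling noise.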