Frequently-Asked Questions - February


"Why do we need a Continuity Correction for the normal approximation to the binomial, but not for the washer-and-pin problem?"

1998-02-18

The binomial distribution is a discrete distribution. That is, a binomial random variable takes integer values. All the probability in the binomial distribution sits in discrete lumps at the integers 0, 1, ..., n.

Use the spreadsheet you wrote for Assignment 1 and look at the Bin(10, 0.5) distribution. The mean is 5 and the variance is 2.5. How well does an N(5, 2.5) distribution (mean 5, variance 2.5) approximate the Bin(10, 0.5)?

The normal is a continuous distribution so we have to approximate the lumps of probability at the integers by areas under the normal curve. Take P(X = 3), for example. The exact binomial probability is 0.1172.
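For readers without the Assignment 1 spreadsheet to hand, the Bin(10, 0.5) table can be reproduced with a short script. This is only a sketch in Python (the course itself uses a spreadsheet), and the variable names are illustrative.

```python
from math import comb

n, p = 10, 0.5  # parameters of the Bin(10, 0.5) example

# Probability of each lump: P(X = k) for k = 0, 1, ..., n
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and variance computed directly from the table
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

for k, pk in enumerate(pmf):
    print(f"P(X = {k:2d}) = {pk:.4f}")
print(f"mean = {mean}, variance = {variance}")
```

The row for k = 3 should agree with the 0.1172 quoted above, and the mean and variance should come out as 5 and 2.5.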

If we integrate the N(5, 2.5) density from 2 to 3, the density is too low over the whole interval, so the result is too small. (Here F denotes the standard normal cumulative distribution function.)

F((3-5)/sqrt(2.5)) - F((2-5)/sqrt(2.5)) = F(-1.265) - F(-1.897) = 0.1030 - 0.0289 = 0.0741

If we integrate the N(5, 2.5) density from 3 to 4, the density is too high over nearly the whole interval, so the result is too large.

F((4-5)/sqrt(2.5)) - F((3-5)/sqrt(2.5)) = F(-0.6325) - F(-1.265) = 0.2635 - 0.1030 = 0.1605

But if we integrate from 2.5 to 3.5, centring the interval on 3, the result is a much closer approximation.

F((3.5-5)/sqrt(2.5)) - F((2.5-5)/sqrt(2.5)) = F(-0.9487) - F(-1.581) = 0.1714 - 0.0569 = 0.1145
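The three integrals above can be checked numerically. The following is a minimal sketch, assuming Python's standard math module in place of normal tables; the helper names normal_cdf and normal_area are made up for this illustration.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mean, variance):
    """Area under the N(mean, variance) density to the left of x."""
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0 * variance)))

def normal_area(lo, hi, mean=5.0, variance=2.5):
    """Area under the N(mean, variance) density between lo and hi."""
    return normal_cdf(hi, mean, variance) - normal_cdf(lo, mean, variance)

exact = comb(10, 3) * 0.5**10       # binomial P(X = 3)
too_small = normal_area(2.0, 3.0)   # integrate from 2 to 3
too_large = normal_area(3.0, 4.0)   # integrate from 3 to 4
corrected = normal_area(2.5, 3.5)   # integrate from 2.5 to 3.5

print(f"exact binomial  : {exact:.4f}")
print(f"normal, 2 to 3  : {too_small:.4f}")
print(f"normal, 3 to 4  : {too_large:.4f}")
print(f"normal, 2.5-3.5 : {corrected:.4f}")
```

The printed values can differ from the hand calculation in the last digit, because the table values above were rounded before subtracting.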

More generally, if we want to approximate the binomial probability P(X <= a), we integrate under the normal density from minus infinity to a+0.5.

If we want to approximate the binomial probability P(X >= a), we integrate the normal density from a-0.5 to infinity.
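The two general rules can be written as small functions and compared with the exact binomial sums. This is a sketch only; the function names approx_at_most and approx_at_least are hypothetical, and the Bin(10, 0.5) case is reused purely as an example.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mean, variance):
    """Area under the N(mean, variance) density to the left of x."""
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0 * variance)))

def binomial_cdf(a, n, p):
    """Exact P(X <= a) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(a + 1))

def approx_at_most(a, n, p):
    """Normal approximation to P(X <= a): integrate up to a + 0.5."""
    return normal_cdf(a + 0.5, n * p, n * p * (1 - p))

def approx_at_least(a, n, p):
    """Normal approximation to P(X >= a): integrate from a - 0.5 upward."""
    return 1.0 - normal_cdf(a - 0.5, n * p, n * p * (1 - p))

n, p = 10, 0.5
exact_le_3 = binomial_cdf(3, n, p)
exact_ge_7 = 1.0 - binomial_cdf(6, n, p)  # P(X >= 7) = 1 - P(X <= 6)

print(f"P(X <= 3): exact {exact_le_3:.4f}, approx {approx_at_most(3, n, p):.4f}")
print(f"P(X >= 7): exact {exact_ge_7:.4f}, approx {approx_at_least(7, n, p):.4f}")
```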

In the "washer and pin" problem from the February 13 lecture, the diameter of the pin shaft is assumed to follow a normal distribution, as is the diameter of the washer hole. Hence the difference between the two diameters, L, also follows a normal distribution, N(1, 2) to be exact. There is nothing discrete or integer-valued about this problem. L can take any value over a continuous range. Hence if we want P(L > 0) we integrate the N(1, 2) density from 0 to infinity. The continuity correction is not relevant.
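For comparison, the washer-and-pin probability needs no half-unit adjustment. The sketch below assumes that N(1, 2) means mean 1 and variance 2, in line with the N(5, 2.5) notation used earlier; normal_cdf is an illustrative helper, not part of the course material.

```python
from math import erf, sqrt

def normal_cdf(x, mean, variance):
    """Area under the N(mean, variance) density to the left of x."""
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0 * variance)))

# L ~ N(1, 2), read here as mean 1 and variance 2 (an assumption about notation).
# Integrate the density from exactly 0 to infinity: no continuity correction.
p_positive = 1.0 - normal_cdf(0.0, 1.0, 2.0)
print(f"P(L > 0) = {p_positive:.4f}")
```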


 "How many significant digits should I give in my answer?"

1998-02-02

Because statistical calculations often involve subtracting one large number from another that is nearly equal to it, the leading digits cancel and only the trailing digits survive. It is therefore a good idea to retain as many digits as possible throughout the calculation. Learn how to keep all the numbers in your calculator and avoid having to re-enter intermediate results.
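A small made-up example shows how much can be lost. The sketch below uses the shortcut formula for a (population) variance, mean of squares minus square of the mean, once with full precision and once with intermediate results rounded to 5 significant digits. The data values and the round_sig helper are invented for illustration.

```python
from math import floor, log10

def round_sig(x, digits):
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    return round(x, digits - 1 - floor(log10(abs(x))))

data = [100.1, 100.2, 100.3]  # made-up measurements, for illustration only
n = len(data)

mean = sum(data) / n
mean_of_squares = sum(x * x for x in data) / n

# Shortcut formula: variance = mean of squares - square of the mean
full = mean_of_squares - mean**2
rounded = round_sig(mean_of_squares, 5) - round_sig(mean**2, 5)

print(f"keeping all digits     : {full:.6f}")
print(f"rounding intermediates : {rounded:.6f}")
```

Both intermediate numbers are close to 10040, so rounding them to 5 significant digits wipes out the small difference between them.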

Give probabilities to 3 to 5 decimal places or 3 to 5 significant digits, depending on the context.

Give coefficients (such as the correlation coefficient) to 3 to 5 significant digits.

Give means to 2 or 3 more significant digits than the original data and give standard deviations to as many decimal places as the corresponding means.

Make any other calculations consistent with the above guidelines.


"What material should I study for Test #1?"

1998-02-02

Descriptive Statistics
    All of Chapter 2
Probability
    All of Chapter 3
Discrete Probability Distributions
    All of Chapter 4
Continuous Probability Distributions: Normal Distribution
    Chapter 5, Sections 5.1-5.5 only
Study Design
    Chapter 10, Section 10.3 only
Correlation Coefficient (calculation, interpretation) and Scatter Plots
    Chapter 11, Section 11.10 only
