Statistics 3N03/3J04 - Assignment #2

2003-10-21 (Question 2 revised 2003-10-23)

Due: 2003-10-29 18:00


The numbered problems and data sets are taken from Montgomery & Runger, Applied Statistics and Probability for Engineers, 3rd edition. You will find it easy to do your graphs and calculations in R but, where appropriate, try the calculations on your calculator to check your results.

Part A

Question 1

Graph the probability mass function of a binomial distribution with n = 50 and p = 1/6. Superimpose a graph of the approximating normal probability density function. Use vertical bars to show the binomial probabilities and use a smooth line in a different colour for the normal curve. Compute the exact binomial probability of getting 4 or fewer successes. Indicate this tail of the distribution by using a different colour for the vertical bars. Compare the exact calculation with the normal approximation, computed with and without the continuity correction.
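One possible way to set up the graph and the three calculations in R is sketched below. The colours, the object names (bar_col, approx_plain, approx_cc) and the choice of 4.5 as the continuity-corrected cut-off are illustrative assumptions, not a required layout.

```r
# Binomial(n = 50, p = 1/6) with its approximating normal curve.
n <- 50
p <- 1/6
x <- 0:n
mu <- n * p                      # binomial mean
sigma <- sqrt(n * p * (1 - p))   # binomial standard deviation

# Vertical bars for the binomial probabilities; the tail x <= 4 is drawn in red.
bar_col <- ifelse(x <= 4, "red", "grey40")
plot(x, dbinom(x, n, p), type = "h", col = bar_col, lwd = 2,
     xlab = "x", ylab = "Probability")

# Smooth normal density in a different colour.
curve(dnorm(x, mean = mu, sd = sigma), from = 0, to = n, add = TRUE, col = "blue")

# Exact tail probability, then the normal approximation without and with
# the continuity correction (cut-off moved from 4 to 4.5).
exact <- pbinom(4, n, p)
approx_plain <- pnorm(4, mean = mu, sd = sigma)
approx_cc <- pnorm(4.5, mean = mu, sd = sigma)
print(c(exact = exact, no_correction = approx_plain, with_correction = approx_cc))
```

Try the two normal approximations on your calculator as well, using the standard normal table.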

Question 2

Generate a sample of n = 10 pseudorandom observations from a normal distribution with mean 100 and standard deviation 10. Test for normality graphically by plotting a histogram with the true normal density superimposed and by plotting a probability plot with fitted line, using qqnorm() and qqline(). Repeat for n = 40, 100, 1000. How many observations do you need before you can say with any confidence that the data came from a normal distribution? [Revised 2003-10-23.]
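A minimal sketch of the loop over sample sizes follows. The call to set.seed() and the side-by-side layout from par(mfrow = ...) are assumptions made only so the sketch is reproducible and compact; your own plots will differ from run to run if you omit the seed.

```r
set.seed(1)                      # assumed seed, for reproducibility only
old_par <- par(mfrow = c(1, 2))  # histogram and probability plot side by side

for (n in c(10, 40, 100, 1000)) {
  y <- rnorm(n, mean = 100, sd = 10)

  # Histogram on the density scale with the true N(100, 10^2) density on top.
  hist(y, freq = FALSE, main = paste("Histogram, n =", n))
  curve(dnorm(x, mean = 100, sd = 10), add = TRUE, col = "red")

  # Normal probability plot with fitted line.
  qqnorm(y, main = paste("Normal Q-Q plot, n =", n))
  qqline(y)
}

par(old_par)                     # restore the previous plot layout
```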

Question 3

Derive and plot the exact distribution for the total score when 3 fair 6-sided dice are rolled independently. Find the exact mean, variance and standard deviation of this distribution.

Use R to simulate 1000 observations from this distribution; compare the sample mean, sample variance, sample standard deviation and histogram with the exact results obtained above. [Hint: Use sample(1:6, 3000, replace = TRUE) to simulate 3000 independent rolls of a fair 6-sided die. Put them in a 1000 x 3 matrix, and use apply() to get the row sums.]
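The hint can be sketched as follows, together with one way of checking a hand derivation by enumerating all 216 equally likely outcomes. The use of expand.grid() for the enumeration, the seed, and the object names are assumptions for illustration; the derivation itself should still be done by hand.

```r
# Exact distribution: enumerate all 6^3 = 216 equally likely outcomes.
outcomes <- expand.grid(d1 = 1:6, d2 = 1:6, d3 = 1:6)
totals <- rowSums(outcomes)
exact_pmf <- table(totals) / nrow(outcomes)
score <- as.numeric(names(exact_pmf))   # possible totals, 3 to 18
prob <- as.numeric(exact_pmf)           # their exact probabilities

exact_mean <- sum(score * prob)
exact_var <- sum((score - exact_mean)^2 * prob)
exact_sd <- sqrt(exact_var)

# Simulation as in the hint: 3000 rolls in a 1000 x 3 matrix, then row sums.
set.seed(1)                             # assumed seed, for reproducibility only
rolls <- matrix(sample(1:6, 3000, replace = TRUE), nrow = 1000, ncol = 3)
sim_totals <- apply(rolls, 1, sum)

# Histogram of the simulated totals with the exact probabilities overlaid.
hist(sim_totals, breaks = seq(2.5, 18.5, by = 1), freq = FALSE,
     main = "Total of 3 dice", xlab = "Total score")
points(score, prob, type = "h", col = "red", lwd = 2)

print(c(exact_mean = exact_mean, exact_var = exact_var, exact_sd = exact_sd))
print(c(sim_mean = mean(sim_totals), sim_var = var(sim_totals),
        sim_sd = sd(sim_totals)))
```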

Question 4

Plot the probability density function for a chi-square distribution on 3 degrees of freedom. Look up its mean and variance in the text or in the notes. Demonstrate the Central Limit Theorem by generating 1000 samples, each of size n = 100, from a chi-square distribution on 3 degrees of freedom. Compute the mean of each sample. Display the 1000 sample means on a histogram and on a normal probability plot. Find the mean and standard deviation of their distribution. Are your results consistent with the Central Limit Theorem? [Hint: Same idea as in Q3: use rchisq(100000, 3) to fill a 1000 x 100 matrix with independent chi-square data, then use apply() to find the mean of each row.]
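A sketch of the hint is given below. It assumes the standard result that a chi-square distribution on k degrees of freedom has mean k and variance 2k (check this against the text), and the seed, plotting range and object names are illustrative choices.

```r
# Density of the chi-square distribution on 3 degrees of freedom.
curve(dchisq(x, df = 3), from = 0, to = 15,
      xlab = "x", ylab = "Density", main = "Chi-square, 3 df")

# 1000 samples of size 100, one sample per row of a 1000 x 100 matrix.
set.seed(1)                       # assumed seed, for reproducibility only
x_mat <- matrix(rchisq(100000, df = 3), nrow = 1000, ncol = 100)
xbar <- apply(x_mat, 1, mean)     # the 1000 sample means

hist(xbar, freq = FALSE, main = "Means of 1000 samples, n = 100")
qqnorm(xbar)
qqline(xbar)

# Values the Central Limit Theorem leads us to expect, assuming
# mean = 3 and variance = 2 * 3 = 6 for a single observation.
clt_mean <- 3
clt_sd <- sqrt(6 / 100)
print(c(mean = mean(xbar), sd = sd(xbar), clt_mean = clt_mean, clt_sd = clt_sd))
```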

Question 5

A chemical plant produces a variety of products using four different processes; the available labour is sufficient only to run one process at a time. The plant manager knows that the discharge of dangerous pollution into the plant waste water system and thence into a nearby stream is dependent on which process equipment is in operation. The probability that a particular process will be producing dangerous pollution products is as shown below:

process A - 40%
process B - 5%
process C - 30%
process D - 10%

All other processes in the plant are considered harmless. In a typical month, the relative likelihoods of processes A, B, C and D operating through the month are 2:4:3:1 respectively.

a) What is the probability that there will be no dangerous pollution discharged in a given month?
b) If dangerous pollution is detected in the plant discharge, what is the probability that process A was operating?
c) The pollution products that are discharged by the various processes have different probabilities of producing a fish kill in the stream that the plant uses for disposal; the probabilities are as follows:

process A - 0.9
process B - 0.1
process C - 0.8
process D - 0.3

Based on these assumptions, what is the probability that fish will be killed by pollution in the stream in a given month? Of the four processes, which is the most fruitful one (in terms of minimizing the likelihood of fish kill) to select for clean up if only one can be improved?
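The structure of the calculation (total probability, then Bayes' theorem) can be checked in R along the following lines. This sketch assumes one reading of the problem: that the ratios 2:4:3:1 give the probabilities that each process is the one operating, and that a fish kill can occur only when dangerous pollution is discharged. The object names are illustrative. Work the problem by hand first.

```r
# Assumed reading: the ratios 2:4:3:1 are the probabilities of each process
# being the one in operation.
prior <- c(A = 2, B = 4, C = 3, D = 1) / 10

# From the question: P(dangerous pollution | process) and
# P(fish kill | dangerous pollution from that process).
p_poll <- c(A = 0.40, B = 0.05, C = 0.30, D = 0.10)
p_kill <- c(A = 0.9, B = 0.1, C = 0.8, D = 0.3)

# Total probability of dangerous pollution, and of none.
joint_poll <- prior * p_poll
p_pollution <- sum(joint_poll)
p_none <- 1 - p_pollution

# Bayes' theorem: P(process | dangerous pollution detected).
posterior <- joint_poll / p_pollution

# Contribution of each process to the overall fish-kill probability.
joint_kill <- joint_poll * p_kill
p_fish_kill <- sum(joint_kill)

print(p_none)
print(posterior)
print(joint_kill)
print(p_fish_kill)
```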

Question 6

Steel construction work on multi-storey buildings is a potentially hazardous occupation. A building contractor who is building a skyscraper at a steady pace finds that in spite of a strong emphasis on safety measures, he has been experiencing accidents among his large group of steel workers; on the average, about 1 accident occurs every 6 months.

a) Assuming that the occurrence of a specific accident is not influenced by any previous accidents, find the probability that there will be (exactly) 1 accident in the next 4 months.
b) What is the probability of at least 1 accident in the next 4 months?
c) What is the mean number of accidents that the contractor can expect in a year? What is the standard deviation of the number of accidents during a period of 1 year?
d) If the contractor can go through a year without an accident among his steel construction workers, he will qualify for a safety award. What is the probability of his receiving this award next year?
e) If the contractor’s work is to continue at the same pace over the next 5 years, what is the probability that he will win the safety award twice during this 5-year period?
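Hand calculations for this question can be checked in R with a sketch like the one below. It assumes a Poisson model with a rate of 1/6 accident per month, treats the 5 years in part e) as independent trials, and reads "twice" as exactly twice; the object names are illustrative.

```r
rate <- 1 / 6                 # assumed Poisson rate: accidents per month

# Parts a) and b): a 4-month period.
lambda4 <- rate * 4
p_exactly_one <- dpois(1, lambda4)
p_at_least_one <- 1 - dpois(0, lambda4)

# Part c): a 1-year period. For a Poisson count, variance equals the mean.
lambda12 <- rate * 12
mean_year <- lambda12
sd_year <- sqrt(lambda12)

# Part d): no accidents in a year.
p_award <- dpois(0, lambda12)

# Part e): exactly 2 award years out of 5, assuming independent years.
p_two_awards <- dbinom(2, size = 5, prob = p_award)

print(c(a = p_exactly_one, b = p_at_least_one, mean = mean_year,
        sd = sd_year, d = p_award, e = p_two_awards))
```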

Part B

3-67 (p 77)
4-47 (p 117)
4-70 (p 122)
4-152 (p 139)

