S3N03/3J04 Assignment #5

Statistics 3N03/3J04 - Assignment #5

2006-11-24 - Updated 2006-12-12

Use R to do the graphics on this assignment. Do the ANOVA calculations in R and with your calculator, and make sure that both methods agree.

Question 1 [from 2003, 2004 & 2005 Exams Q1]

(a) If you are estimating a variance, how many degrees of freedom do you need for the upper limit of a 99% confidence interval to be less than 4 times the lower limit?
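One way to approach part (a) is to note that the usual chi-square interval for a variance has limits ν s²/χ²(upper) and ν s²/χ²(lower), so the ratio of the upper to the lower limit depends only on the degrees of freedom ν. The Python sketch below searches for the smallest ν whose ratio falls below 4. It is only a cross-check under that assumption (normal data, equal-tailed interval); the helper names `chi2_cdf`, `chi2_quantile` and `limit_ratio` are made up for this illustration, and in R the quantiles would come directly from `qchisq`.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF from the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of sum z^n / (a (a+1) ... (a+n))
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_quantile(p, df):
    """Quantile by bisection; the bracket is wide enough for the tail areas used here."""
    lo, hi = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def limit_ratio(df, conf=0.99):
    """Upper CI limit divided by lower CI limit for a variance with df degrees of freedom."""
    alpha = 1.0 - conf
    return chi2_quantile(1 - alpha / 2, df) / chi2_quantile(alpha / 2, df)

# Smallest number of degrees of freedom with ratio below 4
df = 1
while limit_ratio(df) >= 4.0:
    df += 1
print(df, round(limit_ratio(df), 3))
```

The search is brute force on purpose: the ratio decreases steadily as ν grows, so the first ν that passes is the answer under the stated assumptions.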

(b) The mean power output of a diesel engine you manufacture is supposed to be 40 kW. From past experience, you know that the output varies between engines, with a standard deviation of 1.3 kW. If you want to test the hypothesis that the mean power output is 40 kW at the 1% level of significance and be 90% certain to detect when the mean is above 40.5 kW or below 39.5 kW, how many engines would you have to test? If you could only afford to test 10 engines, what would be your probability of Type II error? Would the test still be worth doing?

(c) The mean drying time of a certain paint is known to be 12 min. From past experience, you know that the time varies with a standard deviation of 2 min. An additive is supposed to reduce the mean drying time to 11.5 min. If you want the Type I and Type II error rates both to be 5%, how many paint samples would you have to try to test this claim? If you could only afford to test 50 paint samples, what would be your probability of Type II error? Would the test still be worth doing?
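Parts (b) and (c) both rest on the standard sample-size formula for a z-test with known standard deviation, n = ((z_α + z_β) σ / δ)², with z_α replaced by z_(α/2) for a two-sided test. The following Python sketch applies it to both parts. It assumes normally distributed measurements, treats (b) as two-sided and (c) as one-sided, and ignores the negligible far tail when sizing the two-sided test; the function names are illustrative only.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def sample_size(sigma, delta, alpha, beta, two_sided):
    """Smallest n for a z-test on a mean with known sigma to detect a shift delta."""
    z_alpha = Z.inv_cdf(1 - alpha / 2) if two_sided else Z.inv_cdf(1 - alpha)
    z_beta = Z.inv_cdf(1 - beta)
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

def type2_error(n, sigma, delta, alpha, two_sided):
    """Probability of failing to reject H0 when the true mean is shifted by delta."""
    shift = delta * math.sqrt(n) / sigma
    if two_sided:
        z = Z.inv_cdf(1 - alpha / 2)
        return Z.cdf(z - shift) - Z.cdf(-z - shift)
    return Z.cdf(Z.inv_cdf(1 - alpha) - shift)

# Part (b): two-sided, sigma = 1.3, shift = 0.5, alpha = 0.01, power = 0.90
n_b = sample_size(1.3, 0.5, 0.01, 0.10, two_sided=True)
beta_b = type2_error(10, 1.3, 0.5, 0.01, two_sided=True)

# Part (c): one-sided, sigma = 2, shift = 0.5, alpha = beta = 0.05
n_c = sample_size(2.0, 0.5, 0.05, 0.05, two_sided=False)
beta_c = type2_error(50, 2.0, 0.5, 0.05, two_sided=False)

print(n_b, round(beta_b, 3))
print(n_c, round(beta_c, 3))
```

Whether a test with the smaller sample is still worth doing is a judgment about the resulting Type II error rate, which the code reports but does not decide.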

(d) A company produces 20% of the windshields for a certain model of car at Plant A and 80% at Plant B. The mean number of flaws (small bubbles) per windshield is 2.1 at Plant A and 4.3 at Plant B. If a given windshield has 3 flaws, what is the probability that it was produced at Plant A? State any assumptions you make.
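Part (d) is a Bayes' theorem calculation once a distribution for the flaw count is chosen. The Python sketch below assumes, as one plausible modelling choice that the question does not state, that the number of flaws per windshield is Poisson with the plant's mean, and that the windshield is drawn at random from total production. The function `poisson_pmf` is a helper written for this example.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson count with the given mean (assumed model)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

prior = {"A": 0.20, "B": 0.80}        # share of production at each plant
mean_flaws = {"A": 2.1, "B": 4.3}     # mean flaws per windshield
flaws = 3

# Joint probability of (plant, 3 flaws), then normalize to get the posterior
joint = {plant: prior[plant] * poisson_pmf(flaws, mean_flaws[plant]) for plant in prior}
posterior_a = joint["A"] / sum(joint.values())
print(round(posterior_a, 3))
```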

Question 2 [2005 Exam Q2]

Carry out appropriate analyses for the following two data sets. Give graphs. State any assumptions you make. As far as possible, test each assumption. State your conclusions.

(a) Two microprocessors were compared on a sample of six benchmark codes to determine if there is a difference in speed. Here are the times (in sec) used by each processor on each code.

Code: 1 2 3 4 5 6

Processor A: 27.2 18.1 27.2 19.7 24.5 22.1

Processor B: 24.1 19.3 26.8 20.1 27.6 29.8

(b) The data below give the resilient modulus at 40 °C (in 10⁶ kPa) in 7 sections of rutted pavement and 12 sections of non-rutted pavement.

Rutted: 1.48 1.90 1.88 1.29 3.53 2.43 1.00

Nonrutted: 3.06 2.58 1.70 5.76 2.44 2.03 1.76 4.63 2.86 2.82 1.04 5.92

Question 3 [2005 Exam Q3b]

Carry out appropriate analyses for the following data. State any assumptions you make. State your conclusions. Include an appropriate graph and give a 95% confidence interval for the residual variance.

An experiment was conducted to determine the effect of sintering time on the compressive strength of two metals. The results are shown in the following table.

Metal    Sintering time 100 min    Sintering time 200 min
1        17.1, 16.5, 14.9          19.4, 18.9, 20.1
2        12.3, 13.8, 10.8          15.6, 17.2, 16.7

Question 4 [2005 Exam Q4]

The following data give the load (in lb/ft) at which the first crack in a concrete pipe specimen was observed, and the age (in days) of the specimen. Does load at first crack decrease linearly with age? Present your analysis in an ANOVA table with F-tests for non-linearity and for the slope of the regression line. Graph the data and the fitted line. Give a 99% confidence interval for the residual variance. State your assumptions and your conclusions.

Age: 20 20 20 25 25 25 31 31 31

Load: 11450 10420 11142 10840 11170 10540 9470 9190 9540
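Because each age has three replicate loads, the residual sum of squares from the straight-line fit splits into a lack-of-fit part and a pure-error part. The Python sketch below computes that decomposition for a hand check against R. It assumes independent normal errors with constant variance, and it uses the pure-error mean square as the denominator of both F ratios, which is one common choice; pooling lack of fit into the error when it is not significant is another. The variable names are illustrative.

```python
from collections import defaultdict
from statistics import mean

age = [20, 20, 20, 25, 25, 25, 31, 31, 31]
load = [11450, 10420, 11142, 10840, 11170, 10540, 9470, 9190, 9540]
n = len(age)

# Least-squares line
xbar, ybar = mean(age), mean(load)
sxx = sum((x - xbar) ** 2 for x in age)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(age, load))
slope = sxy / sxx
intercept = ybar - slope * xbar

# Sums of squares: total and regression
ss_total = sum((y - ybar) ** 2 for y in load)
ss_regression = slope * sxy

# Group the replicate loads by age
groups = defaultdict(list)
for x, y in zip(age, load):
    groups[x].append(y)

# Pure error: variation of replicates about their own group mean
ss_pure_error = sum(sum((y - mean(ys)) ** 2 for y in ys) for ys in groups.values())
# Lack of fit: distance of group means from the fitted line
ss_lack_of_fit = sum(
    len(ys) * (mean(ys) - (intercept + slope * x)) ** 2 for x, ys in groups.items()
)

df_lack_of_fit = len(groups) - 2
df_pure_error = n - len(groups)
ms_pure_error = ss_pure_error / df_pure_error
f_lack_of_fit = (ss_lack_of_fit / df_lack_of_fit) / ms_pure_error
f_slope = ss_regression / ms_pure_error

print(slope, intercept)
print(ss_regression, ss_lack_of_fit, ss_pure_error, ss_total)
print(f_lack_of_fit, f_slope)
```

The three component sums of squares should add up to the total, which is a quick check on calculator work.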

Question 5

14-8 (4th edition p. 555, 3rd edition p. 520). [Recall that you did plots for these data in Assignment #2.]

Question 6

In Q2, you are given two data sets; one is to be analyzed with a paired t-test, the other with a two-sample t-test.

(a) Analyze the two-sample case as a one-factor ANOVA. Show that the F statistic from the one-factor ANOVA is the square of the t-statistic, hence the ANOVA F-test will give exactly the same P-value as a two-sided t-test.
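The identity in part (a) can be checked numerically before proving it algebraically. The Python sketch below computes the pooled-variance t statistic and the one-factor ANOVA F statistic for the pavement data from Q2(b). It assumes the equal-variance (pooled) form of the two-sample t-test, since the identity does not hold for the unequal-variance version; the helper `within_ss` is written for this example.

```python
import math
from statistics import mean

rutted = [1.48, 1.90, 1.88, 1.29, 3.53, 2.43, 1.00]
nonrutted = [3.06, 2.58, 1.70, 5.76, 2.44, 2.03, 1.76, 4.63, 2.86, 2.82, 1.04, 5.92]

def within_ss(xs):
    """Sum of squared deviations of a sample about its own mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(rutted), len(nonrutted)
m1, m2 = mean(rutted), mean(nonrutted)

# Pooled-variance two-sample t statistic
ss_within = within_ss(rutted) + within_ss(nonrutted)
pooled_var = ss_within / (n1 + n2 - 2)
t_stat = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-factor ANOVA with two groups: 1 df between, n1 + n2 - 2 df within
grand = mean(rutted + nonrutted)
ss_between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
f_stat = (ss_between / 1) / (ss_within / (n1 + n2 - 2))

print(t_stat ** 2, f_stat)  # the two values agree up to rounding error
```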

(b) Analyze the paired-data case as a two-factor ANOVA without replication. Note that because there is no replication (n = 1), there are no degrees of freedom for the error mean square. The workaround is to assume that there is no interaction and use the interaction mean square in the denominator of the F-test. Show that the F statistic from the two-factor ANOVA is the square of the paired t-statistic, so that the ANOVA F-test gives exactly the same P-value as a two-sided t-test. If you do this in an application where there really is an interaction, will using the interaction mean square as the error mean square make the test more powerful or more conservative?
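A numerical check of part (b) on the processor data from Q2(a) might look like the Python sketch below. It treats benchmark code as the row factor and processor as the column factor, and uses the interaction mean square as the error term, as the question describes. The variable names are illustrative, and the sketch is a check on the arithmetic, not a substitute for the algebraic argument.

```python
import math
from statistics import mean, stdev

proc_a = [27.2, 18.1, 27.2, 19.7, 24.5, 22.1]
proc_b = [24.1, 19.3, 26.8, 20.1, 27.6, 29.8]
n = len(proc_a)  # number of benchmark codes (rows)

# Paired t statistic on the within-code differences
diffs = [a - b for a, b in zip(proc_a, proc_b)]
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Two-factor ANOVA without replication
grand = mean(proc_a + proc_b)
col_means = [mean(proc_a), mean(proc_b)]                    # processor means
row_means = [(a + b) / 2 for a, b in zip(proc_a, proc_b)]   # code means

ss_processor = n * sum((m - grand) ** 2 for m in col_means)
ss_interaction = sum(
    (y - row_means[i] - col_means[j] + grand) ** 2
    for j, column in enumerate([proc_a, proc_b])
    for i, y in enumerate(column)
)
df_interaction = (n - 1) * (2 - 1)

# F for the processor effect, with the interaction mean square as denominator
f_stat = (ss_processor / 1) / (ss_interaction / df_interaction)

print(t_stat ** 2, f_stat)  # the two values agree up to rounding error
```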
