These problems and solutions cover all topics to the end of the course.

One-Sample Tests and Confidence Intervals

- Use the Hospital Stay data in Table 2.11 on p. 39 of Rosner. (HOSPITAL.DAT, HOSPITAL.DOC on the data disk.) Plot a histogram of duration of stay. Compute the mean duration of stay. Assuming that the standard deviation of duration of stay is 4 days, find a 2-sided 95% confidence interval for the mean duration of stay and find a *p*-value to test the hypothesis that the mean length of stay is 5 days against the alternative that it is longer than that. Test the hypothesis that the standard deviation is 4 days, with a 2-sided 5% test. Without assuming a value for the standard deviation, find a 2-sided 95% confidence interval for the mean duration of stay and find a *p*-value to test the hypothesis that the mean length of stay is 5 days against the alternative that it is longer than that. State your conclusions. Which of the above calculations may be invalid?
- The mean concentration of a solution is supposed to be 60%. You suspect that it might be off by as much as 2% in either direction. From past experience, you know that your analytical method has a standard deviation of 2.6%. How many observations would be required to test this hypothesis at the 5% level, and ensure that the Type II Error Rate is no more than 1%?
- The percentage of voters supporting a given political party was 60% at the last poll. You suspect that this might have changed by as much as 2% in either direction. How many voters should you sample to test this hypothesis at the 5% level, and ensure that the Type II Error Rate is no more than 1%? What would you do if the recommended number is larger than you can afford? If you sampled 3000 voters and 1623 were supporters of that party, give a 2-sided 95% confidence interval for percent support.
- How many independent normal observations would be required to ensure that the upper limit of a 99% confidence interval for the variance is no more than 5 times the lower limit?
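
The sample-size and confidence-interval questions above can be sketched with the usual normal-approximation formulas. This is only a sketch under stated assumptions: the helper names (`n_for_mean`, `n_for_proportion`, `wald_ci`) are made up for illustration, a Type II Error Rate of 1% is read as power of 99%, and the textbook's own formulas may differ in detail (for example, in how the alternative proportion enters the variance).

```python
from math import ceil, sqrt
from statistics import NormalDist


def z(p):
    """Standard normal quantile."""
    return NormalDist().inv_cdf(p)


def n_for_mean(sigma, delta, alpha=0.05, beta=0.01):
    """Approximate n for a 2-sided z-test of a mean with known sigma,
    detecting a shift of size delta with Type II error rate beta."""
    return ceil(((z(1 - alpha / 2) + z(1 - beta)) * sigma / delta) ** 2)


def n_for_proportion(p0, p1, alpha=0.05, beta=0.01):
    """Approximate n for a 2-sided test of p = p0 when the truth is p1."""
    numerator = (z(1 - alpha / 2) * sqrt(p0 * (1 - p0))
                 + z(1 - beta) * sqrt(p1 * (1 - p1)))
    return ceil((numerator / (p1 - p0)) ** 2)


def wald_ci(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    half_width = z(1 - (1 - level) / 2) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Concentration problem: sigma = 2.6, shift of 2 in either direction.
n_mean = n_for_mean(sigma=2.6, delta=2.0)

# Voter problem: a shift to 58% or 62%; take the larger requirement.
n_prop = max(n_for_proportion(0.60, 0.58), n_for_proportion(0.60, 0.62))

# 1623 supporters out of 3000 sampled.
ci_low, ci_high = wald_ci(1623, 3000)

print(n_mean, n_prop, round(ci_low, 4), round(ci_high, 4))
```

The confidence-interval question for the variance needs chi-square quantiles, which the Python standard library does not provide, so it is left out of this sketch.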

- Using the nutrition data described in Table 2.16 on p. 42 of Rosner (VALID.DAT, VALID.DOC on the data disk), compute a *p*-value to test the hypothesis that the mean alcohol consumption from the food frequency questionnaire is the same as from the diet record. Do what you can to test any assumptions you make. State your conclusions.
- Repeat the previous analysis using the sign test (Sect. 9.2, p. 333) to test the hypothesis that the median difference between the two measures is zero. State your assumptions and your conclusions.
- Do the Microbiology example, problems **8.148-8.152** on p. 326 of Rosner. Give an appropriate graphical display of the data. State any assumptions you make and test any assumptions you can test.
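
As a rough illustration of the sign test mentioned above, the following sketch computes an exact 2-sided *p*-value from the binomial distribution. The function name and the example differences are hypothetical (they are not the nutrition data), zero differences are dropped, and the textbook may use a normal approximation instead for large samples.

```python
from math import comb


def sign_test_p_value(differences):
    """Exact 2-sided sign test of H0: median difference = 0.

    Zero differences carry no sign information and are dropped.
    """
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    n_positive = sum(d > 0 for d in nonzero)
    k = min(n_positive, n - n_positive)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for a 2-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical paired differences, for illustration only.
example_differences = [1.2, -0.4, 0.0, 2.1, 0.7, -0.3, 1.5, 0.9, 0.2, 1.1]
p_value = sign_test_p_value(example_differences)
print(round(p_value, 4))
```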

- You have analyzed the Microbiology example, problems **8.148-8.152** on p. 326 of Rosner, as a two-sample t-test. Repeat the analysis, this time as an analysis of variance for a one-factor design. Show that the F statistic in the ANOVA table is the square of the two-sample t statistic and has the same *p*-value. Show that the mean squared error (also called the mean squared residual or residual variance) is the same as the pooled variance estimate in the t-test. The graphical displays, assumptions and conclusions are exactly the same for both analyses.
- Give a 95% confidence interval for the conditional variance of pod weight, given treatment, that is, the residual mean squared error after fitting treatment as a factor.
- Analyze the Obstetrics data on p. 569 of Rosner. Answer problem **12.14** with a comparative box plot and an ANOVA table. Give a 95% confidence interval for the residual variance σ².
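
The relationship between the pooled two-sample t-test and one-factor ANOVA with two groups can be checked numerically. The sketch below uses made-up numbers (not the Rosner data) and hypothetical helper names; it only verifies that F equals t squared and that the mean squared error equals the pooled variance.

```python
from math import isclose, sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, pooled_var


def one_way_anova(groups):
    """F statistic and mean squared error for a one-factor design."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    mse = ss_within / df_within
    return (ss_between / df_between) / mse, mse


# Made-up measurements for two treatment groups, for illustration only.
group_a = [5.1, 4.8, 6.0, 5.5, 5.9]
group_b = [4.2, 4.9, 4.4, 5.0, 4.1, 4.6]

t_stat, pooled_var = pooled_t(group_a, group_b)
f_stat, mse = one_way_anova([group_a, group_b])

print(isclose(f_stat, t_stat ** 2), isclose(mse, pooled_var))
```

Because F with (1, n-2) degrees of freedom is the distribution of the square of a t variable with n-2 degrees of freedom, the 2-sided t-test and the F-test give the same *p*-value.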

- To see if the coverage of light blue latex interior paint depends on either the brand of paint or the brand of roller used, two gallons of each of three brands of paint were applied using each of two brands of roller. Present your results in a two-factor ANOVA table. State your assumptions and your conclusions. Give a 95% confidence interval for the residual variance σ².

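| Paint brand | Roller brand | Coverage (ft²) |
|---|---|---|
| 1 | 1 | 454 |
| 1 | 1 | 460 |
| 1 | 2 | 446 |
| 1 | 2 | 440 |
| 2 | 1 | 446 |
| 2 | 1 | 445 |
| 2 | 2 | 444 |
| 2 | 2 | 449 |
| 3 | 1 | 439 |
| 3 | 1 | 432 |
| 3 | 2 | 442 |
| 3 | 2 | 443 |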
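
For the paint coverage data above, the sums of squares in a two-factor ANOVA with interaction can be sketched by hand. This is an illustrative calculation that relies on the design being balanced (two observations per cell); the variable and helper names are assumptions, and *p*-values are not computed because the Python standard library has no F distribution.

```python
from statistics import mean

# (paint brand, roller brand, coverage in ft^2), copied from the data above.
data = [
    (1, 1, 454), (1, 1, 460), (1, 2, 446), (1, 2, 440),
    (2, 1, 446), (2, 1, 445), (2, 2, 444), (2, 2, 449),
    (3, 1, 439), (3, 1, 432), (3, 2, 442), (3, 2, 443),
]


def level_means(key):
    """Map each level (or cell) to its (mean, count)."""
    groups = {}
    for row in data:
        groups.setdefault(key(row), []).append(row[2])
    return {k: (mean(v), len(v)) for k, v in groups.items()}


grand_mean = mean(y for _, _, y in data)


def ss_of(means):
    """Sum of squares of level means about the grand mean."""
    return sum(n * (m - grand_mean) ** 2 for m, n in means.values())


paint = level_means(lambda r: r[0])
roller = level_means(lambda r: r[1])
cell = level_means(lambda r: (r[0], r[1]))

ss_paint = ss_of(paint)
ss_roller = ss_of(roller)
# Valid as a simple subtraction only because the design is balanced.
ss_interaction = ss_of(cell) - ss_paint - ss_roller
ss_residual = sum((y - cell[(p, r)][0]) ** 2 for p, r, y in data)

df_paint, df_roller = len(paint) - 1, len(roller) - 1
df_interaction = df_paint * df_roller
df_residual = len(data) - len(cell)
mse = ss_residual / df_residual

rows = [("paint", ss_paint, df_paint), ("roller", ss_roller, df_roller),
        ("paint:roller", ss_interaction, df_interaction)]
for name, ss, df in rows:
    print(f"{name:13s} df={df} SS={ss:7.2f} MS={ss / df:7.2f} F={ss / df / mse:6.3f}")
print(f"{'residual':13s} df={df_residual} SS={ss_residual:7.2f} MS={mse:7.2f}")
```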

- Use the Hospital Stay data in Table 2.11 on p. 39 of Rosner. (HOSPITAL.DAT, HOSPITAL.DOC on the data disk.) Plot duration of stay (dependent variable) against age (independent variable). Fit a straight line to the data and add it to the graph. Summarize the fit in an ANOVA table and state your assumptions and conclusions.
- Use the Hospital Stay data in Table 2.11 on p. 39 of Rosner. (HOSPITAL.DAT, HOSPITAL.DOC on the data disk.) Plot duration of stay (dependent variable) against first temperature following admission (independent variable). Fit a straight line to the data and add it to the graph. Summarize the fit in an ANOVA table and state your assumptions and conclusions.

- Use the Hospital Stay data in Table 2.11 on p. 39 of Rosner. (HOSPITAL.DAT, HOSPITAL.DOC on the data disk.) Give a pairs plot and a correlation matrix for the following variables: duration of stay, age, first temperature and first white blood cell count. Fit the model **duration ~ age+temp1+wbc1**. Summarize the fit in an ANOVA table. Plot the observed values against the fitted values and add a diagonal line to the plot. Plot the residuals against the fitted values. State your assumptions and conclusions.
- Continuing with the Hospital Stay data, fit the model **duration ~ temp1+age+wbc1** and give the ANOVA table. Discuss how it differs from the previous fit.

- Analyze the following data from a study of ion-beam-assisted etching of aluminum with chlorine. The independent variable x is chlorine flow and the dependent variable y is the etch rate. Give an appropriate graph. State any assumptions you make and do what you can to test the assumptions. State your conclusions. Predict the etch rate when flow = 1.75. Give a 95% confidence interval for the residual variance.

| x (chlorine flow) | y (etch rate) |
|---|---|
| 1.5 | 23.0 |
| 1.5 | 24.5 |
| 2.0 | 25.0 |
| 2.5 | 30.0 |
| 2.5 | 33.5 |
| 3.0 | 40.0 |
| 3.5 | 40.5 |
| 3.5 | 47.0 |
| 4.0 | 49.0 |
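
A straight-line fit to the etch-rate data above can be sketched with the usual least-squares formulas. This is a minimal illustration with assumed variable names; it gives the point prediction at flow = 1.75 and the residual variance estimate, but not the chi-square-based confidence interval, which needs quantiles the Python standard library does not supply.

```python
from statistics import mean

# Chlorine flow (x) and etch rate (y), copied from the data above.
flow = [1.5, 1.5, 2.0, 2.5, 2.5, 3.0, 3.5, 3.5, 4.0]
etch = [23.0, 24.5, 25.0, 30.0, 33.5, 40.0, 40.5, 47.0, 49.0]

x_bar, y_bar = mean(flow), mean(etch)
sxx = sum((x - x_bar) ** 2 for x in flow)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(flow, etch))

# Least-squares estimates for the line y = intercept + slope * x.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual sum of squares and the residual variance on n - 2 df.
residuals = [y - (intercept + slope * x) for x, y in zip(flow, etch)]
ss_residual = sum(e ** 2 for e in residuals)
df_residual = len(flow) - 2
residual_variance = ss_residual / df_residual

# Point prediction of the etch rate at flow = 1.75.
predicted = intercept + slope * 1.75

print(round(slope, 4), round(intercept, 4), round(predicted, 3))
```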

- Use the Pulmonary Disease data from problems **10.90-10.91** on p. 422 of Rosner. Analyse as a 3 x 2 contingency table and give a right-tail *p*-value. State your conclusions.
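
The chi-square calculation for a 3 x 2 table can be sketched as follows. The counts are hypothetical (the Pulmonary Disease data are not reproduced here) and the function name is made up. With (3-1)(2-1) = 2 degrees of freedom, the right-tail probability of the chi-square distribution has the closed form exp(-X²/2), which is what the sketch relies on.

```python
from math import exp


def chi_square_3x2(table):
    """Pearson chi-square statistic and right-tail p-value for a 3 x 2 table.

    The closed-form p-value exp(-stat / 2) holds only for 2 degrees of freedom.
    """
    if len(table) != 3 or any(len(row) != 2 for row in table):
        raise ValueError("expected a table with 3 rows and 2 columns")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, exp(-stat / 2)


# Hypothetical counts, for illustration only.
counts = [[10, 20], [20, 20], [30, 20]]
stat, p_value = chi_square_3x2(counts)
print(round(stat, 3), round(p_value, 4))
```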

Statistics 2MA3