Statistics 4M03/6M03 - Assignment #2

2003-11-13

Due: 2003-11-20 19:00


Chapter 2

Problems from Srivastava, Methods of Multivariate Statistics.

2.9.24, 2.9.31, 2.9.32

Typo in 2.9.24: It should also say that y3 = x3.

A suggestion for 2.9.24: Simulate 5000 trivariate standard normal observations (x1, x2, x3) and transform them to (y1, y2, y3). Show the trivariate data (y1, y2, y3) in a brush and spin plot. Then make a scatter plot of (y1, y2), scatter plots of (y1, y2) restricted to narrow ranges of y3, and a histogram and a normal QQ plot of y1. These plots will show you what to look for when you derive the distributions theoretically!
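The simulation suggested above might be set up along the following lines. This is only a sketch: `to_y`, `plot_slice` and `show_plots` are hypothetical helper names, the seed and the slice centres are arbitrary, and the body of `to_y` is a placeholder that must be replaced by the transformation stated in problem 2.9.24 (only y3 = x3 is taken from this handout). The brush and spin plot is left out because base R has no rotating 3-D display.

```r
# Sketch only: `to_y` is a hypothetical placeholder, NOT the transformation
# from problem 2.9.24. Replace the commented lines with the textbook's definition.
to_y <- function(x) {
    y <- x
    # y[, 1] <- ...    # y1 as defined in the problem
    # y[, 2] <- ...    # y2 as defined in the problem
    y[, 3] <- x[, 3]   # y3 = x3, per the correction noted above
    colnames(y) <- c("y1", "y2", "y3")
    y
}

set.seed(1)                            # arbitrary seed, for reproducibility
n <- 5000
x <- matrix(rnorm(3 * n), ncol = 3)    # independent N(0, 1) columns
y <- to_y(x)

# Scatter plot of (y1, y2) using only rows whose y3 lies within
# `half_width` of `centre`.
plot_slice <- function(y, centre, half_width = 0.1) {
    keep <- abs(y[, "y3"] - centre) < half_width
    plot(y[keep, "y1"], y[keep, "y2"], xlab = "y1", ylab = "y2",
         main = paste("y3 within", half_width, "of", centre))
    invisible(sum(keep))               # number of points plotted
}

# All the two-dimensional and one-dimensional displays in one call.
show_plots <- function(y) {
    plot(y[, "y1"], y[, "y2"], xlab = "y1", ylab = "y2", main = "All data")
    for (centre in c(-1, 0, 1)) plot_slice(y, centre)
    hist(y[, "y1"], main = "Histogram of y1", xlab = "y1")
    qqnorm(y[, "y1"])
    qqline(y[, "y1"])
}

if (interactive()) show_plots(y)
```

With the placeholder left as it is, every plot simply shows standard normal data; the displays become informative only once the real transformation is filled in.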

Chapter 3

Two of the goodness-of-fit tests developed in Chapter 3 are closely related to the chi-square probability plot method we developed in class:

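normpgof <-
function (x, showrow = FALSE, ...) 
{
    # Chi-square probability plot of the squared distances of the rows of x;
    # rows with missing values are left out of the plot.
    dists <- distsq(x)
    qqchisq(dists[!is.na(dists)], ncol(x), ...)
    # Optionally return the row names ordered by increasing distance.
    if (showrow) 
        rownames(x)[order(dists)]
    else invisible()
}

distsq <-
function (x) 
{
    # Squared Mahalanobis distance of each row of x from the vector of column
    # means, using the sample covariance matrix; missing values are dropped
    # when estimating the means and the covariance.
    dif <- as.matrix(sweep(x, 2, colMeans(x, na.rm = TRUE), "-"))
    # Row-wise quadratic forms: the diagonal of dif %*% S^-1 %*% t(dif),
    # computed without forming the full n x n matrix.
    rowSums((dif %*% solve(var(x, na.rm = TRUE))) * dif)
}

qqchisq <-
function (x, df, ...) 
{
    # Plot the ordered values of x against chi-square quantiles with df
    # degrees of freedom, and add the reference line y = x.
    plot(qchisq(ppoints(length(x)), df), sort(x),
         xlab = paste("Quantiles of Chi-sq (", df, "df)"),
         ylab = "Observed quantiles", ...)
    abline(0, 1)
}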
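
As a quick check of what the chi-square probability plot is built on, the sketch below computes the squared distances in two ways, with the built-in `mahalanobis()` and with the explicit quadratic form, on simulated data. The data, the seed and the sizes `n` and `p` are assumptions chosen for illustration only. It also verifies a useful fact: with the sample covariance matrix (divisor n - 1), the squared distances always sum to p(n - 1).

```r
set.seed(2)                              # arbitrary seed
n <- 200
p <- 3
x <- matrix(rnorm(n * p), ncol = p)      # simulated data, illustration only

# Squared Mahalanobis distances from the sample mean, via the built-in.
d2 <- mahalanobis(x, colMeans(x), var(x))

# The same quantity from the explicit quadratic form.
dif <- sweep(x, 2, colMeans(x), "-")
d2_direct <- diag(dif %*% solve(var(x)) %*% t(dif))
stopifnot(isTRUE(all.equal(d2, d2_direct)))

# Because the covariance is estimated from the same data,
# the distances sum to p * (n - 1) exactly (up to rounding).
stopifnot(isTRUE(all.equal(sum(d2), p * (n - 1))))

# Chi-square probability plot: ordered distances against chi-square quantiles.
if (interactive()) {
    plot(qchisq(ppoints(n), df = p), sort(d2),
         xlab = "Chi-square quantiles", ylab = "Ordered squared distances")
    abline(0, 1)
}
```

For multivariate normal data the points should lie close to the reference line; departures in the upper right corner point to unusually distant rows.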

(a) Modify the code above to compute Wilks' Test for an Outlier (Section 3.2 on p. 58) and apply it to the data of Example 3.2.1 on p. 59.

(b) Modify the code above to do Small's Graphical Method (Section 3.5.1 on p. 69) and apply it to the data of Example 3.6.1 on p. 74.

Suggestion: Put both methods into one function that plots Small's QQ plot and determines whether the largest distance is an outlier. If it is an outlier, use a different plotting symbol for that point. Note that there is an error in the SAS code on page 60: it does not do the test correctly.
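
One reason the two parts fit into a single function is that the largest squared distance already determines the determinant-ratio statistic usually associated with the outlier test in part (a). The sketch below checks that link numerically and shows one way to give the most distant point its own plotting symbol. It is not a solution: it assumes Section 3.2 uses the usual ratio of determinants of the sums-of-squares-and-products matrices without and with the suspect row (check the book's notation), it does not compute critical values or carry out the test, and it plots against chi-square quantiles as in `qqchisq` rather than reproducing Small's method. The data, the seed, the planted row and the helper name `sscp` are all hypothetical.

```r
set.seed(3)                              # arbitrary seed
n <- 30
p <- 2
x <- matrix(rnorm(n * p), ncol = p)      # simulated data, illustration only
x[1, ] <- c(4, -4)                       # plant one unusual row

d2 <- mahalanobis(x, colMeans(x), var(x))
i <- which.max(d2)                       # row with the largest distance

# Sums-of-squares-and-products matrix about the column means.
sscp <- function(z) crossprod(sweep(z, 2, colMeans(z), "-"))

# Determinant ratio computed directly, deleting row i ...
ratio_direct <- det(sscp(x[-i, , drop = FALSE])) / det(sscp(x))
# ... and from the squared distance alone.
ratio_from_d2 <- 1 - n * d2[i] / (n - 1)^2
stopifnot(isTRUE(all.equal(ratio_direct, ratio_from_d2)))

# Open circles for every point, a filled circle for the most distant one.
pch <- rep(1, n)
pch[i] <- 16

if (interactive()) {
    ord <- order(d2)                     # keep symbols aligned with sorted distances
    plot(qchisq(ppoints(n), df = p), d2[ord], pch = pch[ord],
         xlab = "Chi-square quantiles", ylab = "Ordered squared distances")
    abline(0, 1)
}
```

Whether the filled point should actually be declared an outlier still depends on the reference distribution and critical value given in the book.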

Chapter 4

Problems from Srivastava, Methods of Multivariate Statistics.

4.9.2, 4.9.3

