Assignment #2 - Hints

Plotting the ROC Curve

The code for plot.roc() I posted originally (Exercise #3 - Hints) will not work when there are missing values (NA) in the data, as in the lead example. Also, in one question you were asked to compute the area under the ROC curve. I have improved the code for plot.roc() to ignore NAs and to compute and return the area.

The calculation of the area is simple. If (x1, y1) and (x2, y2) are the coordinates of two consecutive points defining the curve, the area under that segment of the curve is the area of the trapezoid with vertices (x1, 0), (x1, y1), (x2, y2), (x2, 0), which is 0.5 * (y2 + y1) * (x2 - x1). The last line of the function computes this for every line segment of the curve and sums the results to get the total area under the curve.
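
As a small check of the trapezoid formula, the sketch below applies it to a curve with only three points. The vectors x and y are made up for illustration and are not taken from the assignment data.

```r
# Hypothetical curve through (0, 0), (0.5, 0.8), (1, 1); values are made up
x <- c(0, 0.5, 1)
y <- c(0, 0.8, 1)
n <- length(x)

# Area of each trapezoid: 0.5 * (y2 + y1) * (x2 - x1)
# x[-1] drops the first point and x[-n] drops the last, so the two
# vectors line up as (right end, left end) of each segment
seg <- 0.5 * (y[-1] + y[-n]) * (x[-1] - x[-n])
area <- sum(seg)

print(seg)    # one area per segment
print(area)   # total area under the curve
```

The two segments contribute 0.5 * 0.8 * 0.5 = 0.20 and 0.5 * 1.8 * 0.5 = 0.45, so the total should come out to 0.65.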

> plot.roc
function (sd, sc)
{
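    # sd: scores used for sensitivity; sc: scores used for 1-specificity
    # sort() drops NAs, so every cutoff is an observed value
    sall <- sort(c(sd, sc))
    # the curve starts at (0, 0)
    sens <- 0
    specc <- 0
    for (i in seq_along(sall)) {
        # proportion of each group at or below the cutoff, ignoring NAs
        sens <- c(sens, mean(sd <= sall[i], na.rm = TRUE))
        specc <- c(specc, mean(sc <= sall[i], na.rm = TRUE))
    }
    plot(specc, sens, xlim = c(0, 1), ylim = c(0, 1), type = "l",
        xlab = "1-specificity", ylab = "sensitivity")
    abline(0, 1)
    # trapezoid rule: sum 0.5 * (y2 + y1) * (x2 - x1) over all segments
    npoints <- length(sens)
    sum(0.5 * (sens[-1] + sens[-npoints]) * (specc[-1] - specc[-npoints]))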
}
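
To see how the function handles NAs without opening a plot window, here is a minimal sketch that repeats the same calculation with no plotting. The name roc.area and the vectors diseased and control are hypothetical and the data are made up; only the arithmetic follows plot.roc() above. The cross-check at the end relies on the fact that the trapezoid area under this kind of curve equals the proportion of (diseased, control) pairs in which the diseased score is lower, with ties counted as one half.

```r
# roc.area is a hypothetical helper, not part of the posted code:
# the same calculation as plot.roc() but without the plot
roc.area <- function(sd, sc) {
    sall <- sort(c(sd, sc))   # sort() drops NAs by default
    sens <- c(0, vapply(sall, function(s) mean(sd <= s, na.rm = TRUE),
        numeric(1)))
    specc <- c(0, vapply(sall, function(s) mean(sc <= s, na.rm = TRUE),
        numeric(1)))
    npoints <- length(sens)
    sum(0.5 * (sens[-1] + sens[-npoints]) * (specc[-1] - specc[-npoints]))
}

# Made-up scores with one missing value in each group
diseased <- c(1, 2, NA, 3)
control <- c(2, 4, 5, NA, 6)

area <- roc.area(diseased, control)

# Cross-check: proportion of pairs with diseased < control,
# counting a tie as one half, after dropping the NAs by hand
d <- diseased[!is.na(diseased)]
k <- control[!is.na(control)]
pairs <- mean(outer(d, k, "<") + 0.5 * outer(d, k, "=="))

print(area)
print(pairs)
```

With these data there are 3 * 4 = 12 pairs, of which 10 have the diseased score lower and one is tied, so both numbers should be 10.5 / 12 = 0.875.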

Statistics 2MA3
Last modified 2001-02-18 09:01