Statistics 2MA3 - Exercise #3

2001-02-07


Plotting ROC Curves

Write a function plot.roc() in R to plot ROC curves like the one we did in class on February 2. Suppose the scores on the diagnostic test for the disease group are in a vector scored and the scores for the control group are in a vector scorec; then

scored <- c(70, 61, 52, 37, 33)
scorec <- c(90, 83, 72, 59, 50)
plot.roc(scored, scorec)

should plot the curve.
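One possible starting point is sketched below. It is not the version done in class. It assumes that low scores indicate disease, so a patient tests positive when the score falls below the cut-off (in the example data the disease group has the lower scores). The helper name roc.points() is hypothetical.

```r
# Hypothetical helper: ROC coordinates for every candidate cut-off.
# Assumption: a patient tests positive when score < cut-off.
roc.points <- function(scored, scorec) {
  # Every observed score is a candidate cut-off; -Inf and Inf pin the
  # curve to the corners (0, 0) and (1, 1).
  cutoffs <- c(-Inf, sort(unique(c(scored, scorec))), Inf)
  # Sensitivity: proportion of the disease group testing positive
  sens <- sapply(cutoffs, function(a) mean(scored < a))
  # 1 - specificity: proportion of the control group testing positive
  fpr <- sapply(cutoffs, function(a) mean(scorec < a))
  data.frame(cutoff = cutoffs, fpr = fpr, sens = sens)
}

plot.roc <- function(scored, scorec) {
  pts <- roc.points(scored, scorec)
  plot(pts$fpr, pts$sens, type = "l", xlim = c(0, 1), ylim = c(0, 1),
       xlab = "1 - specificity", ylab = "Sensitivity", main = "ROC curve")
  abline(0, 1, lty = 2)  # reference line for a test with no discrimination
  invisible(pts)         # return the coordinates without printing them
}
```

Because the function returns the table of cut-offs invisibly, pts <- plot.roc(scored, scorec) both draws the curve and keeps the coordinates for inspection.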

Suppose a diagnostic test score follows a N(50, 100) distribution in the control group and a N(40, 100) distribution in the disease group, where the second parameter is the variance (so the standard deviation is 10). What would the ROC curve look like? Simulate, say, 100 patients from the disease group and 50 patients from the control group and use your function again:

plot.roc(rnorm(100,40,10), rnorm(50,50,10))

How does the ROC curve change if the disease group is closer to the control group? Try N(45, 100). What if it is further away? Try N(30, 100).
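For comparison with the simulated curves, the population ROC curve under this model can be drawn directly. The sketch below assumes that a patient tests positive when the score is below the cut-off and that both groups share the same standard deviation s. Writing D for the distance between the group means and F for the standard normal cumulative distribution function, a cut-off with false positive rate t has sensitivity F(F^-1(t) + D/s). The function name theoretical.roc() and its arguments are hypothetical.

```r
# Hypothetical helper: population ROC curve for two normal groups with a
# common standard deviation s and means D apart (disease group lower).
theoretical.roc <- function(D, s = 10, n = 201) {
  fpr <- seq(0, 1, length.out = n)   # 1 - specificity
  sens <- pnorm(qnorm(fpr) + D / s)  # sensitivity at the same cut-off
  data.frame(fpr = fpr, sens = sens)
}

# Overlay the three separations from the exercise:
# N(45, 100), N(40, 100) and N(30, 100) against a N(50, 100) control group.
plot(c(0, 1), c(0, 1), type = "l", lty = 2,
     xlab = "1 - specificity", ylab = "Sensitivity")
for (D in c(5, 10, 20)) {
  roc <- theoretical.roc(D)
  lines(roc$fpr, roc$sens)
}
```

Simulated curves from plot.roc() should scatter around these smooth curves, more tightly as the sample sizes grow.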

Choosing a Cut-off Point

When the score on a diagnostic test is measured on a continuous scale, we have to choose a suitable cut-off point that gives a good compromise between sensitivity and specificity. Suppose the score is X, with X ~ N(m, s^2) in the control group and X ~ N(m-D, s^2) in the disease group, and suppose that the prevalence of the disease in the population is p. Show that if a cut-off a is used, so that someone is considered to test positive for the disease if X < a and negative otherwise, then the total probability of misclassification is given by the following expression, where F denotes the standard normal cumulative distribution function:

p[1 - F((a - m + D)/s)] + (1 - p) F((a - m)/s)

Hence show that the overall rate of misclassification will be minimized when

a = m - D/2 + (s^2/D) log[p/(1-p)]
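The closed form can be checked numerically before attempting the proof. The sketch below codes the misclassification probability and the cut-off formula, then compares the formula with a direct numerical minimization. The function names misclass() and best.cutoff() are hypothetical, the parameter values are illustrative only, and log is the natural logarithm.

```r
# Total probability of misclassification for cut-off a
# (positive if X < a; pnorm is the standard normal CDF).
misclass <- function(a, m, D, s, p) {
  p * (1 - pnorm((a - m + D) / s)) + (1 - p) * pnorm((a - m) / s)
}

# Closed-form cut-off from the exercise
best.cutoff <- function(m, D, s, p) {
  m - D / 2 + (s^2 / D) * log(p / (1 - p))
}

# Illustrative values only (not part of the exercise)
m <- 50; D <- 10; s <- 10; p <- 0.3

a.formula <- best.cutoff(m, D, s, p)
# One-dimensional numerical minimization over a generous range of cut-offs
a.numeric <- optimize(misclass, interval = c(m - D - 4 * s, m + 4 * s),
                      m = m, D = D, s = s, p = p)$minimum

print(c(formula = a.formula, numeric = a.numeric))
```

When p = 1/2 the log term vanishes and the cut-off sits midway between the two group means, at m - D/2.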

Note that even though this is an "optimal" decision rule, it is not necessarily a good rule. If the test does not discriminate well between the control and disease groups (that is, if D is small relative to s^2) and the disease is rare (that is, if p is small), then the overall rate of misclassification is minimized by assigning everyone to the non-disease group.
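A small numerical illustration of this point follows. The parameter values are assumptions chosen to represent a weakly discriminating test and a rare disease; they do not come from the exercise.

```r
# Illustrative values only: group means 2 units apart, sd 10, prevalence 1%
m <- 50; D <- 2; s <- 10; p <- 0.01

# Cut-off minimizing the overall misclassification rate
a <- m - D / 2 + (s^2 / D) * log(p / (1 - p))

# Proportion of each group that tests positive (X < a)
pos.disease <- pnorm((a - m + D) / s)
pos.control <- pnorm((a - m) / s)

# Overall misclassification rate at this cut-off
error.rate <- p * (1 - pos.disease) + (1 - p) * pos.control

print(c(cutoff = a, pos.disease = pos.disease,
        pos.control = pos.control, error.rate = error.rate))
```

With these values the cut-off falls far below any score either group would plausibly produce, so essentially nobody tests positive and the error rate is essentially the prevalence p: every diseased patient is missed.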

Problems from Rosner

3.68-3.75, 3.94-3.97, 6.75-6.77

