Exercise #1 - Hints

Computing change in price

My data frame was called petrol. I computed the change in price as

petrol$diff <- c(petrol$Sunoco[-1] - petrol$Sunoco[-430], NA)

which could also be written

petrol$diff <- c(petrol$Sunoco[2:430] - petrol$Sunoco[1:429], NA)

Remember that prices change during the day, and the prices were observed early each morning. The first element of diff is therefore the price on day 2 (Wed) minus the price on day 1 (Tues), that is, the amount by which the price changed during day 1 (Tues). With 430 prices, the calculation gives 429 differences. No price change is available for the last day of the series, so I used c() to append NA, because the column of differences has to be the same length as the other columns in the data frame.
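As a small check of the indexing, here is a sketch on a made-up vector of five prices (the values and the names price, change_index and change_diff are illustrative, not from the exercise data). It uses length() instead of the hard-coded 430, and compares the result with the built-in diff() function, which returns the successive differences directly.

```r
# Made-up prices, one per morning (illustrative values only)
price <- c(1.20, 1.25, 1.22, 1.22, 1.30)
n <- length(price)

# Approach from the text: drop the first element, drop the last, subtract,
# then pad with NA so the result has the same length as price
change_index <- c(price[-1] - price[-n], NA)

# Same calculation with diff(), which returns the n - 1 successive differences
change_diff <- c(diff(price), NA)

print(data.frame(price, change_index, change_diff))
```

Either form should work on petrol$Sunoco; the diff() version simply avoids writing the series length into the code.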

The lag plot

You were asked to plot today's price on the Y-axis against yesterday's price on the X-axis, for each day. This is called a lag plot of lag -1. Since plot() takes x as its first argument and y as its second, the correct call will be

plot(petrol$Sunoco[-430], petrol$Sunoco[-1])
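To see why dropping the last element gives "yesterday" and dropping the first gives "today", the following sketch pairs the two vectors on made-up prices before plotting them. The values, the names yesterday, today and lagged, and the dashed reference line are my own additions for illustration.

```r
# Made-up prices (illustrative values only)
price <- c(1.20, 1.25, 1.22, 1.22, 1.30)
n <- length(price)

yesterday <- price[-n]   # days 1 to n - 1
today     <- price[-1]   # days 2 to n

# Each row pairs one day's price with the previous day's price
lagged <- data.frame(yesterday, today)
print(lagged)

plot(yesterday, today, xlab = "Yesterday's price", ylab = "Today's price")
abline(0, 1, lty = 2)    # points on this line had no change in price
```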

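If you load the time series library, you can do the same more easily with the lag.plot() function. (From R 1.9.0 onward the ts package is part of stats, which is loaded automatically, so the library() call is needed only in older versions. In those later versions the argument is spelled lags and takes a positive count, so the call becomes lag.plot(petrol$Sunoco, lags = 1).)

library(ts)
lag.plot(petrol$Sunoco, lag = -1)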

Working with days of the week

Displays like comparative box plots will, by default, put the categories in alphabetical order, so the days of the week will come out as F M R S Su T W.

One work-around is to recode the days Su through S as numbers 1 through 7, or letters a through g. This is a useful exercise, as you should know how to recode data.
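One possible way to do the recoding is with match(), sketched here on a short made-up vector of day codes. The names day_codes, day, day_num and day_letter are illustrative; with the real data you would use petrol$day in place of day.

```r
# Day codes in the order wanted, Sunday through Saturday
day_codes <- c("Su", "M", "T", "W", "R", "F", "S")

# Made-up sample of day codes (illustrative only)
day <- c("T", "W", "R", "F", "S", "Su", "M", "T")

# match() gives the position of each value in day_codes: Su is 1, ..., S is 7
day_num <- match(day, day_codes)

# letters is R's built-in vector "a" to "z"; indexing by day_num gives a to g
day_letter <- letters[day_num]

print(data.frame(day, day_num, day_letter))
```

Any code that does not appear in day_codes (for example "SU") comes back from match() as NA, which is a quick way to spot mistyped values.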

A more elegant method is to force boxplot() to take the categories in the order you want:

boxplot(split(petrol$diff, petrol$day)[c("Su", "M", "T", "W", "R", "F", "S")])

Remember that R is case-sensitive; "Su" and "SU" aren't the same, for example.
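Another common approach, not the one used above, is to store the days as a factor with the levels given in the order you want; plots that group by the factor then follow that order. The sketch below uses a made-up data frame called toy standing in for petrol, with invented values.

```r
# Made-up data standing in for petrol (illustrative values only)
toy <- data.frame(
  day  = c("Su", "M", "T", "W", "R", "F", "S",
           "Su", "M", "T", "W", "R", "F", "S"),
  diff = c(0.00, 0.02, -0.01, 0.03, 0.00, 0.05, -0.02,
           0.01, 0.00, -0.03, 0.02, 0.01, 0.04, -0.01)
)

# Fix the level order once; any value not listed in levels becomes NA,
# so a mistyped code such as "SU" would show up as a missing value
toy$day <- factor(toy$day, levels = c("Su", "M", "T", "W", "R", "F", "S"))

# The formula interface groups diff by day, using the level order
boxplot(diff ~ day, data = toy, xlab = "Day of week", ylab = "Change in price")
```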


Statistics 2MA3 Statistics 3N03
Last modified 2002-01-19