Assignment #1 - Hints

Plotting on logarithmic scales

If you want to plot the time series pcb.solid in data frame niagara on a logarithmic scale, you could use either of the following:

> plot(log(niagara$pcb.solid), type = "l")
> plot(niagara$pcb.solid, type = "l", log = "y")

With the first method, the Y-axis shows natural-log values on a linear scale. With the second method, log = "y", the Y-axis shows the original units on a logarithmic scale. Which do you prefer? You can also specify log = "x" or log = "xy", depending on which axis or axes you want transformed.
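The niagara data frame is not reproduced here, so the following minimal sketch uses a made-up positive series to compare the two methods side by side. The name conc and its simulated values are assumptions for illustration only.

```r
# Hypothetical stand-in for niagara$pcb.solid: a positive, right-skewed series.
set.seed(1)
conc <- exp(rnorm(100, mean = 3, sd = 1))

op <- par(mfrow = c(1, 2))   # two plots side by side
plot(log(conc), type = "l", main = "log(conc), linear Y-axis")
plot(conc, type = "l", log = "y", main = "conc, log = \"y\"")
par(op)                      # restore the previous layout
```

The two curves have the same shape; only the Y-axis labels differ. Keep in mind that zero or negative values cannot be shown on a logarithmic axis.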

Plotting Symbols

If you are getting tired of the default open circles for plotting points, try different values of pch in the plot() command; pch = 16, for instance, gives filled circles.
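To see what is available, one option is to draw the symbols for a range of pch values in a single plot. This is only a sketch; the name codes is introduced here, and the range 0 to 25 is the set of numbered symbols in base R graphics.

```r
# Draw the symbols for pch = 0 through 25 in one row, labelled by number.
codes <- 0:25
plot(codes, rep(1, length(codes)), pch = codes, cex = 1.5,
     ylim = c(0.5, 1.5), xlab = "pch", ylab = "", yaxt = "n")
text(codes, 0.8, labels = codes, cex = 0.7)   # print each code under its symbol
```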

Autocorrelation

Most of the statistical methods we will be studying in this course assume that the observations in a sample are independent, so it is important to be able to tell when that assumption isn't satisfied.

The most common form of dependence arises when observations are made sequentially and observations close together in the sequence are related. For example, if the exchange rate of the Canadian dollar is observed daily and it is trading at an above-average rate today, it will probably also be above average tomorrow. This is called positive lag-1 autocorrelation, where "lag 1" refers to the one-day time step.

Here is a simple exercise to show how autocorrelation may be detected. Begin by generating a sample of 200 independent standard normal observations.

> normind <- rnorm(200)

To generate 200 autocorrelated observations, try an "autoregression" model: make each observation a linear combination of the previous observation and a new independent standard normal error. It can be shown that setting both weights to 1/sqrt(2) gives each observation unit variance and makes the lag-1 autocorrelation equal to 1/sqrt(2), or about 0.71. Note the use of rep() to initialize the vector of observations, and the use of a for() loop to fill in the values after the first one.

> normdep <- rep(0,200)
> normdep[1] <- rnorm(1)
> for(i in 2:200) normdep[i] <- (normdep[i-1]/sqrt(2))+(rnorm(1)/sqrt(2))
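As a rough check on the variance and autocorrelation values, the same recursion can be run for a much longer series. If each observation has variance v, the recursion gives v = v/2 + 1/2, so v = 1, and the covariance between neighbouring observations is v/sqrt(2). The sketch below is a sanity check, not a proof; the names n and normlong are introduced here, and the seed is arbitrary.

```r
set.seed(2)      # arbitrary seed, only so the run can be repeated
n <- 20000       # much longer than 200, so the estimates settle down
normlong <- rep(0, n)
normlong[1] <- rnorm(1)
for (i in 2:n) normlong[i] <- normlong[i - 1] / sqrt(2) + rnorm(1) / sqrt(2)

var(normlong)                     # should be close to 1
cor(normlong[-n], normlong[-1])   # should be close to 1/sqrt(2)
1 / sqrt(2)                       # target value for comparison
```

With only 200 observations, expect the sample values to wander further from these targets.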

Try the following displays and statistics with the independent and dependent samples: observations, histogram, mean, variance, sequence plot, lag-1 scatterplot, correlation coefficient for lag-1 autocorrelation. Which displays and which statistics reveal the autocorrelation? Which ones do not?

> normind
> normdep
> hist(normind)
> hist(normdep)
> mean(normind)
> mean(normdep)
> var(normind)
> var(normdep)
> plot(normind,type="l")
> plot(normdep,type="l")
> plot(normind[-200],normind[-1])
> plot(normdep[-200],normdep[-1])
> cor(normind[-200],normind[-1])
> cor(normdep[-200],normdep[-1])
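R's built-in acf() function, which is not used in the commands above, estimates the autocorrelation at many lags at once. The sketch below regenerates a dependent sample so that it runs on its own (the names x, a and r1 are introduced here) and compares the lag-1 value from acf() with the cor() calculation. The two use slightly different formulas, so expect them to agree only approximately.

```r
set.seed(3)                  # arbitrary seed
x <- rep(0, 200)
x[1] <- rnorm(1)
for (i in 2:200) x[i] <- x[i - 1] / sqrt(2) + rnorm(1) / sqrt(2)

a <- acf(x, plot = FALSE)    # autocorrelations at lags 0, 1, 2, ...
r1 <- a$acf[2]               # element 1 is lag 0, element 2 is lag 1
r1
cor(x[-200], x[-1])          # lag-1 correlation from the shifted copies
plot(a)                      # bars show the autocorrelation at each lag
```

Running the same commands on an independent sample such as rnorm(200) gives a useful contrast.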

Statistics 2MA3
Last modified 2001-01-24 09:03