1998-01-09 (Week 1)

UNIX Prerequisites

Log on to an X-terminal, set up screen and printer graphics, learn the vi editor, learn basic file management commands (ls, mkdir, rmdir, cd, rm, etc.), learn how to move files between different computers.


S-plus Demonstrations

Plot a line graph of the standard normal probability density function and add a title to the graph.

> x <- seq(-3, 3, by = 0.1)
> plot(x, dnorm(x), type = "l", main = "Standard Normal pdf")
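
As a quick check on what dnorm returns, the sketch below compares it with the closed-form density exp(-x^2/2)/sqrt(2pi) on the same grid. It is written in R syntax, which is assumed here to be close enough to S-plus; the names phi and max.diff are illustrative and not part of the course material.

```r
# Grid used in the demonstration above
x <- seq(-3, 3, by = 0.1)

# Closed-form standard normal density (phi is an illustrative name)
phi <- exp(-x^2 / 2) / sqrt(2 * pi)

# Largest absolute difference from the built-in density; should be essentially zero
max.diff <- max(abs(phi - dnorm(x)))
print(max.diff)
```

If the difference is not negligible, the grid or the formula has been mistyped.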
 

Write a function to prepare a grid, compute the standard bivariate normal pdf over the grid, and display it as either a perspective plot or a contour plot.

Note: The scaling factor 1/(2pi) is omitted from this calculation because it does not affect the shape of the distribution. Look at the first few rows of xynor; what did expand.grid do? What is the purpose of the "0" on the last line?

> norm3d
function(view = "p")
{
        xnor <- seq(-3, 3, by = 0.1)
        ynor <- xnor
        xynor <- expand.grid(list(xnor, ynor))
        znor <- matrix(exp(-0.5 * (xynor[, 1]^2 + xynor[, 2]^2)), nrow = length(xnor))
        if(view == "p")
                persp(xnor, ynor, znor)
        if(view == "c")
                contour(xnor, ynor, znor)
        0
}
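
One way to see how the grid and the matrix fit together is to repeat the same steps on a very small grid. The sketch below (R syntax, assumed close to S-plus) restores the 1/(2pi) factor and compares the result with the product of two univariate normal densities computed with outer(). The names xs, ys, grid, z, and z.outer are illustrative only.

```r
# A tiny grid so the rows can be read by eye
xs <- c(-1, 0, 1)
ys <- xs

# Every (x, y) combination; the first column varies fastest
grid <- expand.grid(list(xs, ys))
print(grid)

# Bivariate standard normal pdf, including the 1/(2*pi) factor.
# matrix() fills column by column, so z[i, j] pairs xs[i] with ys[j].
z <- matrix(exp(-0.5 * (grid[, 1]^2 + grid[, 2]^2)) / (2 * pi),
            nrow = length(xs))

# The same surface as the product of two univariate densities
z.outer <- outer(dnorm(xs), dnorm(ys))
print(max(abs(z - z.outer)))
```

The agreement with outer() reflects the fact that the standard bivariate normal pdf factors into two standard normal pdfs.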
 

Write a function to display a series of histograms for samples of size n from the chi-square distribution on 3 degrees of freedom. The number of histograms shown is nr.

Note: To give every histogram the same breakpoints, the breaks run from 0 to 13 in user-specified increments (step), and observations of 13 or more are dropped.

> pdfmovie
function(nr = 1, n = 100, step = 1)
{
        for(i in c(1:nr)) {
                x <- rchisq(n, 3)
                hist(x[x < 13], breaks = seq(0, 13, by = step))
        }
        n
}
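
If step does not divide 13 evenly, seq(0, 13, by = step) stops short of 13, and some retained observations can fall outside the breakpoints. The following is a minimal sketch of one way to guard against this, assuming R-compatible syntax (argument names such as plot may differ in S-plus). The helper name make.breaks is hypothetical and not part of the course code.

```r
# Build breakpoints from 0 to upper, appending upper when the
# regular sequence stops short of it (make.breaks is a hypothetical helper)
make.breaks <- function(step = 1, upper = 13) {
    b <- seq(0, upper, by = step)
    if (max(b) < upper)
        b <- c(b, upper)
    b
}

b1 <- make.breaks(1)   # regular breaks that already end at 13
b2 <- make.breaks(2)   # regular breaks end at 12, so 13 is appended

# Tabulate one sample without drawing it; every retained value is counted
x <- rchisq(100, 3)
h <- hist(x[x < 13], breaks = b2, plot = FALSE)
print(sum(h$counts) == sum(x < 13))
```

Note that the appended final bin is narrower than the others, so such a histogram is better read on the density scale than as raw counts.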
 

Obtain the trees95 data studied in this course last year and do some exploratory data analysis.

Save the data in a file called trees95.txt. Use

> trees95 <- read.table("trees95.txt",header=T,row.names="stand")

to read data from the file into a data frame called trees95. Attach the data frame to your session. Confirm the names of the 5 variables (column names). Look at the first 6 rows. Draw a scatterplot matrix for the 4 quantitative variables (height, diameter, satellite band5, crown cover index), for all stands of trees, for the undamaged stands alone, and for the damaged stands alone.

> attach(trees95)
> dimnames(trees95)[2]
[[1]]:
[1] "ht"   "dbh"  "bnd5" "cci"  "type"
 
> trees95[1:6,]
    ht dbh bnd5        cci      type
1 1.00  NA  102 0.04146902 undamaged
2 1.21  NA   99 0.05835823 undamaged
3 1.21  NA   99 0.06029240 undamaged
4 1.18  NA  102 0.06414085 undamaged
5 1.03  NA  103 0.04427970 undamaged
6 1.06  NA  104 0.05017752 undamaged
 
> pairs(trees95[, c(1:4)])
> pairs(trees95[type == "undamaged", c(1:4)])
> pairs(trees95[type == "damaged", c(1:4)])
 

Draw a box plot to compare crown cover index in the damaged and undamaged stands.

> boxplot(split(cci, type))
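
To see what boxplot receives, it can help to look at the list that split() builds. The sketch below uses a small made-up data frame (the values are invented for illustration and are not the trees95 data) and R-compatible syntax; the names toy and groups are hypothetical.

```r
# Invented values, for illustration only; NOT the trees95 data
toy <- data.frame(cci  = c(0.04, 0.06, 0.05, 0.20, 0.25, 0.30),
                  type = c("undamaged", "undamaged", "undamaged",
                           "damaged", "damaged", "damaged"))

# split() returns a list with one numeric vector per level of type.
# The columns are referenced through the data frame, so attach() is not needed.
groups <- split(toy$cci, toy$type)
print(names(groups))

# A numeric summary of each group, to read alongside the plot
print(sapply(groups, median))

# One box per group, as in the demonstration above
boxplot(groups)
```

Referring to columns as toy$cci rather than attaching the data frame avoids confusion when a variable of the same name already exists in the session.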


What You Should Try Next

Look over the examples above, try them, and see what each detail of each command does.

Practice creating and editing an S function with the vi editor. Use the fix() function if you are using S-plus on data.

Study the scatterplot matrices for the tree data. What do they tell you about the relationships between the four variables? Can the satellite data be used to predict crown cover? What if you also knew the age of each stand? Be prepared to discuss this next week.

Get the 1998 SSC Case Study data sets from the SSC web site, and set them up as data frames in S. Plot some graphs.

