\documentclass{article}
\usepackage[dvips]{graphicx,color}
\usepackage{amsmath}
\usepackage{epsfig}
\usepackage{geometry}
\geometry{letterpaper, left=1in, bottom=1in, right=1in, top=1in,
foot=0.5in, head=0.5in}
\usepackage{Sweave}
\title{a spatial lme example}
\author{Ottar Bj{\o}rnstad}
\date{\today}
\begin{document}
\maketitle

\section*{spatial dependence}
The data contained in {\bf habsel.txt} represents the number of animals captured (\$captures) 
in 71 permanent traps located at co-ordinates \$x and \$y on a small Norwegian island. Also
reported is various habitat variables in the viscinity of each trap. There is a bunch of 
things measured but we will focus on \$lichen (the cover of lichen), \$veg (the total vegetation
cover, \$heather (the cover of heather), \$moss (the cover of moss) and \$stdp (a measure of 
structural complexity).

<<fig=TRUE>>=
habsel=read.table("habsel.txt", header=TRUE)
symbols(x=habsel$x, y=habsel$y, circles=habsel$captures, inches=.1)
@

<<>>=
fit1=lm(log(captures+1)~ lichen+veg+heather+moss+stdp, data=habsel)
summary(fit1)
@

So it looks like there is some abundance-environment association. HOwever, the issue
 with these types of observational data is that there may be strong
spatial autocorrelation that may invalidate a simple regression of abundance against environment.
We can do a quick look at this correlation in residuals from an 'independence' model.
(the log-transformation is to make the capture variable a bit more Gaussian-looking):

<<fig=TRUE>>=
require(ncf)
cres=spline.correlog(x=habsel$x, y=habsel$y, z=resid(fit1), resamp=0)
plot(cres)
@

So there seems to be some spatial autocorrelation. To do the correct model taking
into account the autocorrelation, we use corr-argument
available in mle. First, however, we need to make sure mle understands that all
the data belongs to a single dependence group. And then we fit the model assuming
exponential spatial dependence:

-- OTTAR TALK ABOUT PARAMETRIC CORRELATION FUNCTIONS HERE

<<>>=
require(nlme)
habsel$grp=rep(1, dim(habsel)[1])
fit2=lme(log(captures+1)~ lichen+veg+heather+moss+stdp, data=habsel,
random = ~1|grp,
corr=corSpatial(form=~x+y, type="exponential", nugget=F),
 method="ML")

summary(fit2)
@
Excersize: interpret the results! What does 'range' mean?

We can test for significant spatial correlation by comparing this model with the mle-model without spatial dependence:
<<>>=
fit3=lme(log(captures+1)~ lichen+veg+heather+moss+stdp, data=habsel,
random = ~1|grp, method="ML")
anova(fit2, fit3)
@

Excersize: Fit a new model with 'Gaussian' (not exponential) spatial autorrelation. Which model better fits the data? 

\section*{Some other types of dependence}

\subsection*{random blocks: maternal effects}
The data contained in {\bf mouse.txt} has the following variables (this is a slightly expanded
version of the data we looked at in the Bayes class).

<<>>=
mouse<-read.table("mouse.txt", sep="\t", header=T)
names(mouse)
@
data\$wtX is the weight of offspring at the age of X days; \$sex is 1 = male, 2 = female; \$code is
the indiviudal tag, \$ind is the individual number of the 95 individuals; \$cage is the the
identifier of the litter (as well as the identity of the
mother); \$mweight is the weight of the mother; and \$lsize is the litter size. The biological questions are: (1) Do males and females differ
with respect to their \textit{average} growth. (2) Is there a maternal effect?
Do individuals from the same litter tend to be similar? (3) Do sex effect differ by litter.

let's remove any individual with missing data:
<<>>=
mouse=na.omit(mouse)
@
We might first take an 'empirical' tack on the question (this is not necessary for the lme modelling, but it is interesting).
Lets fit a model for the weight on day 11, ignoring any interdependence between littermates.
Then look whether there is any signature of dependence in the residuals. First fit the model:
<<>>=
fit=lm(wt11~as.factor(sex)+mweight+lsize, data=mouse)
@
To look at autocorrelation recall the definition of correlation: $cor(x,y)=(x-\mu_x) (y-\mu_y)/\sigma_x \sigma_y$.
The following bit of R-code will will calculate the autocorrelation-matrix among all residuals:
<<>>=
scres=(resid(fit)-mean(resid(fit)))/sd(resid(fit))
rcor=outer(scres,scres)
@
And here is one way to flag all littermates (with a 1, and all non-littermates or 'self-comparison' with 0):
<<>>=
mat=outer(mouse$cage, mouse$cage, "-")
mat[mat!=0]=-1
mat=mat+1
diag(mat)=0
@
With this it is easy to calculate the within-litter autocorrelation:
<<>>=
mean(rcor[mat==1])
@
There is clearly a substantial interdependence.

Now lets model the interdependence specifically (and using weight at day 5 as a covariate. To do
this we use the \textbf{lme}-function in the
\textbf{nlme}-package:
<<>>=
library(nlme)
fit=lme(wt11~as.factor(sex)+wt5+mweight+lsize, random=~1|cage, data=mouse)
@
The above is the random-intercept model. \\
Excersize: use summary() to look at the fitted model. What is your interpretation? Advanced Q: What's up
with the degrees-of-freedom for the fixed effects (hint: think of split-plot vs randomized block design)?

Usually we may want to look at whether there is random variability in other parameters. For instance,
we may be interestedin asking: Do mothers who have unusually large female offspring also tend to
have unusually large offspring for the litter size?
To do this (and make sure that
R manages to converge on good values), we have to change some control-parameters in the model:
<<>>=
fit=lme(wt11~as.factor(sex)+wt5+lsize, random=~as.factor(sex)+lsize|cage,
data=mouse, control=lmeControl(maxIter=500,msMaxIter=500, opt="optim"))
@
Excersize: use summary() and random.effects() to answer:
\textit{Do mothers who have unusually large female offspring also tend to
have unusually large offspring for the litter size?}

To examine the model fit, we can use:
<<>>=
plot(fit, wt11 ~ fitted(.) | cage, abline = c(0,1))
@

\subsection*{repeated measures}
Of course, the full data-set is really a repeated measures data set (each individual was weighed on
many occations). mle() can help us the full model if we want to (we first reshape the data):

<<>>=
mouse2=reshape(mouse, idvar="ind", varying=list(names(mouse)[7:18]),
        v.names="wt", direction="long", times=c(0,1,2,3,4,5,7,9,11,13,15, 18))
fit=lme(wt~as.factor(sex)+time+lsize, random=~time|cage/ind,
     data=mouse2, control=lmeControl(maxIter=500,msMaxIter=500, opt="optim"))
@

Iterpret the result (NB! |cage/ind means that individual is nested within cage). Is
most of the random variation in growth at the maternal level or the individual level?
\end{document}