GLMMs: worked examples
========================================================

`r as.character(Sys.time())`

```{r opts,echo=FALSE}
library("knitr")
opts_chunk$set(tidy=FALSE,fig.width=6,fig.height=4)
```

These are worked examples for a forthcoming book chapter on mixed models (in *Ecological Statistics: Contemporary Theory and Application*, editors Negrete, Sosa, and Fox).

I use a very large set of packages in this example, because I am trying out a wide range of approaches. You should whittle these down to just the ones you need for any given analysis ...
```{r pkgs,message=FALSE,warning=FALSE}
## primary GLMM-fitting packages:
library(lme4)
library(glmmADMB)
library(MCMCglmm)
library(blme)
library(MASS)          ## for glmmPQL
library(nlme)          ## for intervals()
## auxiliary
library(ggplot2)       ## for pretty plots generally
## ggplot customization
theme_set(theme_bw())
library(grid)          ## for unit()
zmargin <- theme(panel.margin=unit(0,"lines"))
library(scales)        ## for squish()
library(gridExtra)     ## for grid.arrange()
library(proto)         ## for horizontal line range plot
source("geom-linerangeh.R")  ## for horizontal line ranges
## to squash facets together ...
zmargin <- theme(panel.margin=unit(0,"lines"))
library(coefplot2) ## coefficient plots
library(coda)      ## MCMC diagnostics
library(aods3)     ## overdispersion diagnostics
library(scapeMCMC) ## pretty plots from MCMC fits
library(bbmle)     ## AICtab
library(pbkrtest)  ## parametric bootstrap
library(Hmisc)
## library(plotrix)  ## for plotCI
## library(emdbook)  ## for curve3d()
## for general-purpose reshaping and data manipulation:
library(reshape2)
library(plyr)
library(numDeriv)
```

Package versions used (core packages only):
```{r pkgversion,echo=FALSE}
pkgs_used <- c("lme4","glmmADMB","MCMCglmm","blme",
               "pbkrtest","coefplot2","coda","aods3","bbmle")
vers <- sapply(pkgs_used,function(x) as.character(packageVersion(x)))
print(vers,quote=FALSE)
```

## *Culcita*

### Data exploration

The basic data can be reduced, for the purposes
of this exercise, to a single treatment (`ttt`)
[which consists of combinations of different
symbionts: crab, shrimp, both or neither]; a binary response (`predation`);
and a blocking factor (`block`).

```{r getdat}
load("culcita.RData")
summary(culcita_dat)
```
Confirm that this is a randomized block design with
2 replications per treatment per block:
```{r exptab}
with(culcita_dat,table(ttt,block))
```
(the `ftable()` function can be handy for displaying experimental designs with more than two levels).

Plot summary statistics (mean and bootstrap 95% CI) for treatments, ignoring block structure:

```{r plot1,message=FALSE}
ggplot(culcita_dat,aes(x=ttt,y=predation))+
  stat_summary(fun.data=mean_cl_boot,size=2)+
  ylim(c(0,1))
```

The basic conclusion is that symbionts have a definite protective
effect; the combination of two symbionts seems *slightly* more
protective than a single symbiont.  However, we have to see if
this holds up when we account for among-plot variation.  

I have yet to find a good way to visualize these data that displays
the among-block variation.
This is my best attempt, but it's more artistic than compelling:

```{r plot2,message=FALSE}
ggplot(culcita_dat,aes(x=ttt,y=predation,colour=block,group=block))+
    stat_summary(fun.y=sum,geom="line",alpha=0.4)+
    stat_summary(fun.y=sum,geom="point",alpha=0.7,
                 position=position_dodge(width=0.25))
```

### Fitting

First, fit the model in each framework.

#### lme4::glmer

(The syntax `pkg::fun` refers in general to a function `fun` in the R package `pkg`.)

```{r lme4_0,message=FALSE,cache=TRUE}
cmod_lme4_L <- glmer(predation~ttt+(1|block),data=culcita_dat,
               family=binomial)
```

Take a quick look at the results:
```{r lme4_results}
print(summary(cmod_lme4_L),correlation=FALSE)
```

It would be nice to fit the model with a random effect of treatments across blocks as well, but it takes a long time and warns that it has failed to converge ...
```{r c_block,cache=TRUE}
cmod_lme4_block <- update(cmod_lme4_L,.~ttt+(ttt|block))
```
(we could extend the maximum number of iterations by using (e.g.)
`control=glmerControl(optCtrl=list(maxfun=1e5))`, but it probably
wouldn't help).

If we examine the parameter estimates from this fit, we see various additional indications of trouble. The `crab` parameter and several of the variance-covariance parameters are very large and some of the variance-covariance parameters are very strongly correlated with each other, both indications that the model is overfitted:
```{r c_block_est}
fixef(cmod_lme4_block)
VarCorr(cmod_lme4_block)
```
We can also use model comparison to see that it's not worth
adding the extra terms:
```{r AICcomp1}
AICtab(cmod_lme4_L,cmod_lme4_block,nobs=nrow(culcita_dat))
```
(If we use AICc instead of AIC `cmod_lme4_block` does even worse,
by 6 AICc units ...)

Fitting with Gauss-Hermite quadrature:
```{r c_AGQ}
cmod_lme4_AGQ10 <- update(cmod_lme4_L,nAGQ=10)
```

```{r c_AGQ_all,echo=FALSE,cache=TRUE}
agqfun <- function(i) {
    f <- update(cmod_lme4_L,nAGQ=i)
    c(fixef(f),sqrt(unlist(VarCorr(f))))
}
agqvec <- 1:25
agqres <- sapply(agqvec,agqfun)
```

```{r fix_agqres,echo=FALSE}
dimnames(agqres) <- list(c("intercept","crabs","shrimp",
                           "both","sd.block"),agqvec)
## melt and restore order of parameter labels
m <- transform(melt(agqres),
               Var1=factor(Var1,levels=rownames(agqres)))
```

The next plot (showing the values of each parameter
as a function of the number of quadrature points)
shows that using more quadrature points
does have some effect, but on an absolute
scale the parameters don't vary much: probably the biggest effect
is in the estimated among-block standard deviation, increasing
irregularly from `r round(agqres["sd.block",1],2)` for the
Laplace approximation ($n=1$) to an asymptote of 
`r round(agqres["sd.block",25],2)`.

```{r c_AGQ_plot,fig.width=10,echo=FALSE}
ggplot(m,aes(x=Var2,y=value))+
    geom_line()+facet_wrap(~Var1,scale="free",nrow=1)+
    labs(x="Number of adaptive Gauss-Hermite\nquadrature points")
```

#### glmmADMB

The only difference in moving from `lme4` to `glmmADMB` is that you have to express the `family` argument as a quoted string (i.e., `family="binomial"` rather than `family=binomial`):

```{r glmmADMB_culc}
cmod_gA_L <- glmmadmb(predation~ttt+(1|block),data=culcita_dat,
               family="binomial")
```

The `summary()` for `glmmADMB` is nearly identical.

```{r checkSE,echo=FALSE,eval=FALSE}
library(numDeriv)
## compute SEs from Hessian
hfun <- function(model,incTheta=TRUE,...) {
    dd <- update(model,devFunOnly=TRUE)
    if (!incTheta) {
        dd2 <- dd
        th <- getME(model,"theta")
        dd <- function(x) {
            dd2(c(th,x))
        }
        v <- getME(model,"beta")
    } else     v <- unlist(getME(model,c("theta","beta")))
    h <-  hessian(dd,v,...)/2
    sqrt(diag(solve(h)))
}
h <- hfun(cmod_lme4_L)
h2 <- hfun(cmod_lme4_L,incTheta=FALSE)
cmat <- cbind(h[-1],
              h2,
              coef(summary(cmod_lme4_L))[,"Std. Error"],
              coef(summary(cmod_gA_L))[,"Std. Error"])
colnames(cmat) <- c("hess","hess2","lme4","glmmADMB")
cmat
```

```{r glmmADMB_culc_sum}
summary(cmod_gA_L)
```

#### MCMCglmm

The `MCMCglmm` random effect specification is a little different
from `glmer` and `glmmadmb`: other
differences include
* the Bernoulli (binomial with $N=1$) response is specified as `family="categorical"` (you would code a binomial response with $N>1$ as a two-column matrix of successes and failures, as in `glm()`, and specify `family="multinomial2"`)
* After looking at the diagnostics on the default fit (see below),
I decided to 
    * Set priors. The priors for the variance-covariance matrices are specified
	as *inverse-Wishart* distributions; the corresponding
	[Wikipedia page](http://en.wikipedia.org/wiki/Inverse-Wishart_distribution)
	explains that for a single variance, the inverse-Wishart
	reduces to an inverse-gamma distribution with shape parameter
	$\alpha$ equal to half the Wishart shape ($\nu$) parameter. (The [inverse-gamma distribution](http://en.wikipedia.org/wiki/Inverse_gamma) is in turn a distribution of a random variable $X$ where $1/X$ is [Gamma-distributed](http://en.wikipedia.org/wiki/Gamma_distribution) ...)
	For the observation-level variance (`R`) I used
	a prior variance with mean 1 and a shape parameter of 10, i.e. $1/\sigma^2$ is Gamma-distributed
		with a mean of 1 and coefficient of variance $1/\sqrt{5}$ 
			(This is a moderately strong prior.) For the among-group variance (`G`) I used
	a mean approximately equal to the among-block variance we got
	from `glmer`/`glmmADMB` and a shape parameter of 1 (weak).
	* Specify the number of iterations (`nitt`), number
of iterations to discard (`burnin`), and thinning (`thin`) manually,
to get a much longer chain than the defaults (`nitt=13000`, `thin=10`, `burnin=3000`).

⌛ (60 seconds): *in general I will use this hourglass icon to indicate code that might take a long time to run -- you might want to skip running these chunks.*
```{r MCMCglmm1,cache=TRUE}
cmod_MG0 <- MCMCglmm(predation~ttt,
                    random=~block,data=culcita_dat,
                    family="categorical",verbose=FALSE)
cmod_MG1 <- MCMCglmm(predation~ttt,
                    random=~block,data=culcita_dat,
                    family="categorical",verbose=FALSE,
                    nitt=5e5,burnin=5e4,thin=100)
```
As we will see below, simply running for longer won't fix everything.
```{r MCMCglmm_longrun}
prior.c <- list(R=list(V=1,fix=1),
                G=list(G1=list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))
cmod_MG <- MCMCglmm(predation~ttt,
                    random=~block,data=culcita_dat,
                    family="categorical",verbose=FALSE,
                    nitt=13e4,burnin=3e4,thin=50,
                    prior=list(R=list(V=1,nu=10),
                    G=list(list(V=11,nu=1))))
```

```{r mcmcglmm_sum}
(cmod_MGsum <- summary(cmod_MG))
```

**FIXME**: consider talking about log link here?

### Diagnostics and model summaries

#### lme4

Diagnostic plots:
```{r cmod_lme4_L_diagplot,fig.width=8}
p1 <- plot(cmod_lme4_L,id=0.05,idLabels=~.obs)
p2 <- plot(cmod_lme4_L,ylim=c(-1.5,1),type=c("p","smooth"))
grid.arrange(p1,p2,nrow=1)
```
The only thing that the default (left-hand)
diagnostic plot tells us is that observation \#20
has a (very) extreme residual (we use
`idLabels=~.obs` to get outliers labelled by their
observation number; otherwise, by default, they are
labelled by their groups); if we use `ylim=c(-1.5,1)`
to limit the $y$-range,
we get (on the right) the usual not-very-informative residuals plot expected
from binary data.

### Digression: complete separation

If we exclude observation 20, we see the symptoms of [complete separation](http://www.ats.ucla.edu/stat/mult_pkg/faq/general/complete_separation_logit_models.htm)

**FIXME**: *It would be worth re-doing the fit without the extreme point
to make sure the results are not sensitive.  But this causes severe
instability (because block 10 then gets 100% predation??) ...*
```{r refit_outlier,cache=TRUE,echo=FALSE}
## subset(fortify.lmerMod(cmod_lme4_L),abs(.resid)>10)
newdat <- subset(culcita_dat,abs(resid(cmod_lme4_L,"pearson"))<10)
cmod_lme4_L2 <- update(cmod_lme4_L,data=newdat)
```

```{r}
coefplot2(cmod_lme4_L,intercept=TRUE)
```

We use
```{r constr}
cmod_blme_L2 <- bglmer(predation~ttt+(1|block),data=newdat,
                       family=binomial,
                       fixef.prior = normal(cov = diag(9,4)))
```

(*End of digression*)

We can do a little bit better with boxplots grouped by
treatment (again limiting the range to exclude the outlier).

```{r cmod_lme4_L_boxplot}
plot(cmod_lme4_L,ttt~resid(.,type="pearson"),xlim=c(-1.5,1))
```

Check the random effects:
```{r cmod_ranef}
dotplot(ranef(cmod_lme4_L,condVar=TRUE))
```
There are only a few unique values 
of the random effects because there are only a few
possible configurations per block of predation/no-predation in
each treatment.

It's worth checking `methods(class="merMod")` to see all the things
you can do with the fit (some of them may be a bit obscure ...)
The most important are:
* `fixef()` to extract the vector of fixed-effect parameters (confusingly, 
`coef()` -- which is the accessor method for finding coefficients for
most other models in R -- gives a matrix showing the estimated coefficients
for each block (incorporating the random effects), which I don't find
useful very often)
* `coef(summary(.))` to extract just the table of estimates with standard errors,
$z$ values, and $p$-values
* `VarCorr()` to extract the estimates of the random effect variances
and covariances. (Use `unlist(VarCorr(.))` if you want the variances
as a vector.)
* `ranef()` to extract estimates of the random effects, and `dotplot(ranef(.,condVar=TRUE))` or `qqmath(ranef(.,condVar=TRUE))` to explore them graphically
* `plot()` for diagnostic plots
* `AIC()`, `anova()`, `drop1()`, `confint()`, `profile()` for various statistical tests
* `predict()` for predictions; `simulate()` for simulations based on the model

#### glmmADMB

The methods and diagnostics for `glmmADMB` are similar, although
not quite as well developed.
You can still create the same kinds of diagnostic plots
(I have done this one with `ggplot2` rather than `lattice`):

**FIXME**: *Pearson residuals are quite different -- bug in `glmmADMB` ?
Or ... ??*

```{r glmmADMB_diag}
augDat <- data.frame(culcita_dat,resid=residuals(cmod_gA_L,type="pearson"),
                     fitted=fitted(cmod_gA_L))
ggplot(augDat,aes(x=ttt,y=resid))+geom_boxplot()+coord_flip()
```

#### MCMCglmm

Within an `MCMCglmm` fit the chains are stored in two separate
matrices (`mcmc` objects, actually, but these can be treated
a lot like matrices) called `Sol` (fixed effects) and
`VCV` (variances and covariances).  (Random effects are not
saved unless you set `pr=TRUE`.)
```{r MCMCglmm_chains1}
allChains <- as.mcmc(cbind(cmod_MG0$Sol,cmod_MG0$VCV))
```
The trace plots for the default parameters:
```{r MCMCglmm_diag}
plotTrace(allChains)
```
(`plotTrace` is from the `scapeMCMC` package, and is slightly
prettier than the default trace plot you get from `xyplot(allChains)`,
or the trace+density plot you get from `plot(allChains)`, but they
all show the same information.)
This trace plot is terrible: it indicates at the very least that we
need a longer burn-in period (to discard the transient)
and a longer chain with more thinning (to make the chains
look more like white noise). It's also worrying that the `units`
variance appears to be get stuck near zero (although here the
problem may be with the large transient spikes 
distorting the scale and making the
rest of the chain look non-variable).

Running longer:
```{r MCMCglmm_diag2}
allChains2 <- as.mcmc(cbind(cmod_MG1$Sol,cmod_MG1$VCV))
plotTrace(allChains2,axes=TRUE,las=1)
```

This looks OK, but the magnitudes of the parameters (suppressed
by default from the `tracePlot` results since we're initially
interested only in the pattern, not the magnitude, of the
parameters) suggest some sort of problem.

The fundamental problem here is that, for both technical
and philosophical reasons, `MCMCglmm` always adds an
observation-level variance (referred to in `MCMCglmm`
as the "R-structure", for "residual structure"),
corresponding to an overdispersion term. For binary
data,
$$
\text{logit}(p_i) = m_i + \epsilon_i
$$
where $\epsilon_i \sim N(0,\sigma^2)$
then the *expected value* of $p_i$ is
the average of $\text{logistic}(m_i+\epsilon_i)$.
This is a nonlinear average, and so the mean of $p_i$
is *not* equal to $\text{logistic}(m_i)$ (this
phenomenon is  called [Jensen's inequality](http://en.wikipedia.org/wiki/Jensen%27s_inequality); [Ruel and Ayres 1999](www.dartmouth.edu/~mpayres/pubs/Jensen.PDF) give an ecologist-friendly explanation).
If $\ell(x)=1/(1+\exp(-x))$ is the logistic function then
the mean of $p_i$ is approximately
$$
\overline{p_i} \approx \ell(m_i) + 
\left.\frac{d^2\ell}{dx^2}\right|_{x=m_i} \cdot \sigma^2/2
$$

We won't bother to show the math of $d^2\ell(x)/dx^2$ here:
suffice it to say that this term varies from about +0.1 to -0.1
depending on the baseline value $m_i$, so that the estimates
of $m_i$ and the observation-level variance $\sigma^2$ are
confounded:
```{r d2logis,echo=FALSE}
## deviation of mean(m+eps) from m
## int (m + dL/dm eps + dL^2/d^2m eps^2/s) =
## m + dL^2/d^2m * v/2
## 1/(1+exp(-x))
## dlogis = exp(-x)/(1+exp(-x))^2  = exp(-x) plogis(x)^2
## d2logis = -exp(-x) plogis(x)^2 + exp(-x) *2*plogis(x)*dlogis(x)
##         = exp(-x)*plogis(x)^2*(-1 + 2*dlogis(x)/plogis(x))
##         = dlogis(x)*(2*dlogis(x)/plogis(x)-1)
d2logis <- function(x) dlogis(x)*(2*dlogis(x)/plogis(x)-1)
if (FALSE) {
    ## check
    library(numDeriv)
    grad(dlogis,x=2)
    d2logis(2)
}
par(las=1,bty="l")
curve(d2logis(x),from=-3,to=3,
      xlab=~d^2*L/d*x^2)
abline(h=0,lty=3)
abline(v=0,lty=3)
```

```{r MCMCglmm_diag_sum,echo=FALSE}
bb <- cmod_MGsum$Gcovariances["block",]
bb2 <- summary(cmod_MG1)$Rcovariances["units",]
apvar <- round(bb2["post.mean"]/100)*100
## mean posterior
## 37.3     3.02      106      147
```

In the MCMC example here, the observation-level
variance has drifted to a very large value
($\sigma^2_R \approx$ `r apvar`), and the initial
(variance-free) value of the baseline predation probability was low,
so the variance inflates the mean by a great deal and the 
intercept parameter becomes strongly negative to compensate.

To deal with the confounding between these variables, we
set a fairly strong prior on $\sigma^2_R$ as described above.
The results:
```{r MCMCglmm_diag3}
allChains3 <- as.mcmc(cbind(cmod_MG$Sol,cmod_MG$VCV))
plotTrace(allChains3,axes=TRUE,las=1)
```

Now the magnitudes of the parameters look a little bit more sensible.
The chains for the variances are a little
bit "spiky" because the posterior densities have 
long right tails, so it can be helpful to look at them on a log scale
(it's easier to see the details when the distribution is symmetric):
```{r MCMCglmm_diag_log}
vcChain <- log10(cmod_MG$VCV)
plotTrace(vcChain)
```

The `units` (observation-level) variance is reasonably well-behaved,
but it more or less has to be since its prior is constrained.
The `block` variance is more worrying: the trace plot still shows
some pattern (we want it to be more or less indistinguishable
from white noise), and the
effective sample size (`r round(bb["eff.samp"])`) is much smaller
than our nominal sample size of `r nrow(vcChain)`.

### Inference

We get the values of the parameters and
the (Wald) standard errors, $Z$- and $p$-values
(for `glmer()` and `glmmADMB()`), or 
the (quantile-based) credible intervals and
"pMCMC" values from the `summary()` output, as shown above.

If we want more accurate profile confidence interval,
parametric bootstrap, or MCMC-based confidence intervals
for `glmer()`- or `glmmADMB()`-based analyses, we use
`confint()`:

⌛ (**does this cause problems?? use an inline image instead??)
Compute profile confidence intervals:
```{r CIhide,message=FALSE,cache=TRUE,echo=FALSE,warning=FALSE}
t_lme4_prof <- system.time(cmod_lme4_L_CI_prof <- confint(cmod_lme4_L))
t_lme4_CI_quad <- system.time(
  cmod_lme4_L_CI_q <- confint(cmod_lme4_L,method="Wald"))
t_lme4_CI_boot <- system.time(
  cmod_lme4_L_CI_boot <- confint(cmod_lme4_L,method="boot"))
```

```{r CI,eval=FALSE}
cmod_lme4_L_CI_prof <- confint(cmod_lme4_L)
cmod_lme4_L_CI_quad <- confint(cmod_lme4_L,method="Wald")
cmod_lme4_L_CI_boot <- confint(cmod_lme4_L,method="boot")
```

```{r assemble_cmod,echo=FALSE}
c0 <- setNames(coeftab(cmod_lme4_L)[,c("Estimate","2.5%","97.5%")],
               c("est","lwr","upr"))               
c1 <- rbind(c0,
            data.frame(est=sqrt(unlist(VarCorr(cmod_lme4_L))),
                       lwr=NA,upr=NA))
c1_prof <- c1
c1_prof[,2:3] <- cmod_lme4_L_CI_prof[c(2:5,1),]
c1_boot <- c1
c1_boot[,2:3] <- cmod_lme4_L_CI_boot[c(2:5,1),]
ff <- function(x,CI) {
    data.frame(var=rownames(x),x,CI=CI)
}
c2 <- do.call(rbind,mapply(ff,list(c1,c1_prof,c1_boot),
                           list("Wald","profile","boot"),SIMPLIFY=FALSE))
rownames(c2) <- NULL
c2 <- data.frame(c2,fun="glmer")
c3 <- rbind(setNames(data.frame(coef(cmod_gA_L),confint(cmod_gA_L)),
                     c("est","lwr","upr")),
            block=data.frame(est=unname(sqrt(cmod_gA_L$S$block)),
                       lwr=NA,
                       upr=unname(sqrt(cmod_gA_L$S$block+1.96*cmod_gA_L$sd_S$block))))
c3 <- data.frame(var=rownames(c3),c3,CI="Wald",fun="glmmADMB")
ss <- summary(cmod_MG)
cc <- c("post.mean","l-95% CI","u-95% CI")
c4 <- rbind(setNames(data.frame(rownames(ss$solutions),
                       ss$solutions[,cc]),
                                c("var","est","lwr","upr")),
           setNames(data.frame(var=rownames(ss$Gcovariances),
                               sqrt(ss$Gcovariances[,cc,drop=FALSE])),
                               c("var","est","lwr","upr")))
c4 <- data.frame(c4,CI="MCMC",fun="MCMCglmm")                       
cmod_Results <- rbind(c2,c3,c4)
```

**FIXME**: still needs work! facet? Wald CI for block sd?

The parameter estimates and confidence intervals are reasonably
consistent across packages and algorithms.
* `glmer` and `glmmADMB` give the same point estimates -- reassuringly,
since they are both using (different implementations of) the Laplace
approximation method. `MCMCglmm` gives slightly lower (= more negative)
estimates of the point estimates, and higher estimates of the
among-block standard deviation; this is in part because `MCMCglmm`
is reporting the posterior mean, which for a distribution with a long
right tail is larger than the posterior mode (analogous to the
maximum likelihood estimate).
* the likelihood profile and Wald CIs are similar (`glmmADMB`'s Wald intervals are slightly wider than `glmer`'s, because `glmer`'s calculation does not account for uncertainty due to the random-effects parameters; this is not
really obvious from the plot).
* The `MCMCglmm` confidence intervals are a bit wider, and the parametric bootstrap confidence intervals are extremely wide (they're truncated at $\pm 15$ for display purposes).  For some parametric bootstrap realizations, the simulated predation outcomes are such that complete separation occurs (some blocks, or some treatments, are either always or never attacked by predators), leading to infinite estimates of the parameters (log-odds differences). This is not a big practical difficulty because the direction of the uncertainty is toward more extreme results ...

```{r cmod_plotResults,echo=FALSE,warning=FALSE}
ggplot(cmod_Results,aes(x=var,y=est,ymin=lwr,ymax=upr,
                        colour=fun,linetype=CI))+
    geom_pointrange(position=position_dodge(width=0.5))+
    scale_y_continuous(lim=c(-15,15),oob=squish,expand=c(0,0))+
    coord_flip()
```

## Gopher tortoise

### Data 


```{r gopherdat}
load("gopherdat2.RData")
ggplot(Gdat,aes(x=prev,y=1+shells/Area))+
    stat_sum(aes(colour=factor(year),
                 size=factor(..n..)))+
    scale_size_discrete(range=c(3,6),name="overlap")+
    scale_y_log10()+
    scale_colour_discrete(name="year")+
    geom_line(aes(group=Site),alpha=0.2)+
    labs(x="Seroprevalence",y="1+shells/Area (log scale)")
```
There does indeed seem to be a general upward trend 
in shell density as a function of seroprevalence,
but the rest of the plot is pretty messy.  Lines
connect observations from the same site (in an earlier
version I used `geom_text(aes(label=Site))` to label the points by site:
this is useful for diagnostic purposes, but ugly).

### Fitting

#### glmer

Fit basic model with a log-linear effect of prevalence, an
offset of `log(Area)` (to make the expected number of shells
proportional to `Area`), `year` as a fixed categorical effect,
and `Site` as a random effect.

```{r gopherfit1}
gmod_lme4_L <- glmer(shells~prev+offset(log(Area))+factor(year)+(1|Site),
      family=poisson,data=Gdat)
```

```{r gophersum1}
summary(gmod_lme4_L)
```

This initial summary shows that the site-level variance is exactly
zero (corresponding to pooling our data/ignoring the effect of `Site`:
if you want to inspect this component of the summary by itself,
you can specify `VarCorr(gmod_lme4_L)`).
For comparison, we can also explicitly fit (non-mixed) generalized
linear models with complete pooling and fixed effects of `Site`,
respectively:

```{r gopherglm}
gmod_glm <- glm(shells~prev+offset(log(Area))+factor(year),
      family=poisson,data=Gdat)
gmod_glm2 <- glm(shells~prev+offset(log(Area))+factor(year)+factor(Site),
      family=poisson,data=Gdat)
```

Try the fit with Gauss-Hermite quadrature for comparison
(perhaps GHQ will estimate a non-zero variance?)
```{r gopherfit2}
gmod_lme4_agq <- update(gmod_lme4_L,nAGQ=10)
```

No:
```{r gfit2_vc}
VarCorr(gmod_lme4_agq)
```

```{r gfit_AICtab}
AICtab(gmod_glm,gmod_glm2,gmod_lme4_L,logLik=TRUE)
```
(At present `lme4` computes log-likelihood/AIC
values for models fitted with GHQ that are not
comparable with `glm()` fits or `glmer` Laplace fits,
so we skip `gmod_lme4_agq` in this summary.)

The pooled `glm()` and `glmer()` fits have identical
log-likelihoods, as expected (when the random-effects
variance collapses to 0, `glmer()` is essentially
fitting a pooled model): the `glmer()` fit is
AIC-penalized for an additional parameter (the among-site
variance).  The fixed-effect `glm()` fit has a slightly
better log-likelihood, but not enough to make up for 
9 additional `Site` effect parameters.

Another option in this case is to use `bglmer` from the `blme` package,
which sets a weak prior on the variance to push it away from zero:
```{r blmer1,cache=TRUE}
gmod_blmer_L <- bglmer(shells~prev+offset(log(Area))+factor(year)+(1|Site),
      family=poisson,data=Gdat)
```

```{r blmer_CI,eval=FALSE)
gmod_blmer_prof <- confint(gmod_blmer_L)
```
```{r blmer_CI2,cache=TRUE,message=FALSE}
gmod_blmer_bootCI <- confint(gmod_blmer_L,method="boot")
```

Now we do estimate a positive (but small) among-site variance:
```{r blmer_vc}
VarCorr(gmod_blmer_L)
```

Check for overdispersion: you can do this by hand by computing
`sum(residuals(gmod_lme4_L,"pearson")^2))`, but the `gof()` function
from the `aods3` package is a handy shortcut (it computes overdispersion
based on the deviance (`D` below) and Pearson residuals (`X2`): when
they disagree, use the latter:
```{r checkdisp}
gof(gmod_lme4_L)
```
The sum of squared Pearson residuals is less than
the residual degrees of freedom, so the response
is actually underdispersed.

Just for comparison's sake, we'll fit a model
with an observation-level random effect (logistic-normal
model) anyway:
```{r od}
Gdat$obs <- factor(seq(nrow(Gdat)))
gmod_lme4_L_od <- glmer(shells~prev+offset(log(Area))+factor(year)+(1|Site)+(1|obs),
      family=poisson,data=Gdat)
```

#### glmmADMB
As before, the only thing we have to change for `glmmADMB` is to
specify the family in quotation marks (`family="poisson"`):

```{r gopher_glmmADMB,cache=TRUE}
gmod_gA_L <- glmmadmb(shells~prev+offset(log(Area))+factor(year)+(1|Site),
      family="poisson",data=Gdat)
```

If we want, we have the alternative of fitting a negative-binomial
model, either "type 1" ($\text{Var} = \phi \mu$) or
"type 2" ($\text{Var} = \mu (1 + \mu/k)$):
```{r gopher_glmmADMB2,cache=TRUE}
gmod_gA_L_NB2 <- glmmadmb(shells~prev+offset(log(Area))+factor(year)+(1|Site),
      family="nbinom",data=Gdat)
gmod_gA_L_NB1 <- glmmadmb(shells~prev+offset(log(Area))+factor(year)+(1|Site),
      family="nbinom1",data=Gdat)
```

Unsurprisingly (since we know the data are underdispersed),
the negative binomial model fits don't decrease the AIC:
```{r g_AICtab}
AICtab(gmod_gA_L,gmod_gA_L_NB1,gmod_gA_L_NB2)
```
**FIXME**: why is delta-AIC > 2 ???

#### MCMCglmm

As in the previous example, it turns out we have to run
`MCMCglmm` chains for longer, and specify reasonably
strong priors, in order to get reasonable results.

```{r gmod_MG0,cache=TRUE}
gmod_MG0 <- MCMCglmm(shells~prev+offset(log(Area))+factor(year),
         random=~Site,
      family="poisson",data=Gdat,verbose=FALSE)
```

Run for longer:

```{r gmod_MG,cache=TRUE}
gmod_MG <- MCMCglmm(shells~prev+offset(log(Area))+factor(year),
         random=~Site,
      family="poisson",data=Gdat,verbose=FALSE,nitt=1e5+3e3,thin=1e2)
```

Informative priors (see discussion above; here we set even
stronger priors, with equal variances among observations
and across site):

```{r gmod_MG2,cache=TRUE}
gmod_MG2 <- MCMCglmm(shells~prev+offset(log(Area))+factor(year),
         random=~Site,
       prior=list(R=list(nu=5,V=5),
                  G=list(list(nu=5,V=5))),
      family="poisson",data=Gdat,verbose=FALSE,nitt=1e5+3e3,thin=1e2)
```

### Diagnostics

The standard diagnostic plot is a little better than
in the binary case, but not much:
```{r g_diag1}
plot(gmod_lme4_L)
```

```{r g_diag2}
plot(gmod_lme4_L,Site~resid(.))
```

However, it turns out that those boxplots are really only representing
three values per site (i.e., site $\times$ year combinations): we
can see this if we use `ggplot` to overlay points on the boxplots.
While we're at it, we might as well reorder the `Site` factor to
be in order of mean residuals. (**WARNING**: if you have loaded the
`scapeMCMC` package to get prettier MCMC diagnostic plots such
as `plotTrace`, you have also incidentally loaded the `reorder.factor`
function from the `gdata` package, which changes the default behaviour
of the `reorder` function, and there is a subtle bug in it as well.
You need to make sure to specify `FUN=mean` *and* `sort=sort`
explicitly to make it work ...)
```{r g_diag3}
ff <- fortify(gmod_lme4_L)
ff <- transform(ff,Site=reorder(Site,X=.resid,FUN=mean,sort=sort))
ggplot(ff,aes(x=Site,y=.resid))+geom_boxplot()+
    geom_point(size=4,alpha=0.5)+
    coord_flip()
```
There seems to be a fair amount of among-site variation, but apparently
(at least according to `glmer`) this amount of among-site variation
is still consistent with Poisson variation among otherwise identical
sites ...

We have already checked for overdispersion, above, and found
the residuals to be under- rather than overdispersed.

We can do *posterior predictive simulations* to test
whether the model is behaving like the data in other ways.
For example, looking at the distributions of the numbers
of zero-shell outcomes in simulated data:
```{r plotsims}
sims <- simulate(gmod_lme4_L,nsim=1000)
nzeros <- colSums(sims==0)
par(las=1,bty="l")
plot(pt <- prop.table(table(nzeros)),
     ylab="Probability")
(obszero <- sum(Gdat$shells==0))
points(obszero,0.13,col="red",pch=16,cex=2)
```

A two-sided $p$-value, if we felt one were necessary:
```{r simtest}
sum(pt[names(pt) %in% c(4:9,13:18)])
```

We can easily do this for other characteristics of the data that
we were interested in, such as among-site variance.
In this case we will use the `glm()` fit to simulate and re-fit,
since it's much faster than `glmer()` and suitable for this
task:

```{r simvars,cache=TRUE}
sims2 <- simulate(gmod_glm,nsim=1000)
vfun <- function(x) {
    m_new <- update(gmod_glm,data=transform(Gdat,shells=x))
    Gdat$.resid <- residuals(m_new,"pearson")
    sitemeans <- ddply(Gdat,"Site",summarise,mresid=mean(.resid))
    var(sitemeans$mresid)
}
vdist <- sapply(sims2,vfun)
```

```{r comp_vdist}
Gdat$.glmresid <- residuals(gmod_glm,"pearson")
obs_sitemeans <- ddply(Gdat,"Site",summarise,mresid=mean(.glmresid))
obs_sitevar <- var(obs_sitemeans$mresid)
par(las=1,bty="l")
hist(vdist,breaks=30,col="gray",freq=FALSE,main="",
     xlab="Among-site variance in residuals")
par(xpd=NA) ## prevent point getting cut off at top of plot
points(obs_sitevar,3.1,col="red",pch=16,cex=2)
```

The observed among-site variance is almost exactly what we
would expect from Poisson variation with no true site-to-site
variation.

#### MCMCglmm

We write a small utility function to automate the 
process of combining the fixed-effect (`Sol`) and
variance-covariance (`VCV`) parameter chains and
making the trace plot:
```{r tfun}
tfun <- function(mod) {
    plotTrace(as.mcmc(cbind(mod$Sol,mod$VCV)))
}
```

As suggested above, the default results are again bad.
```{r gtrace0}
tfun(gmod_MG0)
```

Running for longer doesn't help that much:
```{r gtrace1}
tfun(gmod_MG)
```

Looks OK with somewhat informative priors:
```{r gtrace2}
tfun(gmod_MG2)
```

**FIXME**: illustrate prior/posterior densities?
(figure out parameters of corresponding Gamma distribution?
```{r checkprior,eval=FALSE,echo=FALSE}
plot(density(1/gmod_MG2$VCV[,1]))
```
### Results

```{r gmod_confint,cache=TRUE,warning=FALSE}
gmod_CIwald <- confint(gmod_lme4_L,method="Wald")
gmod_CIprof <- confint(gmod_lme4_L,quiet=TRUE)
gmod_CIboot <- confint(gmod_lme4_L,method="boot",quiet=TRUE)
```

```{r assemble_gmod,echo=FALSE}
g0 <- setNames(coeftab(gmod_lme4_L)[,c("Estimate","2.5%","97.5%")],
              c("est","lwr","upr"))               
g1 <- rbind(g0,
            data.frame(est=sqrt(unlist(VarCorr(gmod_lme4_L))),
                       lwr=NA,upr=NA))
g1_prof <- g1
g1_prof[,2:3] <- gmod_CIprof[c(2:5,1),]
g1_boot <- g1
g1_boot[,2:3] <- gmod_CIboot[c(2:5,1),]
ff <- function(x,CI) {
    data.frame(var=rownames(x),x,CI=CI)
}
g2 <- do.call(rbind,mapply(ff,list(g1,g1_prof,g1_boot),
                           list("Wald","profile","boot"),SIMPLIFY=FALSE))
rownames(g2) <- NULL
g2 <- data.frame(g2,fun="glmer")
g3 <- rbind(setNames(data.frame(coef(gmod_gA_L),confint(gmod_gA_L)),
                     c("est","lwr","upr")),
            Site=data.frame(est=unname(sqrt(gmod_gA_L$S$Site)),
                       lwr=NA,
                       upr=unname(sqrt(gmod_gA_L$S$Site+
                       1.96*gmod_gA_L$sd_S$Site))))
g3 <- data.frame(var=rownames(g3),g3,CI="Wald",fun="glmmADMB")
ss <- summary(gmod_MG)
cc <- c("post.mean","l-95% CI","u-95% CI")
g4 <- rbind(setNames(data.frame(rownames(ss$solutions),
                       ss$solutions[,cc]),
                                c("var","est","lwr","upr")),
           setNames(data.frame(var=rownames(ss$Gcovariances),
                               sqrt(ss$Gcovariances[,cc,drop=FALSE])),
                               c("var","est","lwr","upr")))
g4 <- data.frame(g4,CI="MCMC",fun="MCMCglmm")
g5 <- setNames(coeftab(gmod_blmer_L)[,c("Estimate","2.5%","97.5%")],
              c("est","lwr","upr"))               
g5B <- rbind(g5,
            data.frame(est=sqrt(unlist(VarCorr(gmod_blmer_L))),
                       lwr=NA,upr=NA))
g5C <- data.frame(var=rownames(g5B),g5B,CI="Wald",fun="blmer")
g6 <- setNames(coeftab(gmod_glm2)[,c("Estimate","2.5%","97.5%")],
              c("est","lwr","upr"))[1:4,]
g6B <- data.frame(var=rownames(g6),g6,CI="Wald",fun="glm")
gmod_Results <- rbind(g2,g3,g4,g5C,g6B)
```

```{r gmod_plotResults,echo=FALSE,warning=FALSE}
ggplot(subset(gmod_Results,var != "(Intercept)"),
       aes(y=interaction(CI,fun),x=est,xmin=lwr,xmax=upr,
           colour=fun,linetype=CI))+
    geom_linerangeh()+
    geom_point()+
    facet_wrap(~var,scale="free_x")+
    geom_vline(xintercept=0,lwd=1,alpha=0.3)+
    expand_limits(x=0)+
    scale_y_discrete(labels=rep("",5))+
    labs(x="",y="")
```

I've plotted each of the parameters in a separate facet because
their scales are somewhat different (it might also be worth
scaling `prev` by its standard deviation in order to make comparisons,
as suggested by Schielzeth 2010, but the `Site` variable would still
differ somewhat).

* the year effects (which are of least interest) are fairly consistent
among all methods, although `MCMCglmm` interprets the year-2005 parameter
(i.e., the difference between 2004 and 2005) as being significantly
different from zero.
* the effect of prevalence (which is of most interest) is similarly
consistent, with an estimated effect of around 2% increase in shells
per percentage seroprevalence. The `blmer` result, which forces
a slightly positive among-site variance, has a slightly wider confidence
interval; `MCMCglmm` wider still; and the fixed-effect model (i.e., `glm` with `Site` included as a factor) has a quite wide confidence interval -- the
latter is not surprising, as this model is quite overfitted (13 parameters
for only 30 observations).
* The estimated among-`Site` standard deviations are mostly zero (except
for `MCMCglmm` and `blmer`), but the confidence intervals vary widely; profile confidence intervals are the most conservative, followed by `MCMCglmm` and parametric bootstrap (it's not so easy to get the Wald confidence intervals on the random effect variances from the other models).
**FIXME**: *fix, or mention, that `glmmADMB` gives Wald CI on variance, but these include negative values and so break stuff when we try to transform to the square-root scale ...*

## Grouse ticks

```{r tick0}
tickdata = read.table("Elston2001_tickdata.txt",header=TRUE,
colClasses=c("factor","numeric","factor","numeric","factor","factor"))
```

```{r tickplot1}
ggplot(tickdata,aes(x=HEIGHT,y=1+TICKS,colour=YEAR))+
    stat_sum(aes(size=..n..),alpha=0.7)+
    scale_y_log10()+
    scale_size_continuous(breaks=c(2,6,10),range=c(2,7))+
    geom_smooth(method="glm",family=quasipoisson)
```    

The quasi-Poisson GLM fits shown here ignore the grouping structure,
but are useful for structuring the data (although the `HEIGHT`
effect is of secondary interest here -- we're mostly interested
in the relative variances at the individual, brood, and location
levels).

Center the height data (as recommended by 
Gelman and Hill 2006, Schielzeth 2010)
otherwise we will have trouble with both optimization- and
MCMC-based approaches.
```{r cheight}
tickdata <- transform(tickdata,cHEIGHT=HEIGHT-mean(HEIGHT))
```

Because `INDEX`, `BROOD`, and `LOCATION` are uniquely labeled,
we can specify the random effect as *either*
`(1|LOCATION)+(1|BROOD)+(1|INDEX)` or 
`(1|LOCATION/BROOD/INDEX)`, although the latter might be clearer:
```{r t_lme4fit,cache=TRUE}
tmod_lme4_L <- glmer(TICKS~YEAR+cHEIGHT+(1|LOCATION/BROOD/INDEX),
                    family="poisson",data=tickdata)
```
```{r tsum1}
print(summary(tmod_lme4_L),corr=FALSE)
```
This all looks reasonable; centering the height has given
us a sensible intercept (otherwise it would correspond,
ridiculously, to the expected number of ticks per grouse at
sea level), the other parameters are sensible, and the
variance is approximately equally divided between random-effects
levels (when we work with variance decomposition it makes sense
to interpret the random effects on the variance rather than
the standard deviation scale).

```{r t_gAfit,cache=TRUE}
tmod_gA_L <- glmmadmb(TICKS~YEAR+cHEIGHT+(1|LOCATION/BROOD/INDEX),
                      family="poisson",data=tickdata)
```

Since `MCMCglmm` automatically adds an observation-level random
effect, we specify only `BROOD` and `LOCATION`, leaving out `INDEX`:
```{r t_MG_fit,cache=TRUE}
tmod_MG <- MCMCglmm(TICKS~cHEIGHT+YEAR,
                 random=~BROOD+LOCATION,
                 family="poisson",data=tickdata,
                 verbose=FALSE)
```

Once again we will need to try this with an informative prior:
```{r t_MG_fit2,cache=TRUE}
tmod_MG2 <- MCMCglmm(TICKS~cHEIGHT+YEAR,
                 random=~BROOD+LOCATION,
                prior=list(R=list(nu=0,V=1),
                  G=list(LOCATION=list(nu=5,V=1),
                                  BROOD=list(nu=5,V=1))),
                 family="poisson",data=tickdata,
                 verbose=FALSE)
```

For comparison, and because Elston et al. originally used
penalized quasi-likelihood to fit their models, we use
`MASS::glmmPQL` to fit the model.
We don't include `INDEX` in this model either, because
`glmmPQL` also includes a residual error term by default.
```{r t_PQL,cache=TRUE}
tmod_PQL <- glmmPQL(TICKS~cHEIGHT+YEAR,
                    random=~1|LOCATION/BROOD,
                    family="poisson",data=tickdata,
                    verbose=FALSE)
```

### Diagnostics

The diagnostic plot is clearer if we plot the fitted values
on the log scale:
```{r t_residplot0}
plot(tmod_lme4_L,residuals(.) ~log(fitted(.)))
```

`ggplot(fortify(.))` puts the fitted values on the log
scale by default.  Here is a scale-location plot, which
shows some tendency of the variance of the residuals to
decrease with the mean:
```{r t_residplot1}
ggplot(fortify(tmod_lme4_L),
       aes(x=.fitted,y=sqrt(abs(.scresid))))+geom_point()+
    geom_smooth(colour="red",alpha=0.3)
```
**FIXME**: *consider a NB1 model with `glmmADMB`?  Write
`fortify` methods for `glmmADMB`?

We can look at the residuals grouped by location:
```{r t_residplot2}
plot(tmod_lme4_L,LOCATION~resid(.,type="pearson"))
```

`ggplot` makes it a little bit easier to re-order the locations
by mean residual:
```{r t_residplot3}
ff <- fortify(tmod_lme4_L)
ff <- transform(ff,LOCATION=reorder(LOCATION,.resid,fun=MEAN,sort=sort))
ggplot(ff, aes(x=LOCATION,y=.resid))+geom_boxplot(fill="gray")+coord_flip()
```

We can also look at the conditional modes:
```{r t_ranef_dotplot,fig.width=10}
dd <- dotplot(ranef(tmod_lme4_L,condVar=TRUE))
do.call(grid.arrange,c(dd,list(nrow=1)))
```

As we have come to expect, the trace plots for the
default `MCMCglmm` run are a bit problematic:
```{r tmod_MG_diag1}
tfun(tmod_MG)
```

The adjusted run is a bit better:
```{r tmod_MG_diag2}
tfun(tmod_MG2)
```

We can also look at the pairwise distributions, which look OK
(we focus on the variance parameters here):
```{r splom1}
plotSplom(tmod_MG2$VCV,pch=".")
```
(using `pch="."` can be effective when you are looking at large samples).

Or we can look at the density plots
```{r densityplot1,fig.height=4}
plotDens(tmod_MG$VCV,from=0,layout=c(3,1),asp="fill",las=1)
``` 

```{r t_lme4CIprof,cache=TRUE,warning=FALSE}
pp <- profile(tmod_lme4_L)
tmod_lme4_ciprof <- confint(pp)
```

```{r t_lme4ciboot,cache=TRUE,warning=FALSE}
tmod_lme4_ciboot <- confint(tmod_lme4_L,method="boot",
                            nsim=500,quiet=TRUE,seed=101)
```

```{r assemble_tmod,echo=FALSE}
ttfun <- function(cc,lab) {
    data.frame(var=rownames(cc),
               est=cc[,"Estimate"],
               lwr=cc[,"2.5%"],
               upr=cc[,"97.5%"],
               fun=lab)
}
c4 <- ttfun(coeftab(tmod_lme4_L,ptype="vcov"),"glmer")
c4$var <- c("INDEX","BROOD","LOCATION")
c4[,c("lwr","upr")] <- tmod_lme4_ciprof[1:3,]^2
c4$CI <- "profile"
c4B <- c4
c4B[,c("lwr","upr")] <- tmod_lme4_ciboot[1:3,]^2
c4B$CI <- "boot"
cgA <- ttfun(coeftab(tmod_gA_L,ptype="vcov"),"glmmADMB")
cgA$var <- c("LOCATION","BROOD","INDEX")
cgA$CI <- "Wald"
cMG2 <- ttfun(coeftab(tmod_MG2,ptype="vcov"),"MCMCglmm")
cMG2$var <- c("BROOD","LOCATION","INDEX")
cMG2$CI <- "MCMC"
cPQL <- do.call(rbind,intervals(tmod_PQL)$reStruct)
cPQL <- rbind(cPQL,INDEX=data.frame(lower=NA,est.=tmod_PQL$sigma,upper=NA))
cPQL <- setNames(cPQL,c("lwr","est","upr"))
cPQL$var <- rownames(cPQL)
cPQL$fun <- "glmmPQL"
cPQL$CI <- "Wald"
cmod_All <- rbind(c4,c4B,cgA,cMG2,cPQL)
```

```{r t_coefs,echo=FALSE,warning=FALSE}
ggplot(cmod_All,aes(x=var,y=est,ymin=lwr,ymax=upr))+
    geom_pointrange(position=position_dodge(width=0.5),
                    aes(colour=fun,lty=CI))+
    coord_flip()
```


```{r predframe}
## watch out!
pframe <- data.frame(ttt=factor(levels(culcita_dat$ttt),
                                levels=levels(culcita_dat$ttt)))
predict(cmod_lme4_L,re.form=NA,newdata=pframe,type="response")
```

```{r cmod_bootpreds,cache=TRUE}
set.seed(101)
bb <- bootMer(cmod_lme4_L,
              FUN=function(x)
              predict(x,re.form=NA,newdata=pframe,type="response"),
              nsim=500)
```

```{r culcbootci}
t(sapply(1:4,
       function(i)
         boot.ci(bb,type="perc",index=i)$percent[4:5]))
```

```{r culcpbsetup}
culcita_dat <- transform (culcita_dat,
        symb=ifelse(ttt=="none",0,1))
cmod_fit2 <- update(cmod_lme4_L,
            .~symb+(1|block))
anova(cmod_lme4_L,cmod_fit2)
library(pbkrtest)
simfun <- function() {
  x <- simulate(cmod_fit2)
  m1 <- try(refit(cmod_lme4_L,x[[1]]),silent=TRUE)
  m2 <- try(refit(cmod_fit2,x[[1]]),silent=TRUE)
  if (inherits(m1,"try-error") || inherits(m2,"try-error")) NA else
  c(-2*(logLik(m1)-logLik(m2)))
}
set.seed(101)
```
```{r pbcmodcomp,eval=FALSE}
refdist <- replicate(400,simfun())
```

```{r misc}
with(as.data.frame(tmod_MG2$VCV),
     mean(BROOD>units))
```

```{r misc2}
with(as.data.frame(tmod_MG2$VCV),
     summary(BROOD/units))
with(as.data.frame(tmod_MG2$VCV),
     quantile(BROOD/units,c(0.025,0.975)))
```

```{r cmodcomp}
obsval <- -2*(logLik(cmod_lme4_L)-logLik(cmod_fit2))
mean(c(refdist,obsval)<=obsval,na.rm=TRUE)
```

## TO DO
* finish tick example
* check with latest `lme4` version
* add model comparison (`drop1`, `PBmodcomp`, etc.) examples
* fix up `MCMCglmm` according to J. Hadfield's suggestions
* address complete separation in *Culcita* example, via `blme` or `MCMCglmm` priors?
* add tundra example?
* cAIC example??

## References

## JUNK

```{r junk1}
options(digits=3)
coef(summary(gmod_lme4_L))["prev",]
coef(summary(gmod_gA_L))["prev",]
summary(gmod_MG)$solutions["prev",]
```