\documentclass{article} \usepackage{sober} \newcommand{\code}[1]{{\tt #1}} \newcommand{\fixme}[1]{{\color{red}\textbf{#1}}} \title{Exploring \code{mcmcsamp} results} \author{Ben Bolker} \begin{document} \maketitle \SweaveOpts{out.width="\\textwidth",fig.align="center",tidy=FALSE,message=FALSE,warning=FALSE} <>= library(lme4.0) library(MCMCglmm) library(ggplot2) theme_update(theme_bw()) set.seed(1001) ## for mcmcsamp() reproducibility library(gridExtra) vc <- function(x) { invisible(print(lme4:::formatVC(VarCorr(x), useScale=TRUE), quote=FALSE)) } @ \section{Introduction} This document is an attempt at a quick (??) exploration of the behavior of \code{mcmcsamp}, and a comparison against the results from the same model fitted in \code{MCMCglmm} %% --- which, except for the priors and any coding mistakes on my part, should be fitting the same model For reference, the priors used in \code{MCMCglmm} (based on the description in \code{?MCMCglmm} are $N(\mu=0,\sigma^2=10^{10}\boldmath{I})$ for the fixed-effect parameters and $\mbox{Inv-Wishart}(\nu=0,\Sigma=\boldmath{I})$ for the random-effect parameters. In most of the cases below I would be surprised if the priors made a big difference, because these examples are generally well-behaved/well-defined cases where I would expect the data to be determining the variance components pretty strongly \ldots \section{Code} Functions to extract variances for fitted and MCMC variances. The functions will only work sensibly for problems with a single variance component per grouping factor, but these are also the only ones that \code{mcmcsamp} can handle, so we should be safe \ldots ? These codes started out trying to be simple but ending up getting more complicated as I tried to make them more general \ldots <>= get_pvars <- function(object) { v <- c(unlist(VarCorr(object)),units=sigma(object)^2) names(v) <- c(names(getME(object,"theta")),"units") v } get_mcvars <- function(object,mcmc,mcmcglmm=NULL) { ## get pvars, convert to 1-row data frame pvars <- as.data.frame(rbind(get_pvars(object))) ## get mcmc vars, drop fixed effects, rename nfix <- length(fixef(object)) m0 <- as.data.frame(mcmc,type="varcov")[,-(1:nfix)] mmvars <- setNames(m0,names(pvars)) L <- list(fitted=pvars,mcmc=mmvars) if (!is.null(mcmcglmm)) { m1 <- as.data.frame(mcmcglmm$VCV) ## re-arrange column names to match lmer regexstr <- "([[:alpha:]_\$\$]+)\\.([[:alpha:]_\$\$]+)" names(m1) <- gsub(regexstr,"\\2.\\1",names(m1)) f1 <- rbind(colMeans(m1)) ## combine mcmc and lmer results L$mcmc <- rbind(data.frame(method="mcmcsamp", step=seq(nrow(L$mcmc)), L$mcmc,check.names=FALSE), data.frame(method="MCMCglmm", step=seq(nrow(m1)),m1,check.names=FALSE)) L$fitted <- rbind(data.frame(method="mcmcsamp", L$fitted,check.names=FALSE), data.frame(method="MCMCglmm",f1,check.names=FALSE)) } L } @ While there certainly could be mistakes in these extraction functions, I have kept them as simple as possible, and the results match those printed by \code{summary()}, \code{VarCorr()}, etc., as well as the results of using \code{getME(.,"theta")} to extract the raw parameters (Cholesky components, which have to be multiplied by \code{sigma(.)} and squared to get the RE variances). Code for producing plots etc. is hidden; look at the original \code{Rnw} file if you want the gory details. <>= ## switch variables #1 and #3 for mcmcsamp ONLY switch_vars <- function(x,w="variable",v=c(1,3)) { ff <- function(i) { x[[w]]==levels(x[[w]])[v[i]] & x[["method"]]=="mcmcsamp" } v1 <- ff(1) v2 <- ff(2) tmp <- x[v2,"value"] x[v2,"value"] <- x[v1,"value"] x[v1,"value"] <- tmp x } do_plots <- function(v1,truevals=NULL,switchfun=identity,maxvals=NULL) { require(reshape2) require(ggplot2) mm1Mr <- mm1M <- melt(v1$mcmc,id.var=c("method","step")) if (!is.null(maxvals)) { ## split data frame and apply maxima as necessary ## don't know how to do this conveniently with plyr mm1Ms <- split(mm1M,mm1M$variable) for (i in levels(mm1M$variable)) { if (!is.na(maxvals[i])) { mm1Ms[[i]] <- subset(mm1Ms[[i]],value<=maxvals[i]) }} mm1Mr <- do.call(rbind,mm1Ms) } mm1F <- melt(v1$fitted) g_trace <- ggplot(switchfun(mm1M),aes(x=step,y=value,colour=factor(method)))+ geom_line(alpha=0.8)+ facet_wrap(~variable,scale="free",ncol=1)+ geom_hline(data=switchfun(mm1F), aes(yintercept=value,colour=factor(method)))+ opts(legend.position="bottom",nrow=1) ## don't know why colour aesthetic is required here ## geom_point(data=subset(mm1M,step==1),colour="blue") if (!is.null(truevals)) { g_trace <- g_trace + geom_hline(data=truevals, aes(yintercept=value), colour="purple") } g_dens <- ggplot(switchfun(mm1Mr), aes(x=value,colour=factor(method))) + geom_density()+ facet_wrap(~variable,scale="free",ncol=1)+ geom_vline(data=switchfun(mm1F), aes(xintercept=value, colour=factor(method)))+ geom_point(data=switchfun(subset(mm1M,step==1)),y=0)+ opts(legend.position="bottom",nrow=1) if (!is.null(truevals)) { g_dens <- g_dens + geom_vline(data=truevals, aes(xintercept=value),colour="purple") } list(g_trace,g_dens) } @ \section{Examples} \subsection{sleepstudy} <>= fm1 <- lmer(Reaction ~ Days + (1|Subject) + (0+Days|Subject), sleepstudy) vc(fm1) mm1 <- mcmcsamp(fm1,1000) fm1G <- MCMCglmm(Reaction ~ Days, random=~idh(1+Days):Subject, data=sleepstudy, verbose=FALSE) @ <>= v1 <- get_mcvars(fm1,mm1,fm1G) @ (The \code{MCMCglmm} expression could be written more simply as \verb!~Subject!: in general I am writing out the fully explicit random effects specification so that the variable names \code{MCMCglmm} uses match those given by \code{lmer}.) In these plots (and those to follow) the left-hand column shows the trace plots, the right the posterior densities; the red vertical and horizontal lines are the fitted values from the original model, the blue points (in all cases identical to the red lines) show the starting points of the MCMC chains. (In the one simulated example below, purple lines indicate the true variances.) The results for this first (\code{sleepstudy}) example seem reasonable, although the \code{mcmcsamp} results for the intercept term (first row) don't match \code{MCMCglmm} particularly well. (The fitted value from \code{lmer} does, so the problem seems to be less with \code{lmer} in general than with what \code{mcmcsamp} is doing \ldots) (first row) is quite close to the edge of its putative posterior distribution. <>= gg1 <- do_plots(v1) grid.arrange(gg1[[1]],gg1[[2]],ncol=2) @ We could produce a scatterplot matrix <>= splom(subset(v1$mcmc,select=-c(step,method)),pch=".") @ but it's not very informative in this case --- and I haven't yet gone to the trouble of superimposing the fitted values. (Hereafter the code is suppressed in the output, since it's practically the same.) <>= do_all <- function(fit,n=1000,MCMCglmmfit=NULL,truevals=NULL,...) { mm2 <- mcmcsamp(fit,n) v2 <- get_mcvars(fit,mm2,MCMCglmmfit) gg2 <- do_plots(v2,truevals,...) grid.arrange(gg2[[1]],gg2[[2]],ncol=2) } @ \subsection{Dyestuff} <>= fm2 <- lmer(Yield ~ 1|Batch, Dyestuff) fm2M <- MCMCglmm(Yield ~ 1,random=~idh(1):Batch, data=Dyestuff,verbose=FALSE) vc(fm2) @ This one looks OK too (I'm still slightly puzzled that the fitted value doesn't align with the modes of the distribution, but maybe that's a marginal vs. unconditional maximum value issue?) (Note that I have thrown away extreme values ($>10000$) of the batch variance in the density, for easier visualization.) The \code{MCMCglmm} results for the batch variance actually look a little wonky --- is the problem more poorly constrained than I think? <>= do_all(fm2,MCMCglmmfit=fm2M,maxvals=c("Batch.(Intercept)"=10000)) @ \subsection{Penicillin} <>= fm3 <- lmer(diameter ~ (1|plate) + (1|sample), Penicillin) fm3M <- MCMCglmm(diameter ~ 1,random=~idh(1):plate+idh(1):sample, data=Penicillin,verbose=FALSE) vc(fm3) @ Now it gets weird. I could very well be doing something dumb, but I have checked things multiple times and can't see it. It \emph{looks} like the series for the plate intercept and the residual variance could be switched (the mean of the plate intercept term MCMC density shown here is about 0.25, which fits pretty well with fitted residual variance; the MCMC density for the residual variance is $\approx 0.6$, which fits the fitted plate-intercept variance). (Restricting sample-intercept variances to $<20$ in the density.) <>= do_all(fm3,MCMCglmmfit=fm3M,maxval=c("sample.(Intercept)"=20)) @ If we switch the units and plate-intercept terms for the \code{mcmcsamp} things look slightly more reasonable (the \code{mcmcsamp} gets run again, so it won't be identical). <>= do_all(fm3,MCMCglmmfit=fm3M,maxval=c("sample.(Intercept)"=20), switchfun=switch_vars) @ Double-checking a relatively raw form with less possibility for accidental switching: <>= summary(as.data.frame(mcmcsamp(fm3,1000), type="varcov")) @ The other evidence that I haven't gotten something backward is that the MCMC chains do \emph{start} from the correct value that matches the fitted result --- and what's up with the sample intercept ??? \subsection{Pastes} <>= fm4 <- lmer(strength ~ (1|batch) + (1|sample), Pastes) fm4M <- MCMCglmm(strength ~ 1, random=~idh(1):batch + idh(1):sample, data=Pastes,verbose=FALSE) vc(fm4) @ Here, again, it seems as though the first and third components might be switched. The \code{MCMCglmm} fit for the batch intercept isn't mixing very well, so it's hard to see the density plot, but the horizontal line in the trace plot indicates here that \code{mcmcsamp} is actually doing a good job \ldots <>= do_all(fm4,MCMCglmmfit=fm4M) @ Try it with variables \#1 and \#3 switched \ldots <>= do_all(fm4,MCMCglmmfit=fm4M,switchfun=switch_vars) @ Comparing direct calculation: <>= colMeans(as.data.frame(mcmcsamp(fm3,1000), type="varcov"))[-1] @ \subsection{cake} <>= cake <- transform(cake,rr=interaction(recipe,replicate)) fm5 <- lmer(angle ~ recipe * temperature + (1|rr), cake) fm5M <- MCMCglmm(angle ~ recipe * temperature, random=~idh(1):rr, data=cake,verbose=FALSE) vc(fm5) @ <>= do_all(fm5,MCMCglmmfit=fm5M) @ Way off again. Switching variables doesn't really fix it: <>= do_all(fm5,MCMCglmmfit=fm5M,switchfun= function(x) switch_vars(x,v=1:2)) @ \subsection{A constructed example} From Laurent Stephane: <>= set.seed(666) sims <- function(I, J, sigmab0, sigmaw0){ Mu <- rnorm(I, mean=0, sd=sigmab0) y <- c(sapply(Mu, function(mu) rnorm(J, mu, sigmaw0))) data.frame(y=y, group=gl(I,J)) } I <- 20 # number of groups J <- 20 # number of repeats per group sigmab0 <- 1 # between standard deviation sigmaw0 <- 2 # within standard deviation dat <- sims(I, J, sigmab0, sigmaw0) @ How typical is this example/how much among-realization variability is there? Trying another example \ldots <>= fm5 <- lmer(y~(1|group),data=dat) fm5M <- MCMCglmm(y~1, random=~idh(1):group, data=dat, verbose=FALSE) vc(fm5) @ In this case the results look not-too-unreasonable --- although who knows what would happen if I used two random effects? <>= tvals <- data.frame(variable=factor(names(get_pvars(fm5))), value=c(1,4), method=rep("mcmcsamp",2), stringsAsFactors=FALSE) do_all(fm5,MCMCglmmfit=fm5M, truevals=tvals) @ <>= set.seed(667) dat2 <- sims(I, J, sigmab0, sigmaw0) fm5B <- lmer(y~(1|group),data=dat2) fm5MB <- MCMCglmm(y~1, random=~idh(1):group, data=dat2, verbose=FALSE) do_all(fm5B,MCMCglmmfit=fm5MB, truevals=tvals) @ To follow this up I should really run a large number of simulations and show the distribution of point estimates/distribution of HPD intervals or quantiles/coverage \ldots \section{The bottom line} It's still not clear to me whether the apparent differences between the fitted values from \code{lmer} and (1) the \code{mcmcsamp} results and (2) the \code{MCMCglmm} results are due to (a combination of): \begin{itemize} \item a bug in the \code{mcmcsamp} code that switches the order of the theta (random effects) parameters in some cases? \item differences in priors between \code{lmer} and \code{MCMCglmm} \item differences in the Bayesian answer (mean/HPD interval of marginal posterior distribution) and MLE answer \item per-realization variability (i.e. for the real data, the answer might look different for a different realization of the same system; for simulations, we can do this precisely \end{itemize} \section{Session info} <<>>= sessionInfo() @ \end{document}