**Excerpt** from Bolker, B. M., 2015. Linear and generalized linear mixed models. In G. A. Fox, S. Negrete-Yankelevich, and V. J. Sosa (eds.), *Ecological Statistics: Contemporary theory and application*. Oxford University Press. ISBN 978-0-19-967255-4. In press.

https://global.oup.com/academic/product/ecological-statistics-9780199672547?cc=ca&lang=en&


**Random effects**


The traditional view of random
effects is as a way to do correct statistical tests when some observations are
correlated. When samples are collected in groups (within sites in the tundra example
above, or within experimental blocks of any kind), we violate the assumption of
independent observations that is part of most statistical models. There will be
some variation within groups (σ²_within)
and some among groups (σ²_among);
the total variance is σ²_total = σ²_within + σ²_among;
and therefore the correlation between any two observations in the same group is
ρ = σ²_among/(σ²_within + σ²_among)
(observations that come from *different* groups are uncorrelated). Sometimes
one can solve this problem easily by taking group averages. For example, if we
were testing for differences between deciduous and evergreen trees, where every
member of a species has the same leaf habit, we could simply calculate species'
average responses, throwing away the variation within species, and do a *t*-test between the deciduous and
evergreen species means. If the data are balanced (i.e., if we sample the same
number of trees for each species), this procedure is exactly equivalent to
testing the fixed effect in a classical mixed model ANOVA with a fixed effect
of leaf habit and a random effect of species. This approach correctly
incorporates the facts that (1) repeated sampling within species reduces the
uncertainty associated with within-group variance, but (2) we have fewer *independent* data points than
observations – in this case, as many as we have groups (species) in our
study.
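The variance decomposition above can be made concrete with a tiny helper (a sketch in Python; the function name is my own, not from the chapter):

```python
def intraclass_correlation(var_within: float, var_among: float) -> float:
    """Correlation between two observations drawn from the same group,
    given the within-group and among-group variance components."""
    return var_among / (var_within + var_among)

# As var_among shrinks toward 0, observations become independent (rho -> 0);
# as var_among grows large relative to var_within, rho approaches 1.
print(intraclass_correlation(3.0, 1.0))  # 0.25
print(intraclass_correlation(3.0, 0.0))  # 0.0
```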

These basic ideas underlie all classical mixed model ANOVA analyses, although the formulas get more complex when treatments vary within grouping variables, or when different fixed effects vary at the levels of different grouping variables (e.g., randomized-block and split-plot designs). For simple nested designs, simpler approaches like the averaging procedure described above are usually best (Murtaugh 2007). However, mixed model ANOVA is still extremely useful for a wide range of more complicated designs, and as discussed below, traditional mixed model ANOVA itself falls short for cases such as unbalanced designs or non-Normal data.

We can also think of random
effects as a way to combine information from different levels within a grouping
variable. Consider the tundra ecosystem example, where we want to estimate
linear trends (slopes) across time for many sites. If we had only a few years
sampled from a few sites, we might have to *pool*
the data, ignoring the differences in trend among sites. Pooling assumes that σ²_among (the variance in slopes among
sites) is effectively zero, so that the individual observations are
uncorrelated (ρ = 0). On the other hand, if we had many years
sampled from each site, and especially if we had a small number of sites, we might
want to estimate the slope for each site individually, or in other words to
estimate a fixed effect of time for each site. Treating the grouping factor (site)
as a fixed effect assumes that information about one site gives us no
information about the slope at any other site; this is equivalent, for the
purposes of parameter estimation, to treating σ²_among
as infinite. Treating site as a random effect compromises between the extremes
of pooling and estimating separate (fixed) estimates; we acknowledge, and try
to quantify, the variability in slope among sites. Because the trends are
assumed to come from a population (of slopes) with a well-defined mean, the predicted
slopes in CO_{2} flux for each site are a weighted average between the trend
for that site and the overall mean trend across all sites; the smaller and
noisier the sample for a particular site, the more its slope is compressed toward
the population mean (Figure 13.1). For technical reasons, these values (the deviation
of each site's value from the population average) are called **conditional modes**, rather than *estimates*. The conditional modes are also
sometimes called *random effects*, but
this could also refer to the grouping variables (the sites themselves, in the
tundra example). Confusingly, both the conditional modes and the estimates of
the among-site variances can be considered parameters of the random effects
part of the model. For example, if we had independently estimated the trend at
one site (i.e. as a fixed effect) as -5 grams
C/m^{2}/year, with an estimated variance of 1, while the mean rate of
all the sites was -8 g C/m^{2}/year with an among-site variance of 3,
then our predicted value for that site would be
(μ_site/σ²_within + μ_overall/σ²_among)/(1/σ²_within + 1/σ²_among)
= (-5/1 + (-8)/3)/(1/1 + 1/3) = -5.75 g C/m^{2}/year. Because σ²_within < σ²_among -- the trend estimate for
the site is relatively precise compared to the variance among sites -- the random-effects
prediction is closer to the site-specific value than to the overall mean. (Stop
and plug in a few different values of among-site variance to convince yourself
that this formula agrees with the verbal description above of how variance-weighted
averaging works when σ²_among is either very small or very large relative
to σ²_within.)
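This precision-weighted compromise is easy to sketch as a small Python function (the function name is my own); plugging in the values from the worked example reproduces the prediction of -5.75:

```python
def shrunk_prediction(m_site, var_within, m_overall, var_among):
    """Precision-weighted average of a site-level estimate and the
    population mean: each value is weighted by its inverse variance."""
    w_site = 1.0 / var_within   # precision of the site-specific estimate
    w_pop = 1.0 / var_among     # precision implied by the among-site variance
    return (m_site * w_site + m_overall * w_pop) / (w_site + w_pop)

# Values from the worked example in the text:
print(shrunk_prediction(-5.0, 1.0, -8.0, 3.0))  # approximately -5.75

# Large among-site variance: prediction stays near the site estimate.
print(shrunk_prediction(-5.0, 1.0, -8.0, 1e6))  # approximately -5.0
```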

Random effects are especially useful when we have (1) lots of levels (e.g. many species or blocks), (2) relatively little data on each level (although we need multiple samples from most of the levels), and (3) uneven sampling across levels (Box 13.1).

** **

Frequentists and Bayesians define
random effects somewhat differently, which affects the way they use them. Frequentists
define random effects as categorical variables whose levels are chosen *at random from a larger population*, e.g. species chosen at random from a list
of endemic species. Bayesians define random effects as sets of variables whose
parameters are drawn from a distribution. The frequentist definition is
philosophically coherent, and you will encounter researchers (including
reviewers and supervisors) who insist on it, but it can be practically problematic.
For example, it implies that you can't use species as a random effect when you
have observed *all* of the species at
your field site -- since the list of species is not a sample from a larger
population -- or use year as a random effect -- since researchers rarely run
an experiment in randomly sampled years: they usually use either a series of
consecutive years, or the haphazard set of years when they could get into the
field. This problem applies to both the gopher tortoise and tick examples, each
of which uses data from consecutive years.

BOX 13.1.

You may want to treat a predictor variable as a random effect if you:

- don't want to test hypotheses about differences between responses at particular levels of the grouping variable;

- do want to quantify the variability among levels of the grouping variable;

- do want to make predictions about unobserved levels of the grouping variable;

- do want to combine information across levels of the grouping variable;

- have variation in information per level (number of samples or noisiness);

- have levels that are randomly sampled from/representative of a larger population;

- have a categorical predictor that is a nuisance variable (i.e. it is not of direct interest, but should be controlled for).

cf. Crawley (2002), Gelman (2005)

If you have sampled fewer than 5 levels of the grouping variable, you should strongly consider treating it as a fixed effect even if one or more of the criteria above apply.

Random effects can also be
described as predictor variables where you are interested in making inferences
about the distribution of values (i.e., the variance among the values of the
response at different levels) rather than in testing the differences of values
between particular levels. Choosing a random effect trades the ability to test
hypotheses about differences among particular levels (low vs. high nitrogen, 2001
vs. 2002 vs. 2003) for the ability to (1) quantify the variance among levels
(variability among sites, among species, etc.) and (2) generalize to levels
that were not measured in your experiment. If you treat species as a fixed
effect, you can't say anything about an unmeasured species; if you use it as a
random effect, then you can guess that an unmeasured species will have a value
equal to the population mean estimated from the species you did measure. Of
course, as with all statistical generalization, your levels (e.g. years) must
be chosen in some way that, if not random, is at least *representative* of the population you want to generalize to.

People sometimes say that random effects are “factors that you aren't interested in”. This is not always true. While it is often the case in ecological experiments (where variation among sites is usually just a nuisance), it is sometimes of great interest, for example in evolutionary studies where the variation among genotypes is the raw material for natural selection, or in demographic studies where among-year variation lowers long-term growth rates. In some cases fixed effects are also used to control for uninteresting variation, e.g. using mass as a covariate to control for effects of body size.

You will also hear that “you
can't say anything about the (predicted) value of a conditional mode.” This is
not true either – you can't formally test a null hypothesis that the
value is equal to zero, or that the values of two different levels are equal,
but it is still perfectly sensible to look at the predicted value, and even to
compute a standard error of the predicted value (e.g. see the error bars around
the conditional modes in Figure 13.1). Particularly in management contexts,
researchers may care very much about *which*
sites are particularly good or bad relative to the population average, and how much
better or worse they are than the average. Even though it's difficult to
compute formal inferential summaries such as *p*-values, you can still make common-sense statements about the
conditional modes and their uncertainties.

The Bayesian framework has a
simpler definition of random effects. Under a Bayesian approach, a fixed effect
is one where we estimate each parameter (e.g. the mean for each species within
a genus) independently (with independently specified priors), while for a
random effect the parameters for each level are modeled as being drawn from a distribution
(usually Normal); in standard statistical notation, species_mean ~ Normal(genus_mean, σ²_species).

I said above that random effects are most useful when the grouping variable has many measured levels. Conversely, random effects are generally ineffective when the grouping variable has too few levels. You usually can't use random effects when the grouping variable has fewer than 5 levels, and random-effects variance estimates are unstable with fewer than 8 levels, because you are trying to estimate a variance from a very small sample. In the classic ANOVA approach, where all of the variance estimates are derived from simple sums-of-squares calculations, random-effects calculations work as long as you have at least two samples (although their power will be very low, and sometimes you can get negative variance estimates). In the modern mixed-modeling approach, you tend to get warnings and errors from the software instead, or estimates of zero variance, but in any case the results will be unreliable (section 13.5 offers a few tricks for handling this case). Both the gopher tortoise and grouse tick examples have year as a categorical variable that would ideally be treated as random, but we treat it as fixed because there are only three years sampled: treating year as a random effect would most likely estimate the among-year variance as zero.
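The negative-variance problem with few levels is easy to reproduce by simulation. Below is a Python sketch using the classic method-of-moments (ANOVA) estimator of the among-group variance component, with invented sample sizes; when the true among-group variance is zero and there are only three groups, the estimate comes out negative a large fraction of the time:

```python
import random
import statistics

random.seed(0)

def anova_among_variance(groups):
    """Method-of-moments (classic ANOVA) estimate of the among-group
    variance component: (MS_among - MS_within) / n. Unlike a true
    variance, this difference of mean squares can be negative."""
    n = len(groups[0])  # samples per group (balanced design assumed)
    means = [statistics.mean(g) for g in groups]
    ms_among = n * statistics.variance(means)
    ms_within = statistics.mean([statistics.variance(g) for g in groups])
    return (ms_among - ms_within) / n

# Simulate data with NO true among-group variance, only 3 groups of 5:
negative = 0
for _ in range(1000):
    groups = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(3)]
    if anova_among_variance(groups) < 0:
        negative += 1

print(negative / 1000)  # a substantial fraction of estimates are negative
```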