I will be teaching a Biostatistics course in the spring. The course web page will be at
http://www.math.mcmaster.ca/bolker/classes/s756. The time and location are
Mon/Thurs 1:00-2:30, HH (Hamilton Hall) 207.
Other course info
I will focus on practical (but advanced) statistical techniques useful for population-level biology,
i.e. ecology, evolution, and infectious disease epidemiology.
The course will primarily use R.
Primary topics include:
- data manipulation and visualization
- review of generalized linear models, and extensions such as models of
overdispersion (e.g. negative binomial, lognormal-Poisson) and zero-inflation
- mixed models: classical (review of nested/split-plot/etc.); ’modern’ linear mixed
models; and generalized linear mixed models
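To give a flavor of the overdispersion topic listed above, here is a minimal sketch in base R. The data are simulated and every number (sample size, coefficients, the negative binomial size parameter) is an arbitrary assumption chosen for illustration, not course material. It simulates negative binomial counts, fits a Poisson GLM anyway, and computes the Pearson dispersion statistic, which should be close to 1 when the Poisson assumption is reasonable.

```r
## Simulated counts that are more variable than a Poisson model allows.
## All parameter values below are arbitrary, illustrative choices.
set.seed(101)
n  <- 500
x  <- runif(n)
mu <- exp(1 + 1.5 * x)                 # log-linear mean
y  <- rnbinom(n, mu = mu, size = 0.8)  # negative binomial: var = mu + mu^2/size

## Fit a (deliberately misspecified) Poisson GLM
fit_pois <- glm(y ~ x, family = poisson)

## Pearson dispersion statistic:
## sum of squared Pearson residuals divided by residual degrees of freedom
disp <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)
disp   # values well above 1 suggest overdispersion

## A quasi-Poisson fit gives the same coefficients but rescaled standard errors
fit_quasi <- glm(y ~ x, family = quasipoisson)
```

The quasi-Poisson fit is only one possible response to overdispersion; a negative binomial model (for example via glm.nb in the MASS package) is another.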
Pair or group projects will be a large component of the class.
We will use Faraway (2006) as a textbook: I haven’t gone through it in great detail, but it
covers a sensible range of topics and should be a good resource to fall back on. However, I
expect the course will go well beyond this (with primary literature, scanned book chapters, and
notes as additional resources).
I would like to encourage both mathematicians and statisticians (with weak or absent biological
knowledge) and biologists (with solid statistical knowledge: see below) to take the course; one
hoped-for benefit of the course will be learning to communicate and work across disciplinary
boundaries.
Prerequisites:
- basic knowledge of R; data structures (vector, matrix, list), simple data
manipulation and summaries, basic plotting. If you don’t already know R but
are comfortable with programming and/or statistical computation you should
be able to pick it up quickly. There are many introductory resources on
the web: http://www.math.mcmaster.ca/bolker/emdbook/lab1.pdf is a
good place to start your review.
- solid background in basic statistical concepts and procedures (hypothesis tests,
t-test, ANOVA, regression)
- some familiarity with generalized linear models. Useful references/reminders:
- Wood (2006): the first few chapters give a terse but clear and self-contained
review of linear and generalized linear models
- Venables and Ripley (2002): also clear, but even more terse
- Crawley (2002): the most biologist-friendly on this list.
- some linear algebra would be quite helpful, but I will try to help the biologists in
the course understand what’s going on at a qualitative level
Having worked with real, messy data sets is not required but will be helpful.
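As a rough gauge of the R background described in the prerequisites, the following sketch touches each item mentioned there (data structures, simple manipulation and summaries, basic plotting). It is only an illustration built on R's built-in iris data set, not an official self-test; if everything here looks familiar, your R background is probably at about the expected level.

```r
## Data structures
v <- c(2.5, 3.1, 4.7)                    # numeric vector
m <- matrix(1:6, nrow = 2)               # 2 x 3 matrix
l <- list(name = "site A", counts = v)   # list with named elements

## Simple manipulation and summaries, using the built-in 'iris' data set
big   <- subset(iris, Sepal.Length > 6)  # keep rows with sepal length > 6
means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
means                                    # mean sepal length per species

## Basic plotting: scatterplot colored by species
plot(Sepal.Width ~ Sepal.Length, data = iris,
     col = as.integer(iris$Species))
```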
I will make every effort to adapt the course to the interests and needs of those who sign up, but
here are some of the things I do not intend to cover in the course:
- basic statistics for biologists
- specialized approaches for phylogenetic inference
- specialized approaches for bioinformatics – microarray, sequence, SNP data
(although some concepts will carry over)
- the course will not be especially close to the catalog description (“classical
biostatistics”, i.e. analysis of multidimensional contingency tables; design and
analysis of clinical trials; etc.)
- analysis of time series/dynamical data (state space models etc.)
- I will probably not prove any theorems
As a general rule, in graduate courses I am primarily interested in seeing that students are
making a serious effort to engage the material; different students will learn different things in
this class depending on their background and interests.
Grading will be based on a combination of lab assignments (30%), in which you follow a
prescribed set of R code to explore a problem and then answer some related exercises;
participation (30%), both informal and ’formal’ (class presentations, submitting discussion
questions, leading discussion on readings); and a group project (40%), which will be both
written up and presented in class.
References
Crawley, M. J. (2002). Statistical Computing: An Introduction to Data Analysis using
S-PLUS. John Wiley & Sons.
Faraway, J. J. (2006). Extending Linear Models with R: Generalized Linear, Mixed Effects
and Nonparametric Regression Models. Chapman & Hall/CRC.
Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S (4th ed.). New York:
Springer.
Wood, S. N. (2006). Generalized Additive Models: An Introduction with R. Chapman &
Hall/CRC.