Use statistical procedures from a range of schools and strictly adhere to their respective methods and interpretation. For example, do a Fisherian significance test properly and interpret it properly. Then set up a formal Neymann-Pearson (sic) test and interpret it formally (this means setting both Type I and II error rates beforehand, among other things). Then do an estimation procedure. Then switch hats and do a Bayesian analysis. Take the results of all four, noting their different behavior, and come to your conclusion. Good analysis and interpretation are as important as the fieldwork, so allot adequate time and resources to both.
- Francis H. J. Crome, ``Researching Tropical Forest Fragmentation: Shall We Keep On Doing What We're Doing?'', ch. 31 in *Tropical Forest Remnants: Ecology, Management, and Conservation of Fragmented Communities* (ed. W. F. Laurance and R. O. Bierregaard, Jr.), University of Chicago Press, Chicago: 1997 (p. 501).
Note: modern "bad" statistical practice, as criticized by Yoccoz and Johnson, essentially muddles these two schools (Fisherian significance testing and Neyman-Pearson hypothesis testing).
To establish confidence limits, you calculate likelihood profiles (how the likelihood drops off as you move away from the MLE).
Example: suppose we want to test whether a coin is fair ($p(\textrm{head}) = 0.5$); we flip it 10 times and get 7 heads. The likelihood that the probability of getting a head is $p$ is proportional to $p^7 (1-p)^3$ (assuming flips are independent), and the log-likelihood is $C + 7\ln(p) + 3\ln(1-p)$.
[Figures: likelihood and log-likelihood of $p$ for the coin-flipping example]
What can we tell?
Why should we use 1.92 log-likelihood units? The likelihood ratio test says that if we allow $n$ parameters to vary, the $q$ confidence limits are (asymptotically) the $q$ quantiles (upper tails) of $\chi^2_n/2$: $\chi^2_1(0.95)/2 = 1.92$. Essentially, this is a frequentist argument sneaking back in that says how close we can expect to get to the true MLE in repeated trials.
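A minimal sketch in Python (the code and variable names are mine, not from the notes) of this calculation for the coin example: `chi2.ppf(0.95, 1)/2` reproduces the 1.92 cutoff, and a root-finder locates the profile confidence limits on either side of the MLE.

```python
# A minimal sketch (Python/scipy) of the coin example: 7 heads in 10 flips.
import numpy as np
from scipy.stats import chi2
from scipy.optimize import brentq

heads, n = 7, 10

def loglik(p):
    """Log-likelihood of head-probability p, up to the constant C."""
    return heads * np.log(p) + (n - heads) * np.log(1 - p)

p_mle = heads / n                   # analytic MLE for a binomial: 0.7
cutoff = chi2.ppf(0.95, df=1) / 2   # = 1.92 log-likelihood units

# Profile confidence limits: where the log-likelihood has dropped 1.92
# units below its maximum, found by root-finding on each side of the MLE.
drop = lambda p: loglik(p) - (loglik(p_mle) - cutoff)
lower = brentq(drop, 1e-6, p_mle)
upper = brentq(drop, p_mle, 1 - 1e-6)
print(p_mle, (lower, upper))        # roughly (0.39, 0.91)
```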
Example: do two populations have the same mean?
Suppose we have two populations (assume we know they are normally distributed and that they have the same [known] variance $\sigma^2$, for simplicity). The normal density is $\propto e^{-(x-\mu)^2/(2\sigma^2)}$; therefore the joint log-likelihood of a set of values drawn independently from a normal distribution with mean $\mu$ (the sum of the individual log-likelihoods) is $\propto -\sum_i (x_i-\mu)^2$, the sum of squares. If we assume there is really just a single distribution from which all of the values are drawn, the log-likelihood is $\propto -\sum_i (x_i - \bar{X})^2$; if we are allowed different means for the two different populations we get $\propto -\sum_{i \in 1} (x_i - \bar{X}_1)^2 - \sum_{i \in 2}(x_i-\bar{X}_2)^2$. The likelihood for the second case can't be worse than the first (since we could always set $\bar{X}_1 = \bar{X}_2 = \bar{X}$). If it is more than 1.92 greater than the first, we can conclude that the two populations have ``significantly'' different means. We can also find confidence limits, etc.
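A sketch of this comparison with simulated data (the sample sizes, true means, and $\sigma^2 = 1$ below are assumptions for illustration, not from the notes):

```python
# Two-population comparison via log-likelihoods, known variance sigma^2 = 1.
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 1.0
x1 = rng.normal(0.0, np.sqrt(sigma2), 20)  # population 1 (simulated)
x2 = rng.normal(1.0, np.sqrt(sigma2), 20)  # population 2 (simulated)

def loglik(x, mu):
    """Normal log-likelihood of x given mean mu, up to an additive constant."""
    return -np.sum((x - mu) ** 2) / (2 * sigma2)

pooled = np.concatenate([x1, x2])
ll_one = loglik(pooled, pooled.mean())                  # one shared mean
ll_two = loglik(x1, x1.mean()) + loglik(x2, x2.mean())  # separate means

# One extra free parameter, so the cutoff is chi^2_1(0.95)/2 = 1.92:
print(ll_two - ll_one, ll_two - ll_one > 1.92)  # likely True for these data
```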
Points about likelihood:
Remember, what we would really like to know is P(H0|data), the probability of our hypothesis given the data we've observed.
Use Bayes' Rule (see below) to get probabilities on hypotheses and parameter values. In frequentist statistics, the underlying hypotheses/parameters/models are true and the data come from a probability distribution; in Bayesian statistics the data are true and the underlying hypotheses etc. have probability distributions.
(Bayesian decision analysis: associate costs with different outcomes, maximize expected return.)
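A minimal sketch of what that means in code, with made-up posterior probabilities and costs (minimizing expected cost is equivalent to maximizing expected return):

```python
# Bayesian decision analysis with hypothetical numbers: a cost for each
# (action, hypothesis) pair, posterior probabilities for the hypotheses;
# choose the action with the lowest expected cost.
import numpy as np

posterior = np.array([0.1, 0.9])   # P(H0|data), P(H1|data) (hypothetical)
cost = np.array([[0.0, 10.0],      # action A: cost under H0, under H1
                 [2.0,  0.0]])     # action B: cost under H0, under H1

expected = cost @ posterior        # expected cost of each action
print(expected, "-> choose action", "AB"[expected.argmin()])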
Bayesian analysis
|  | Fisher | N-P | Likelihood | Information | Bayes |
| --- | --- | --- | --- | --- | --- |
| depends (at least conceptually) on replicated outcomes | yes | yes | yes | no | no |
| outcome depends on sampling rules (violates likelihood principle) | yes | yes | no | no | no |
| gives decision rules | no | yes | no | (no) | yes |
| requires alternative hypotheses to be specified | no | yes | yes | yes | yes |
| intuitive probability interpretation | no | no | yes | no | yes |
| subjective | no | no | no | no | yes |
| requires specified priors | no | no | no | no | yes |
| allows integrating previous results | no | no | no | no | yes |
How can we figure out the probability of the hypothesis, P(H0|data) from the likelihood, P(data|H0)? (The likelihood P(data|H0) is a probability of data, not hypotheses: the likelihoods for all our candidate hypotheses don't even add up to 1, as they would if they were probabilities of different hypotheses ...)
How? Bayes' Rule.
$$P(H_0 \mid \textrm{data}) = \frac{P(\textrm{data} \mid H_0)\, P(H_0)}{\sum_j P(\textrm{data} \mid H_j)\, P(H_j)} \qquad (1)$$
In words: multiply the likelihood P(data|H0) by the prior probability P(H0), and divide by the sum of (likelihood × prior) for all candidate hypotheses.
False-positive/testing example
My favorite example (which I will probably not have time to present in class): suppose the probability of having some deadly but rare disease (D) is $10^{-4}$. There is a test for this disease which has no false negatives: if you have the disease, you will test positive ($P(+ \mid D) = 1$). However, there are occasional false positives; 1 person in 1000 who doesn't have the disease will test positive anyway ($P(+ \mid \textrm{not } D) = 10^{-3}$). We want to know the probability that someone who tests positive is actually ill.
Using Bayes' Rule,

$$P(D \mid +) = \frac{P(+ \mid D)\, P(D)}{P(+)}. \qquad (2)$$

We know $P(+ \mid D)$ ($= 1$) and $P(D)$ ($= 10^{-4}$), but we have to figure out $P(+)$, the overall probability of testing positive. You can test positive if you are really diseased or if you are really healthy; according to the rule that if A and B are mutually exclusive (you can't be both ill and not ill) $P(A \textrm{ or } B) = P(A) + P(B)$, we can say

$$P(+) = P(D \textrm{ and } +) + P(\textrm{not } D \textrm{ and } +). \qquad (3)$$

Then, by the rule that $P(A \textrm{ and } B) = P(A)\, P(B \mid A)$,

$$P(+) = P(D)\, P(+ \mid D) + (1 - P(D))\, P(+ \mid \textrm{not } D). \qquad (4)$$

Putting it all together,

$$P(D \mid +) = \frac{P(+ \mid D)\, P(D)}{P(D)\, P(+ \mid D) + (1 - P(D))\, P(+ \mid \textrm{not } D)} = \frac{1 \times 10^{-4}}{1 \times 10^{-4} + (1 - 10^{-4}) \times 10^{-3}} \approx \frac{10^{-4}}{10^{-3}} = \frac{1}{10}.$$

Even though false positives are rare, the chance of being ill if you test positive is still only about 10%!
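The same computation as a few lines of Python, transcribing equations (2) and (4) directly:

```python
# The disease-testing calculation, a direct transcription of Bayes' Rule.
p_D        = 1e-4   # P(D): prior probability of having the disease
p_pos_D    = 1.0    # P(+|D): no false negatives
p_pos_notD = 1e-3   # P(+|not D): false-positive rate

p_pos   = p_D * p_pos_D + (1 - p_D) * p_pos_notD   # eq. (4)
p_D_pos = p_pos_D * p_D / p_pos                    # eq. (2)
print(p_D_pos)  # ~0.091: about a 1-in-10 chance of actually being ill
```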
Priors: if P(H0) is constant, Bayes' Rule reduces to the likelihood rule (except that we can now say something about probability). But what does a "flat" prior mean? (A prior that is flat on one scale is not flat after a change of scale, or after subdividing hypotheses.)
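A quick numerical illustration of the scale-change problem (my own example, not from the notes): draws from a prior that is flat on $p$ are not flat on the log-odds scale.

```python
# "Flat" depends on scale: draws from a uniform prior on p, transformed
# to the log-odds scale, pile up around 0 rather than being flat.
import numpy as np

rng = np.random.default_rng(1)
p = rng.uniform(0.0, 1.0, 100_000)   # flat prior on p
log_odds = np.log(p / (1 - p))       # the same draws on the logit scale

hist, _ = np.histogram(log_odds, bins=np.arange(-5, 5.5, 0.5), density=True)
print(np.round(hist, 2))   # densities rise toward the middle: not flat
```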
How do you do this?
(Continue with previous examples: coin-flipping and testing two populations.)
Coin-flipping:
[Figures: prior and posterior distributions for the Bayesian coin-flipping example]
Two-population example:
Credible intervals (symmetric, contains 95% of probability).
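As a sketch of the coin-flipping case, assuming a flat Beta(1,1) prior (an assumption; the notes don't specify the prior): the posterior for 7 heads out of 10 flips is Beta(1+7, 1+3), and the symmetric 95% credible interval is its 2.5% to 97.5% quantile range.

```python
# Bayesian coin-flipping with an assumed flat Beta(1,1) prior:
# 7 heads out of 10 flips gives a Beta(1+7, 1+3) posterior.
from scipy.stats import beta

posterior = beta(1 + 7, 1 + 3)
lower, upper = posterior.ppf([0.025, 0.975])  # symmetric-tail 95% interval
print(posterior.mean(), (lower, upper))       # posterior mean 2/3
```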
Issues:
Final conclusions: