Comment from the Stata technical group
This text is a Stata-specific treatment of generalized linear mixed models,
also known as multilevel or hierarchical models. These models are "mixed" in
the sense that they allow fixed and random effects and are "generalized" in
the sense that they are appropriate not only for continuous Gaussian
responses but also for binary, count, and other types of limited dependent
variables.
Beginning with the comparatively simple random-intercept linear model without
covariates, the text develops the mixed model from first principles,
familiarizing the reader with terminology, summarizing and relating the
widely used estimating strategies, and providing historical perspective.
Once this mixed-model foundation has been established, the text smoothly
generalizes to random-intercept models with covariates and then to
random-coefficient models. The middle chapters of the text apply the
concepts defined earlier for Gaussian models to models for binary responses
(e.g., logit and probit), ordinal responses (e.g., ordered logit and ordered
probit), and count responses (e.g., Poisson). Models with multiple levels of
random variation are then considered, as well as models with crossed
(nonnested) random effects. The datasets used are real data from the
medical, social, and behavioral sciences literature, and several
thought-provoking exercises are included at the end of each chapter.
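For orientation, the random-intercept linear model without covariates mentioned above can be written as follows. This is a minimal sketch in generic notation; the symbols used here are assumptions chosen for illustration and are not quoted from the text. The random intercept and the residual are assumed to be independent.

```latex
% Response y_{ij} for measurement i within cluster j
y_{ij} = \beta + \zeta_j + \epsilon_{ij}, \qquad
\zeta_j \sim N(0, \psi), \quad \epsilon_{ij} \sim N(0, \theta)

% Intraclass correlation between two measurements in the same cluster
\rho = \frac{\psi}{\psi + \theta}
```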
The text is loaded with applications of generalized linear mixed models performed in
Stata. The authors are the developers of gllamm, a Stata program that
can fit a vast array of latent-variable models, of which the generalized linear
mixed model is a special case. With the release of version 9, Stata
introduced the xtmixed command for fitting linear (Gaussian) mixed models.
Together with the rest of the xt suite of Stata commands (e.g., xtlogit,
xtprobit), these two commands can be used to
perform comparative mixed-model analyses for various response families. The
types of models fit by these commands sometimes overlap, and when this occurs
the authors highlight the differences in syntax, data organization, and output
for the two (or more) commands that can be used to fit the same model. The
text also points out the relative strengths and weaknesses of each command
when used to fit the same model, based on issues such as computational speed,
accuracy, and available predictions and postestimation statistics. In
particular, the text makes very clear how gllamm and xtmixed relate to and
complement each other.
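As a rough illustration of the overlap described above, the sketch below fits the same random-intercept model without covariates using xtreg, xtmixed, and gllamm. The data are simulated, and the variable names (y, id, zeta), the number of clusters, and the parameter values are hypothetical, not taken from the text. It assumes a Stata release that provides rnormal() and that gllamm, which is user-written, has been installed from SSC.

```stata
* Simulate hypothetical clustered data: 50 clusters, 4 measurements each
clear
set seed 12345
set obs 50
generate id = _n                        // cluster identifier
generate zeta = rnormal(0, 2)           // random intercept, one per cluster
expand 4                                // replicate each cluster 4 times
generate y = 10 + zeta + rnormal(0, 1)  // response = mean + intercept + residual

* Maximum likelihood fit with xtreg
xtreg y, i(id) mle

* Maximum likelihood fit with xtmixed
xtmixed y || id:, mle

* Adaptive-quadrature fit with gllamm (install first: ssc install gllamm)
gllamm y, i(id) adapt
```

The three commands should give broadly similar estimates of the variance components, though each lays out and parameterizes its output differently.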
A reviewer for The American Statistician commends Rabe-Hesketh and
Skrondal for promoting the appropriate use of multilevel and longitudinal
modeling. He writes in the August 2006 issue, “All too often computer manuals
leave off ... important aspects of an analysis, but the authors have been
careful to provide a well-rounded and complete approach to model fitting and
interpretation.”
In summary, this text is the most complete and up-to-date depiction of Stata's
capacity for fitting generalized linear mixed models and an ideal
introduction for Stata users wishing to learn about this powerful
data-analysis tool.
Table of contents
Preface
1 Linear variance-components models
- 1.1 Introduction
- 1.2 How reliable are expiratory flow measurements?
- 1.3 The variance-components model
- 1.3.1 Model specification and path diagram
- 1.3.2 Error components, variance components, and reliability
- 1.3.3 Intraclass correlation
- 1.4 Modeling the Mini Wright measurements
- 1.4.1 Estimation using xtreg
- 1.4.2 Estimation using xtmixed
- 1.4.3 Estimation using gllamm
- 1.4.4 Relative and absolute agreement
- 1.5 Estimation methods
- 1.6 Assigning values to the random intercepts
- 1.6.1 Maximum likelihood estimation
- Implementation via OLS regression
- Implementation via the mean total residual
- 1.6.2 Empirical Bayes prediction
- 1.6.3 Empirical Bayes variances
- 1.7 Summary and further reading
- 1.8 Exercises
2 Linear random-intercept models
- 2.1 Introduction
- 2.2 Are tax preparers useful?
- 2.3 The longitudinal data structure
- 2.4 Panel data and correlated residuals
- 2.5 The random-intercept model
- 2.5.1 Estimation using xtreg
- 2.5.2 Estimation using xtmixed
- 2.6 Different kinds of effects in panel models
- 2.6.1 Between-taxpayer effects
- 2.6.2 Within-taxpayer effects
- 2.6.3 Relations among the estimators
- 2.7 Endogeneity and between-taxpayer effects
- 2.8 Residual diagnostics
- 2.9 Summary and further reading
- 2.10 Exercises
3 Linear random-coefficient and growth-curve models
- 3.1 Introduction
- 3.2 How effective are different schools?
- 3.3 Separate linear regressions for each school
- 3.4 The random-coefficient model
- 3.4.1 Specification and interpretation of a random-coefficient model
- 3.4.2 Estimation and prediction using xtmixed
- Estimation of random-intercept model
- Estimation of random-coefficient model
- Empirical Bayes prediction using xtmixed
- 3.4.3 Estimation and prediction using gllamm
- Estimation of random-intercept model
- Estimation of random-coefficient model
- Empirical Bayes prediction
- 3.5 How do children grow?
- 3.6 Growth-curve modeling
- 3.6.1 Observed growth trajectories
- 3.6.2 Estimation using xtmixed
- Quadratic growth model with random intercept
- Quadratic growth model with random intercept and random slope
- Including a child-level covariate
- 3.6.3 Estimation using gllamm
- Quadratic growth model with random intercept
- Quadratic growth model with random intercept and random slope
- Including a child-level covariate
- 3.7 Two-stage model formulation
- 3.7.1 Model specification
- 3.7.2 Estimation
- 3.8 Prediction of trajectories for individual children
- 3.9 Complex level-1 variation or heteroskedasticity
- 3.10 Summary and further reading
- 3.11 Exercises
4 Dichotomous or binary responses
- 4.1 Models for dichotomous responses
- 4.1.1 Generalized linear model formulation
- 4.1.2 Latent-response formulation
- Logistic regression
- Probit regression
- 4.2 Which treatment is best for toenail infection?
- 4.3 The longitudinal data structure
- 4.4 Population-averaged or marginal probabilities
- 4.5 Random-intercept logistic regression
- 4.6 Subject-specific vs. population-averaged relationships
- 4.7 Maximum likelihood estimation using adaptive quadrature
- 4.7.1 Some practical considerations
- 4.8 Empirical Bayes (EB) predictions
- 4.8.1 EB prediction of random effects
- 4.8.2 EB prediction of response probabilities
- 4.9 Other approaches to clustered dichotomous data
- 4.9.1 Conditional logistic regression
- 4.9.2 Generalized estimating equations (GEE)
- 4.10 Summary and further reading
- 4.11 Exercises
5 Ordinal responses
- 5.1 Introduction
- 5.2 Cumulative models for ordinal responses
- 5.2.1 Generalized linear model formulation
- 5.2.2 Latent-response formulation
- 5.2.3 Proportional odds
- 5.2.4 Identification
- 5.3 Are antipsychotic drugs effective for patients with schizophrenia?
- 5.4 Longitudinal data structure and graphs
- 5.4.1 The longitudinal data structure
- 5.4.2 Plotting cumulative proportions
- 5.4.3 Plotting cumulative logits and transforming the time scale
- 5.5 A proportional-odds model
- 5.5.1 Model specification
- 5.5.2 Estimation
- 5.6 A random-intercept proportional-odds model
- 5.6.1 Model specification
- 5.6.2 Estimation
- 5.7 A random-coefficient proportional-odds model
- 5.7.1 Model specification
- 5.7.2 Estimation
- 5.8 Marginal and patient-specific probabilities
- 5.8.1 Marginal probabilities
- 5.8.2 Patient-specific cumulative response probabilities
- 5.9 Do experts differ in their grading of student essays?
- 5.10 A random-intercept model with grader bias
- 5.10.1 Model specification
- 5.10.2 Estimation
- 5.11 Including grader-specific measurement error variances
- 5.11.1 Model specification
- 5.11.2 Estimation
- 5.12 Including grader-specific thresholds
- 5.12.1 Model specification
- 5.12.2 Estimation
- 5.13 Summary and further reading
- 5.14 Exercises
6 Counts
- 6.1 Introduction
- 6.2 Types of counts
- 6.3 Poisson model for counts
- 6.4 Did the German health-care reform reduce the number of doctor visits?
- 6.5 Longitudinal data structure
- 6.6 Poisson regression ignoring overdispersion and clustering
- 6.6.1 Model specification
- 6.6.2 Estimation
- 6.7 Poisson regression with overdispersion but ignoring clustering
- 6.7.1 Using a level-1 random intercept
- Model specification
- Estimation
- 6.7.2 Quasilikelihood
- Specification
- Estimation
- 6.8 Random-intercept Poisson regression
- 6.8.1 Model specification
- 6.8.2 Estimation
- 6.9 Random-coefficient Poisson regression
- 6.9.1 Model specification
- 6.9.2 Estimation
- 6.10 Other approaches to clustered counts
- 6.10.1 Conditional Poisson regression
- 6.10.2 Generalized estimating equations (GEE)
- 6.11 Which Scottish counties have a high risk of lip cancer?
- 6.12 Standardized mortality ratios
- 6.13 Random-intercept Poisson regression
- 6.13.1 Model specification
- 6.13.2 Estimation
- 6.13.3 Introducing a county-level covariate
- 6.13.4 Prediction
- 6.14 Nonparametric maximum likelihood estimation
- 6.14.1 Specification
- 6.14.2 Estimation
- 6.14.3 Prediction
- 6.15 Summary and further reading
- 6.16 Exercises
7 Higher level models and nested random effects
- 7.1 Introduction
- 7.2 Which method is best for measuring expiratory flow?
- 7.3 Two-level variance-components models
- 7.3.1 Model specification
- 7.3.2 Estimation
- 7.4 Three-level variance-components models
- 7.4.1 Model specification
- 7.4.2 Different types of intraclass correlation
- 7.4.3 Three-stage formulation
- 7.4.4 Estimation using xtmixed
- 7.4.5 Prediction using xtmixed
- 7.5 Did the Guatemalan immunization campaign work?
- 7.6 A three-level logistic random-intercept model
- 7.6.1 Model specification
- 7.6.2 Different types of intraclass correlations for the latent responses
- 7.6.3 Three-stage formulation
- 7.6.4 Estimation
- 7.6.5 Introducing a random coefficient at level 3
- 7.6.6 Prediction
- 7.7 Summary and further reading
- 7.8 Exercises
8 Crossed random effects
- 8.1 Introduction
- 8.2 How does investment depend on expected profit and capital stock?
- 8.3 A two-way error-components model
- 8.3.1 Model specification
- 8.3.2 Intraclass correlations
- 8.3.3 Estimation
- 8.3.4 Prediction
- 8.4 How much do primary and secondary schools affect attainment at age 16?
- 8.5 An additive crossed random-effects model
- 8.5.1 Specification
- 8.5.2 Estimation
- 8.6 Including a random interaction
- 8.6.1 Model specification
- 8.6.2 Intraclass correlations
- 8.6.3 Estimation
- 8.6.4 Some diagnostics
- 8.7 A trick requiring fewer random effects
- 8.8 Summary and further reading
- 8.9 Exercises
A Syntax for gllamm, eq, and gllapred
B Syntax for gllamm
C Syntax for gllapred
D Syntax for gllasim
References
Author index
Subject index