Time Series Study of Air Pollution Health Effects in COPSAC Children

Appendix C

Within-Person Correlation - Comparing GAM Model with Binomial GEE Model

Generalized additive models (GAMs) are traditionally used in air pollution epidemiology, where readily available outcome data, such as daily hospital admissions from hospital registers or daily deaths from death registers, are associated with daily fluctuations in air pollutant levels, adjusted for weather and seasonal confounders. In these studies, the daily count of an outcome (hospital admissions, deaths, etc.) comes from large registers without personal identification or person-level socio-demographic variables. Limited by the available data, these models therefore usually assume that the counts of deaths, hospital admissions, etc. are independent from day to day.

In this study we take a different approach: our outcome comes from a small prospective cohort of COPSAC children with well-defined personal characteristics and detailed outcome data for each child. We use GAM Poisson models because the air pollution data are available only as daily averages from a single, centrally placed measurement station; we therefore summarize our outcome into daily counts of incident respiratory symptoms and assume day-to-day independence. However, we know from the COPSAC cohort that these events are not independent, and that most children who experience respiratory symptoms have recurrent events. For example, Population 1 comprises 115 children, whose follow-up yields 346 incident events in 18 months. It is mostly the same children who experience events, and this within-person correlation is ignored in the GAM model. The independence assumption in the GAM models is thus too naive here and may affect the standard errors of our estimates.

An obvious solution to this problem is to fit a mixed GAM with a random-effect term or, preferably, a GAM fitted with GEE, but such a model is not implemented and readily available in statistical software.

To get an idea of how much the standard errors estimated by the GAM model are affected by this assumption, we rearrange our data into a longitudinal format by adding a person identifier to each record, and fit a Poisson GEE model. In this model the effects of temperature and time are modeled linearly, which we know is not optimal (this is why we use the GAM model), but the within-person correlation of the outcome is accounted for by the robust GEE variance estimator. Results comparing the two models are shown in Table D.1 below. Table D.1 shows no significant difference in the estimates between the two models. Both point to the same strong effect at a lag of 4 days, and to a positive but nonsignificant accumulated effect over 5 days.
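The rearrangement into longitudinal format and the effect of a cluster-robust variance estimator can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy data are invented (four hypothetical children followed for ten days), the model has an intercept only, and the GEE uses an independence working correlation, which the text above does not specify. It is not the analysis behind Table D.1; it only shows how recurrent events in the same child widen the robust standard error relative to the naive Poisson one.

```python
import math
from collections import defaultdict

# Toy data, invented for illustration: days on which each hypothetical child
# had an incident event. Events are concentrated in child "A".
EVENT_DAYS = {"A": {0, 1, 2, 5, 6, 9}, "B": {3, 7}, "C": set(), "D": set()}
N_DAYS = 10


def to_long_format(event_days, n_days):
    """One record per child per day: (child_id, day, event indicator)."""
    return [
        (child, day, 1 if day in days else 0)
        for child, days in sorted(event_days.items())
        for day in range(n_days)
    ]


def daily_counts(records, n_days):
    """Collapse person-day records into the daily counts a time-series model sees."""
    counts = [0] * n_days
    for _, day, y in records:
        counts[day] += y
    return counts


def intercept_only_poisson(records):
    """Intercept-only Poisson model with log link.

    Returns (beta0, naive_se, robust_se). The robust SE is the sandwich
    estimator with children as clusters and an independence working
    correlation (an assumption of this sketch).
    """
    n = len(records)
    total = sum(y for _, _, y in records)
    mu = total / n                      # fitted mean per person-day
    beta0 = math.log(mu)
    bread = n * mu                      # sum of mu_i, the Fisher information
    naive_se = math.sqrt(1.0 / bread)   # assumes all records are independent
    # Sum the score contributions (y - mu) within each child, then square.
    cluster_score = defaultdict(float)
    for child, _, y in records:
        cluster_score[child] += y - mu
    meat = sum(s * s for s in cluster_score.values())
    robust_se = math.sqrt(meat) / bread
    return beta0, naive_se, robust_se


records = to_long_format(EVENT_DAYS, N_DAYS)
counts = daily_counts(records, N_DAYS)
beta0, naive_se, robust_se = intercept_only_poisson(records)
print(f"daily counts: {counts}")
print(f"beta0={beta0:.3f} naive_se={naive_se:.3f} robust_se={robust_se:.3f}")
```

In this toy example the point estimate is the same under both variance estimators, and only the standard error changes, which is the comparison the paragraph above is concerned with.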

Table D.1: Comparison of GAM and GEE Poisson models of the effect of city background PM10 in Population 1 (inner-city Copenhagen).

                         GAM Poisson Unconstrained       GEE Poisson Unconstrained
                         Distributed Lag Model           Distributed Lag Model*
                         RR      β (se)          p       RR      β (se)          p
HCØ (City Background)
n                        1.274                           31.514
Lag 0                    1.002    0.002 (0.01)   0.77    1.003    0.003 (0.00)   0.55
Lag 1                    0.995   -0.005 (0.01)   0.49    0.996   -0.004 (0.00)   0.39
Lag 2                    1.000    0.000 (0.01)   0.95    0.998   -0.002 (0.00)   0.62
Lag 3                    1.001    0.001 (0.01)   0.83    0.998   -0.002 (0.00)   0.63
Lag 4                    1.009    0.009 (0.01)   0.18    1.010    0.010 (0.00)   0.00
Lag 5                    0.996   -0.004 (0.01)   0.49    1.000   -0.000 (0.00)   0.99
6-day Moving Average
Lag Model                1.010    0.010 (0.01)   0.07    1.007    0.007 (0.01)   0.29

* In the GEE model, time and temperature were modeled linearly.
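As a reading aid for the table, the two model forms and the link between the RR and β columns can be sketched as follows. The notation is assumed for illustration and is not taken from the report: x_{t-l} denotes PM10 l days before day t, s_1 and s_2 denote smooth functions, Y_t is the daily count, Y_{it} is the outcome for child i on day t, and any further covariates are omitted. A log link is assumed for both Poisson models, under which RR = exp(β) per unit of exposure.

```latex
% Assumed model forms (illustrative notation, other covariates omitted)
\begin{align*}
\text{GAM:}\quad & \log E[Y_t] = \alpha + \sum_{l=0}^{5} \beta_l\, x_{t-l} + s_1(t) + s_2(\mathrm{temp}_t) \\
\text{GEE:}\quad & \log E[Y_{it}] = \alpha + \sum_{l=0}^{5} \beta_l\, x_{t-l} + \gamma_1 t + \gamma_2\, \mathrm{temp}_t \\
& \mathrm{RR}_l = \exp(\beta_l), \qquad \text{e.g. } \exp(0.009) \approx 1.009 \text{ (GAM, lag 4)}
\end{align*}
```

The tabulated values agree with this relation to three decimals, for example exp(-0.004) ≈ 0.996 for lag 1 in the GEE model.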

 



Version 1.0 May 2005, © Danish Environmental Protection Agency