Reduced-Rank Spatio-Temporal Modeling of Air Pollution Concentrations in the Multi-Ethnic Study of Atherosclerosis and Air Pollution.
ABSTRACT: There is growing evidence in the epidemiologic literature of the relationship between air pollution and adverse health outcomes. Prediction of individual air pollution exposure in the Environmental Protection Agency (EPA) funded Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) relies on a flexible spatio-temporal prediction model that integrates land-use regression with kriging to account for spatial dependence in pollutant concentrations. Temporal variability is captured using temporal trends estimated via modified singular value decomposition and temporally varying spatial residuals. This model utilizes monitoring data from existing regulatory networks and supplementary MESA Air monitoring data to predict concentrations for individual cohort members. In general, spatio-temporal models are limited in their efficacy for large data sets due to computational intractability. We develop reduced-rank versions of the MESA Air spatio-temporal model. To do so, we apply low-rank kriging to account for spatial variation in the mean process and discuss the limitations of this approach. As an alternative, we represent spatial variation using thin plate regression splines (TPRS). We compare the performance of the outlined models using EPA and MESA Air monitoring data for predicting concentrations of oxides of nitrogen (NOx), a pollutant of primary interest in MESA Air, in the Los Angeles metropolitan area via cross-validated R2. Our findings suggest that use of reduced-rank models can improve computational efficiency in certain cases. Low-rank kriging and thin plate regression splines were competitive across the formulations considered, although TPRS appeared to be more robust in some settings.
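As a reading aid, the model structure described in this abstract can be written schematically as below. This is only a sketch: the symbols (y, f_i, beta_i, nu, X, alpha_i, b_k, u_ik, m, K) are notation assumed here for illustration and do not appear in the abstract, and the exact formulation in the paper may differ.

```latex
\begin{align}
  % Concentration at location s and time t: smooth temporal trends f_i(t)
  % (estimated by a modified SVD) with spatially varying coefficients,
  % plus a temporally varying spatial residual.
  y(s,t) &= \sum_{i=0}^{m} \beta_i(s)\, f_i(t) + \nu(s,t) \\
  % Reduced-rank idea (assumed form): each coefficient field combines
  % land-use regression covariates X(s) with a small number K of spatial
  % basis functions b_k(s) (low-rank kriging or TPRS) instead of a
  % full-rank kriging field.
  \beta_i(s) &= X(s)^{\top} \alpha_i + \sum_{k=1}^{K} b_k(s)\, u_{ik}
\end{align}
```

With K much smaller than the number of monitoring locations, the basis expansion reduces the size of the matrices that must be factorized, which is the usual source of the computational savings in reduced-rank spatial models.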
Project description:Studies estimating health effects of long-term air pollution exposure often use a two-stage approach: building exposure models to assign individual-level exposures, which are then used in regression analyses. This requires accurate exposure modeling and careful treatment of exposure measurement error. To illustrate the importance of accounting for exposure model characteristics in two-stage air pollution studies, we considered a case study based on data from the Multi-Ethnic Study of Atherosclerosis (MESA). We built national spatial exposure models that used partial least squares and universal kriging to estimate annual average concentrations of four PM2.5 components: elemental carbon (EC), organic carbon (OC), silicon (Si), and sulfur (S). We predicted PM2.5 component exposures for the MESA cohort and estimated cross-sectional associations with carotid intima-media thickness (CIMT), adjusting for subject-specific covariates. We corrected for measurement error using recently developed methods that account for the spatial structure of predicted exposures. Our models performed well, with cross-validated R2 values ranging from 0.62 to 0.95. Naïve analyses that did not account for measurement error indicated statistically significant associations between CIMT and exposure to OC, Si, and S. EC and OC exhibited little spatial correlation, and the corrected inference was unchanged from the naïve analysis. The Si and S exposure surfaces displayed notable spatial correlation, resulting in corrected confidence intervals (CIs) that were 50% wider than the naïve CIs, but that were still statistically significant. The impact of correcting for measurement error on health effect inference is concordant with the degree of spatial correlation in the exposure surfaces. Exposure model characteristics must be considered when performing two-stage air pollution epidemiologic analyses because naïve health effect inference may be inappropriate.
Project description:In recent years, with rapid industrialization and massive energy consumption, ground-level ozone (O3) has become one of the most severe air pollutants. In this paper, we propose a functional spatio-temporal statistical model to analyze air quality data. First, because pollutant data from a monitoring network usually have strong spatial and temporal correlation, a spatio-temporal statistical model is a reasonable way to reveal the spatial correlation structure and temporal dynamics in the data. Second, covariate effects are introduced to explore the formation mechanism of ozone pollution. Third, given the pronounced diurnal pattern of ozone data, we explore the diurnal cycle of O3 pollution using a functional data analysis approach. The spatio-temporal model shows strong potential for application in comparison with other models. Applying the model to O3 pollution data from 36 stations in Beijing, China, we explain the effects of covariates, such as other pollutants and meteorological variables, on ozone pollution, and we discuss the diurnal cycle of ozone pollution.
Project description:A hybrid approach is proposed to estimate exposure to fine particulate matter (PM2.5) at a given location and time. This approach builds on satellite-based aerosol optical depth (AOD), air pollution data from sparsely distributed Environmental Protection Agency (EPA) sites and local time-space Kriging, an optimal interpolation technique. Given the daily global coverage of AOD data, we can develop daily estimates of air quality at any given location and time. This can assure unprecedented spatial coverage, needed for air quality surveillance and management and epidemiological studies. In this paper, we developed an empirical relationship between the 2 km AOD and PM2.5 data from EPA sites. Extrapolating this relationship to the study domain resulted in 2.3 million predictions of PM2.5 between 2000 and 2009 in Cleveland Metropolitan Statistical Area (MSA). We have developed local time-space Kriging to compute exposure at a given location and time using the predicted PM2.5. Daily estimates of PM2.5 were developed for Cleveland MSA between 2000 and 2009 at 2.5 km spatial resolution; 1.7 million (~79.8%) of 2.13 million predictions required for multiyear and geographic domain were robust. In the epidemiological application of the hybrid approach, admissions for an acute exacerbation of chronic obstructive pulmonary disease (AECOPD) were examined with respect to time-space lagged PM2.5 exposure. Our analysis suggests that the risk of AECOPD increases 2.3% with a unit increase in PM2.5 exposure within 9 days and 0.05° (~5 km) distance lags. In the aggregated analysis, the exposed groups (who experienced exposure to PM2.5 >15.4 µg/m3) were 54% more likely to be admitted for AECOPD than the reference group. The hybrid approach offers greater spatiotemporal coverage and more reliable characterization of ambient concentrations than conventional in situ monitoring-based approaches. Thus, this approach can potentially reduce exposure misclassification errors in conventional air pollution epidemiology studies.
Project description:Childhood asthma morbidity has been associated with short-term air pollution exposure. To date, most investigations have used time-series models, and it is not well understood how exposure misclassification arising from unmeasured spatial variation may impact epidemiological effect estimates. Here, we develop case-crossover models integrating temporal and spatial individual-level exposure information, toward reducing exposure misclassification in estimating associations between air pollution and child asthma exacerbations in New York City (NYC). Air pollution data included: (a) highly spatially-resolved intra-urban concentration surfaces for ozone and co-pollutants (nitrogen dioxide and fine particulate matter) from the New York City Community Air Survey (NYCCAS), and (b) daily regulatory monitoring data. Case data included citywide hospital records for years 2005-2011 warm-season (June-August) asthma hospitalizations (n=2353) and Emergency Department (ED) visits (n=11,719) among children aged 5-17 years. Case residential locations were geocoded using a multi-step process to maximize positional accuracy and precision in near-residence exposure estimates. We used conditional logistic regression to model associations between ozone and child asthma exacerbations for lag days 0-6, adjusting for co-pollutant and temperature exposures. To evaluate the effect of increased exposure specificity through spatial air pollution information, we sequentially incorporated spatial variation into daily exposure estimates for ozone, temperature, and co-pollutants. Percent excess risk estimates per 10 ppb ozone exposure in spatio-temporal models were significant on lag days 1 through 5, ranging from 6.5 (95% CI: 0.2-13.1) to 13.0 (6.0-20.6) for inpatient hospitalizations, and from 2.9 (95% CI: 0.1-5.7) to 9.4 (6.3-12.7) for ED visits, with strongest associations consistently observed on lag day 2. Spatio-temporal excess risk estimates were consistently but not statistically significantly higher than temporal-only estimates on lag days 0-3. Incorporating case-level spatial exposure variation produced small, non-significant increases in excess risk estimates. Our modeling approach enables a refined understanding of potential measurement error in temporal-only versus spatio-temporal air pollution exposure assessments. As ozone generally varies over much larger spatial scales than that observed within NYC, further work is necessary to evaluate potential reductions in exposure misclassification for populations spanning wider geographic areas, and for other pollutants.
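For readers unfamiliar with the "percent excess risk" scale used in this abstract, the following minimal Python sketch shows the conventional conversion from a logistic-regression slope (log odds per ppb) to percent excess risk per 10 ppb. The abstract does not state the formula the authors used, so the conversion 100 * (exp(beta * increment) - 1) is an assumption based on common practice, and the function names and the numeric inputs in the usage lines are hypothetical.

```python
import math


def percent_excess_risk(beta_per_ppb, increment_ppb=10.0):
    """Percent excess risk for an exposure increment, from a log-odds slope."""
    return 100.0 * (math.exp(beta_per_ppb * increment_ppb) - 1.0)


def beta_from_excess_risk(excess_pct, increment_ppb=10.0):
    """Inverse conversion: the log-odds slope implied by a percent excess risk."""
    return math.log(1.0 + excess_pct / 100.0) / increment_ppb


def excess_risk_ci(beta_per_ppb, std_err, increment_ppb=10.0, z=1.96):
    """Wald-type 95% interval, computed on the log scale and then transformed."""
    lower = percent_excess_risk(beta_per_ppb - z * std_err, increment_ppb)
    upper = percent_excess_risk(beta_per_ppb + z * std_err, increment_ppb)
    return lower, upper


# Hypothetical inputs, chosen only to illustrate the scale of the numbers.
beta = beta_from_excess_risk(13.0)          # slope implied by 13.0% per 10 ppb
low, high = excess_risk_ci(beta, std_err=0.003)
```

Because the interval is built on the log scale, it is asymmetric around the point estimate once transformed, which matches the shape of intervals such as 13.0 (6.0-20.6) reported above.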
Project description:Epidemiologic evidence consistently links urban air pollution exposures to health, even after adjustment for potential spatial confounding by socioeconomic position (SEP), given concerns that air pollution sources may be clustered in and around lower-SEP communities. SEP, however, is often measured with less spatial and temporal resolution than are air pollution exposures (i.e., census-tract socio-demographics vs. fine-scale spatio-temporal air pollution models). Although many questions remain regarding the most appropriate, meaningful scales for the measurement and evaluation of each type of exposure, we aimed to compare associations for multiple air pollutants and social factors against cardiovascular disease (CVD) event rates, with each exposure measured at equal spatial and temporal resolution. We found that, in multivariable census-tract-level models including both types of exposures, most pollutant-CVD associations were non-significant, while most social factors retained significance. Similarly, the magnitude of association was higher for an interquartile range (IQR) difference in the social factors than in pollutant concentrations. We found that, when both were measured at equal spatial and temporal resolution, CVD was more strongly associated with social factors than with air pollutant exposures in census-tract-level analyses in New York City.
Project description:OBJECTIVES: We aim to characterize the qualities of estimation approaches for individual exposure to ambient-origin fine particulate matter (PM2.5), for use in epidemiological studies. METHODS: The analysis incorporates personal, home indoor, and home outdoor air monitoring data and spatio-temporal model predictions for 60 participants from the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air). We compared measurement-based personal PM2.5 exposure with several measured or predicted estimates of outdoor, indoor, and personal exposures. RESULTS: The mean personal 2-week exposure was 7.6 (standard deviation 3.7) µg/m3. Outdoor model predictions performed far better than outdoor concentrations estimated using a nearest-monitor approach (R = 0.63 versus R = 0.43). Incorporating infiltration indoors of ambient-derived PM2.5 provided better estimates of the measurement-based personal exposures than outdoor concentration predictions (R = 0.81 versus R = 0.63) and better scaling of estimated exposure (mean difference 0.4 versus 5.4 µg/m3 higher than measurements), suggesting there is value to collecting data regarding home infiltration. Incorporating individual-level time-location information into exposure predictions did not increase correlations with measurement-based personal exposures (R = 0.80) in our sample consisting primarily of retired persons. CONCLUSIONS: This analysis demonstrates the importance of incorporating infiltration when estimating individual exposure to ambient air pollution. Spatio-temporal models provide substantial improvement in exposure estimation over a nearest monitor approach.
Project description:Spatial modeling of air pollution exposures is widespread in air pollution epidemiology research as a way to improve exposure assessment. However, there are key sources of exposure model uncertainty when air pollution is modeled, including estimation error and model misspecification. We examine the use of predicted air pollution levels in linear health effect models under a measurement error framework. For the prediction of air pollution exposures, we consider a universal Kriging framework, which may include land-use regression terms in the mean function and a spatial covariance structure for the residuals. We derive the bias induced by estimation error and by model misspecification in the exposure model, and we find that a misspecified exposure model can induce asymptotic bias in the effect estimate of air pollution on health. We propose a new spatial simulation extrapolation (SIMEX) procedure, and we demonstrate that the procedure has good performance in correcting this asymptotic bias. We illustrate spatial SIMEX in a study of air pollution and birthweight in Massachusetts.
Project description:Low-cost urban air quality sensor networks are increasingly used to study the spatio-temporal variability in air pollutant concentrations. Recently installed low-cost urban sensors, however, are more prone to producing erroneous data, such as outliers, than conventional monitors. Commonly applied outlier detection methods are unsuitable for air pollutant measurements that have large spatial and temporal variations, as occur in urban areas. We present a novel outlier detection method based upon a spatio-temporal classification, focusing on hourly NO2 concentrations. We divide a full year's observations into 16 spatio-temporal classes, reflecting urban background vs. urban traffic stations, weekdays vs. weekends, and four periods per day. For each spatio-temporal class, we detect outliers using the mean and standard deviation of the normal distribution underlying the truncated normal distribution of the NO2 observations. Applying this method to a low-cost air quality sensor network in the city of Eindhoven, the Netherlands, we identified 0.1-0.5% of the observations as outliers. Outliers could reflect measurement errors or unusually high air pollution events. Additional evaluation using expert knowledge is needed to decide on treatment of the identified outliers. We conclude that our method is able to detect outliers while maintaining the spatio-temporal variability of air pollutant concentrations in urban areas.
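The classification step described in this abstract (2 station types x 2 day types x 4 periods per day = 16 classes) can be sketched as follows. This is a simplified illustration rather than the authors' method: it uses the plain sample mean and standard deviation of each class instead of the parameters of the normal distribution underlying a fitted truncated normal, it assumes the four daily periods are equal six-hour blocks (the abstract does not give the boundaries), and the threshold k and all function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def class_key(station_type, timestamp):
    """Map an observation to one of the 16 spatio-temporal classes."""
    is_weekend = timestamp.weekday() >= 5   # Saturday = 5, Sunday = 6
    period = timestamp.hour // 6            # assumed: four 6-hour blocks, 0..3
    return (station_type, is_weekend, period)


def flag_outliers(records, k=3.0):
    """Flag values more than k sample SDs from their class mean.

    records: list of (station_type, timestamp, no2_value) tuples.
    Returns a list of booleans aligned with records.
    """
    groups = defaultdict(list)
    for station_type, ts, value in records:
        groups[class_key(station_type, ts)].append(value)

    # Simplification: sample moments stand in for the truncated-normal fit.
    stats = {key: (mean(vals), stdev(vals))
             for key, vals in groups.items() if len(vals) >= 2}

    flags = []
    for station_type, ts, value in records:
        key = class_key(station_type, ts)
        if key not in stats:
            flags.append(False)             # too few data to judge this class
            continue
        mu, sigma = stats[key]
        flags.append(sigma > 0 and abs(value - mu) > k * sigma)
    return flags


# Hypothetical data: 30 ordinary weekday-morning readings plus one spike.
monday_morning = datetime(2024, 1, 1, 8)
example_records = [("traffic", monday_morning, 20.0 + (i % 5)) for i in range(30)]
example_records.append(("traffic", monday_morning, 200.0))
example_flags = flag_outliers(example_records)
```

Computing the statistics per class rather than over the whole network is what keeps a routinely high weekday rush-hour reading at a traffic station from being flagged merely because it is high relative to background sites at night.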
Project description:China has recently made available hourly air pollution data from over 1500 sites, including airborne particulate matter (PM), SO2, NO2, and O3. We apply Kriging interpolation to four months of data to derive pollution maps for eastern China. Consistent with prior findings, the greatest pollution occurs in the east, but significant levels are widespread across northern and central China and are not limited to major cities or geologic basins. Sources of pollution are widespread, but are particularly intense in a northeast corridor that extends from near Shanghai to north of Beijing. During our analysis period, 92% of the population of China experienced >120 hours of unhealthy air (US EPA standard), and 38% experienced average concentrations that were unhealthy. China's population-weighted average exposure to PM2.5 was 52 µg/m3. The observed air pollution is calculated to contribute to 1.6 million deaths/year in China [0.7-2.2 million deaths/year at 95% confidence], roughly 17% of all deaths in China.
Project description:BACKGROUND:Current epidemiologic studies rely on simple ozone metrics which may not appropriately capture population ozone exposure. For understanding health effects of long-term ozone exposure in population studies, it is advantageous for exposure estimation to incorporate the complex spatiotemporal pattern of ozone concentrations at fine scales. OBJECTIVE:To develop a geo-statistical exposure prediction model that predicts fine scale spatiotemporal variations of ambient ozone in six United States metropolitan regions. METHODS:We developed a modeling framework that estimates temporal trends from regulatory agency and cohort-specific monitoring data from MESA Air measurement campaigns and incorporates land use regression with universal kriging using predictor variables from a large geographic database. The cohort-specific data were measured at home and community locations. The framework was applied in estimating two-week average ozone concentrations from 1999 to 2013 in models of each of the six MESA Air metropolitan regions. RESULTS:Ozone models perform well in both spatial and temporal dimensions at the agency monitoring sites in terms of prediction accuracy. City-specific leave-one (site)-out cross-validation R2 accounting for temporal and spatial variability ranged from 0.65 to 0.88 in the six regions. For predictions at the home sites, the R2 is between 0.60 and 0.91 for cross-validation that left out 10% of home sites in turn. The predicted ozone concentrations vary substantially over space and time in all the metropolitan regions. CONCLUSION:Using the available data, our spatiotemporal models are able to accurately predict long-term ozone concentrations at fine spatial scales in multiple regions. The model predictions will allow for investigation of the long-term health effects of ambient ozone concentrations in future epidemiological studies.
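The leave-one-site-out cross-validation R2 reported in this abstract can be illustrated with the minimal Python sketch below. Two things are assumptions made here, not details from the abstract: the predictor is a simple inverse-distance-weighting stand-in (the paper uses land use regression with universal kriging), and R2 is computed as 1 - SSE/SST, which is one common definition. All function names and the toy grid are hypothetical.

```python
import math


def idw_predict(target, sites, power=2.0):
    """Inverse-distance-weighted prediction at target=(x, y).

    sites: list of (x, y, value). Stand-in for a real exposure model.
    """
    num = den = 0.0
    for x, y, value in sites:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value                    # exact hit on a training site
        w = d ** -power
        num += w * value
        den += w
    return num / den


def loso_cv_r2(sites, predict=idw_predict):
    """Leave-one-site-out cross-validated R2, defined here as 1 - SSE/SST."""
    preds = []
    for i, (x, y, _) in enumerate(sites):
        training = sites[:i] + sites[i + 1:]   # hold out site i entirely
        preds.append(predict((x, y), training))
    obs = [v for _, _, v in sites]
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, preds))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


# Hypothetical toy data: a 5 x 5 grid of sites on a smooth linear surface.
grid_sites = [(float(x), float(y), float(x + y)) for x in range(5) for y in range(5)]
grid_r2 = loso_cv_r2(grid_sites)
```

Holding out a whole site, rather than individual observations, is what makes the statistic a measure of spatial prediction skill at unmonitored locations. Note that under this definition a predictor that only returns the mean of the training sites scores slightly below zero.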