# Pollutantmean Assignment Discovery

### Study design and study population

A series of air pollution control measures were implemented from July 20 to September 17, 2008, encompassing the Olympic Games (August 8–24) through the end of the Paralympic Games (September 6–17). These control measures created the opportunity for a study design with ‘high-low-high’ pollution levels. Our study included three periods: (1) the pre-Olympic period (June 2–July 20) when light air pollution control measures were implemented, (2) the during-Olympic period (July 21–September 20) when industrial and commercial combustion facility operation and vehicle use were strictly controlled, and (3) the post-Olympic period (September 21–October 30) when the pollution control measures were relaxed [6], [7]. This panel study of air pollution and biomarkers of cardio-respiratory pathology was performed on the campus of Peking University First Hospital, Beijing (Latitude: 39.9272, Longitude: 116.3722).

We enrolled 125 young adult never-smokers who were free of cardio-respiratory, liver, kidney, neurologic, and other chronic diseases. Most study participants were medical residents working at the hospital, and all participants lived within 9 km of the hospital. Participants were invited for clinical visits (between 8 AM and 10 AM) twice in each of the pre-, during-, and post-Olympic periods; the two visits within a period were designed to be two weeks apart and on the same day of the week. Participants were required to fast overnight before each clinical visit, to refrain from taking any medications, working night shifts, or travelling, and to be free of symptoms of respiratory infection or allergies within the seven days prior to each clinical visit. The study population and data collection methods have been described in detail in previous publications [5]–[8].

This study was approved by the University of Medicine and Dentistry of New Jersey institutional review board and the joint Ethics Committee of the Peking University Health Sciences Center and the Peking University First Hospital. All participants provided written informed consent before participating in the study.

### Air pollution measurement

Air pollutants were monitored throughout all three study periods (June 2–October 30, 2008). During these periods, we measured ambient concentrations of sulfur dioxide (SO_{2}), nitrogen dioxide (NO_{2}), ozone (O_{3}), carbon monoxide (CO), fine particulate matter (PM_{2.5}) and its constituents, elemental carbon (EC), organic carbon (OC), and sulfate (SO_{4}^{2−}); temperature and relative humidity (RH) were also recorded. Measurements were conducted on the roof of a seven-story building (∼20 meters above the ground) in the center of the hospital campus. For the seven days before each biological sample collection, we calculated average pollutant concentrations by lag, defined by the number of hours before sample collection (0–23 hours = lag 0, etc.). We excluded O_{3} from these analyses because of its strong negative correlation with other pollutants, noted in our prior publications [5], [6], [8]. Additional description is provided in Supporting Information (S1 Appendix).
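The lag definition above can be illustrated with a minimal sketch. It assumes hourly readings keyed by timestamp and averages them in consecutive 24-hour windows counted back from the sample collection time. The function name `lagged_daily_means`, the data layout, and the synthetic readings are hypothetical and are not taken from the study's own code.

```python
from datetime import datetime, timedelta
from statistics import mean


def lagged_daily_means(hourly, sample_time, n_lags=7):
    """Average hourly readings in consecutive 24-hour windows before sample_time.

    hourly: dict mapping a datetime (one per hour) to a concentration.
    Lag 0 covers 0-23 hours before sampling, lag 1 covers 24-47 hours, and so on.
    A window with no readings yields None.
    """
    means = []
    for lag in range(n_lags):
        # Timestamps that fall 24*lag .. 24*lag+23 hours before the sample.
        times = [sample_time - timedelta(hours=h)
                 for h in range(24 * lag, 24 * (lag + 1))]
        window = [hourly[t] for t in times if t in hourly]
        means.append(mean(window) if window else None)
    return means


# Synthetic example (assumed data): each reading equals the number of
# whole days before the visit, so the lag means should be 0, 1, ..., 6.
visit = datetime(2008, 8, 12, 9, 0)
readings = {visit - timedelta(hours=h): float(h // 24) for h in range(24 * 7)}
lag_means = lagged_daily_means(readings, visit)
print(lag_means)
```

Missing hours are simply skipped here; a real analysis would more likely require a minimum number of valid hours per window before reporting a mean.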

### Biomarker measurements

We grouped the assessed biomarkers into 4 *a priori* candidate physiological pathways, including pulmonary inflammation and oxidative stress, autonomic function, hemostasis, and systemic inflammation and oxidative stress, based on biological activity and previous literature [5], [7]. We grouped inflammatory and oxidative stress biomarkers together because oxidative stress is often induced by and elicits inflammatory processes [1].

*Pulmonary inflammation and oxidative stress* were assessed using fractional exhaled nitric oxide (FeNO) and exhaled breath condensate (EBC) biomarkers, including pH value, nitrite, and malondialdehyde (MDA).

*Autonomic function* was assessed by systolic blood pressure (SBP), diastolic blood pressure (DBP), heart rate and heart rate variability (HRV), including standard deviation of normal R-R intervals (SDNN), root mean square of successive differences between adjacent normal cycles (rMSSD), low frequency (LF) power, high-frequency (HF) power, very low frequency (VLF) power, ratio of LF to HF, and total power.

*Hemostasis* markers included soluble P-selectin (sCD62P), CD40 Ligand (sCD40L), and von Willebrand Factor (VWF).

*Systemic inflammation and oxidative stress* markers included fibrinogen, red blood cells (RBC), white blood cells (WBC), and C-reactive protein (CRP) in plasma, as well as MDA and 8-Hydroxy-2′-deoxyguanosine (8-OHdG) in urine. CRP was excluded from these analyses because of the large proportion of non-detects (∼53%). Urinary concentrations of 8-OHdG and MDA were normalized by creatinine concentrations.

Additional description is provided in Supporting Information (S2 Appendix).

### Statistical analysis

Exploratory univariate and bivariate analyses were conducted to identify outliers and potential confounders of the relationships between biomarkers and pollutants. Values of EBC pH were multiplied by −1 so that higher levels would be considered a worse health condition for all biomarkers. Each biomarker and air pollutant level was internally standardized by [(*x*_{i} − mean_{x})/SD_{x}], where *x*_{i} is each individual observation of a biomarker or pollutant, and mean_{x} and SD_{x} are the grand mean and grand standard deviation of this biomarker or pollutant. We then developed and applied a two-stage statistical analysis (Fig. 1).
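The standardization step can be sketched as follows. This is a minimal illustration under two assumptions that the text does not state: the standard deviation is the sample (n − 1) form, and the sign flip for EBC pH is applied before standardizing. The function name `standardize` and the example values are hypothetical.

```python
from statistics import mean, stdev


def standardize(values, flip_sign=False):
    """Return (x_i - grand mean) / grand SD for each observation.

    flip_sign=True multiplies values by -1 first, as described for EBC pH,
    so that higher standardized values indicate a worse health condition.
    Assumption: stdev() is the sample (n - 1) standard deviation.
    """
    if flip_sign:
        values = [-v for v in values]
    grand_mean = mean(values)
    grand_sd = stdev(values)
    return [(v - grand_mean) / grand_sd for v in values]


# Hypothetical observations, for illustration only.
z_plain = standardize([1.0, 2.0, 3.0])
z_ph = standardize([7.0, 7.5, 8.0], flip_sign=True)
print(z_plain, z_ph)
```

After the sign flip, the lowest pH in the example receives the highest standardized score, which matches the stated intent that higher values represent a worse condition.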

In Stage I, we used mixed-effects models (Eq. 1) to estimate the association coefficient (*β*_{bpl}) of a specific biomarker (*b*) with a specific pollutant (*p*) at a specific lag day (*l*):

*Y*_{bit} = *α*_{b} + *β*_{bpl} *X*_{plt} + … + *u*_{bi} + *ε*_{bit} (Eq. 1)

where *Y*_{bit} denotes the standardized value of biomarker *b* for participant *i* at visit *t*, *α*_{b} is the grand mean of biomarker *b*, *β*_{bpl} denotes the association coefficient of biomarker *b* with pollutant *p* at lag *l*, *X*_{plt} is the standardized concentration of pollutant *p* at lag *l* of visit *t*, *u*_{bi} is a participant-level random intercept, and *ε*_{bit} denotes the random error of the standardized concentration of biomarker *b* for participant *i* at visit *t*.

In these models, we adjusted for the following potential confounders (represented by ‘…’ in Eq. 1): sex, indicators of day of week, and smooth functions of temperature and relative humidity. We also included participant-level random intercepts to account for repeated measurements on participants. Stage I model selection has been explained in detail previously [5], [6], [8]. Because the biomarkers and lagged pollutants were standardized, the Stage I coefficients *β*_{bpl} have similar interpretations, which facilitates comparison in Stage II. For any biomarker-pollutant-lag combination, *β*_{bpl} represents the difference in standardized biomarker *b* associated with a one standard deviation (SD) increase in pollutant *p* at lag *l*.
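Because the coefficients are expressed in SD units, converting one back to original units is simple arithmetic. The sketch below shows that conversion under stated assumptions: the coefficient value and the biomarker SD are hypothetical, and the only figure taken from the source is the PM_{2.5} standard deviation of 51.9 µg/m^{3} quoted in the Fig. 2 caption.

```python
def per_increment_change(beta_std, sd_biomarker, sd_pollutant, increment):
    """Convert a standardized coefficient to biomarker units per pollutant increment.

    beta_std is the change in biomarker SDs per one pollutant SD, so the change
    in original biomarker units per `increment` of pollutant is
    beta_std * sd_biomarker * (increment / sd_pollutant).
    """
    return beta_std * sd_biomarker * (increment / sd_pollutant)


PM25_SD = 51.9  # µg/m^3, as reported in the Fig. 2 caption

# Hypothetical inputs: a coefficient of 0.10 and a biomarker SD of 5.0 units.
change_per_sd = per_increment_change(0.10, 5.0, PM25_SD, PM25_SD)
change_per_10 = per_increment_change(0.10, 5.0, PM25_SD, 10.0)
print(change_per_sd, change_per_10)
```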

Stage II models were developed to explain variation in the temporally resolved biomarker-specific effects of each pollutant. Our statistical approach is an extension of repeated measures ANOVA. Specifically, Stage II consisted of a single linear mixed-effects model (Eq. 2) for the Stage I estimates of *β*_{bpl}, with inverse variance weighting to account for the wide range of standard errors of these estimates (0.013 to 0.093).
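Inverse variance weighting gives each estimate a weight proportional to one over its squared standard error. The sketch below illustrates only that weighting principle with a simple weighted mean; it is not the Stage II mixed-effects model, and the function name and example estimates are hypothetical.

```python
def inverse_variance_mean(estimates, std_errors):
    """Pool estimates with weights 1 / SE^2; return (pooled mean, pooled SE)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = (1.0 / total) ** 0.5
    return pooled, pooled_se


# Relative weight of the most precise versus the least precise estimate,
# using the standard error range quoted in the text (0.013 to 0.093).
weight_ratio = (0.093 / 0.013) ** 2

# Hypothetical estimates with equal standard errors: the pooled value is
# then the plain average.
pooled, pooled_se = inverse_variance_mean([0.1, 0.3], [0.1, 0.1])
print(round(weight_ratio, 1), pooled, pooled_se)
```

With the quoted range of standard errors, the most precise estimate carries roughly fifty times the weight of the least precise one, which is why unweighted pooling would be a poor choice here.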

In Eq. 2, differences in the mean association estimate at lag 0 across pollutants are quantified by pollutant-specific terms, and differences at lag 0 across pathways are quantified by pathway-specific terms indexed by *w*(*b*), the pathway to which biomarker *b* is assigned. For identifiability, the terms for a reference pollutant (here PM_{2.5}, denoted pollutant *p* = 1) and a reference pathway (here systemic inflammation/oxidative stress, denoted pathway *w* = 1) are both set to zero, so that *µ*_{0} quantifies the mean estimate at lag 0 in the reference pathway and reference pollutant. Additional differences in the mean at lag 0 due to interactions between pathways and pollutants are quantified by *θ*_{0wp}, with similar identifiability constraints for the reference pathway and pollutant. The model includes three biomarker-level random effects. We specified an unstructured covariance matrix for the random effects and a first-order autoregressive covariance matrix (AR-1) for the residuals as a function of lag, to account for possible autocorrelation of estimates from the same biomarker-pollutant combination across different lags.

Rather than assume a linear effect of lag on the association estimates, we used a piecewise linear spline with a change point (knot) in the middle of the 7-day period, at lag 3. This was a natural rather than data-driven choice for the change point, and this approach offered a simple and reasonable representation of the patterns of association observed in Fig. 2. The spline is represented using two sets of terms: the first lag variable takes values 0, 1, 2, 3, 4, 5, 6 and the second takes values 0, 0, 0, 0, 1, 2, 3 (the number of days beyond lag 3). This relatively simple structure allowed us to investigate general patterns in the associations of biomarkers with each pollutant over 7 days, borrowing strength across biomarkers in the same pathway. For example, for ‘average’ biomarkers (those whose three random effects all equal zero) in the reference pathway (systemic inflammation/oxidative stress), the mean estimate at lag 0 for the reference pollutant (PM_{2.5}) is *µ*_{0}, the daily rate of change in the mean from lag 0 to lag 3 is the coefficient on the first lag variable, and the daily rate of change from lag 3 to lag 6 is the sum of the coefficients on the two lag variables, so that the coefficient on the second lag variable quantifies the difference in slopes between lags 0–3 and lags 3–6. In a sensitivity analysis, we compared the AIC of our final model to that of otherwise identical models that used change points of 2 and 4 (1 and 5 were considered too close to the endpoints to be meaningful) and found that a change point of 3 minimized AIC.
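The two lag variables of the piecewise linear spline can be generated with a few lines of code. This is a minimal sketch: the function and parameter names are hypothetical, and the intercept and slope values in the example are invented purely to show how the two slopes combine after the knot.

```python
def spline_basis(lag, knot=3):
    """Return the two spline variables: the lag itself and days beyond the knot."""
    return lag, max(lag - knot, 0)


def piecewise_mean(lag, intercept, slope_before, slope_change, knot=3):
    """Mean at a given lag under a piecewise linear spline with one knot.

    The slope is slope_before up to the knot and
    slope_before + slope_change after it.
    """
    first, second = spline_basis(lag, knot)
    return intercept + slope_before * first + slope_change * second


first_var = [spline_basis(lag)[0] for lag in range(7)]
second_var = [spline_basis(lag)[1] for lag in range(7)]
print(first_var)
print(second_var)

# Hypothetical coefficients: rising by 0.10 per day to lag 3, then
# falling by 0.20 per day (0.10 + (-0.30)) afterwards.
curve = [piecewise_mean(lag, 0.0, 0.10, -0.30) for lag in range(7)]
print([round(v, 2) for v in curve])
```

Trying a different change point only requires passing another `knot` value, which is how the alternative models in the sensitivity analysis would differ from the final one.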

Fig. 2. Associations between PM_{2.5} concentrations and standardized biomarkers in each pathway, from Stage I models. Error bars represent 95% confidence intervals. Effect sizes are scaled to a 1 standard deviation change in PM_{2.5} (51.9 µg/m^{3}).

https://doi.org/10.1371/journal.pone.0114913.g002

Overall, the grouping of biomarkers into physiological pathways allowed us to quantify and evaluate: (a) differences in associations at lag 0 across pathways and across pollutants, (b) whether pathway-level associations at lag 0 varied by pollutant (*θ*_{0wp}), and (c) pollutant-specific, pathway-level temporal patterns of association (the two spline slope terms, for the reference pathway and pollutant). For model selection, we evaluated evidence for a more complex model versus a more parsimonious model using likelihood ratio tests. We obtained predictions of the association estimates from the Stage II model using empirical Bayes predictions of the biomarker-level random effects.
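The comparison of nested models by likelihood ratio test, together with AIC, can be sketched as follows. The log-likelihood values and parameter counts are hypothetical, and the chi-square tail probability is computed from the standard recurrence for the upper incomplete gamma function so that the example needs only the standard library. It assumes both models were fitted by maximum likelihood to the same data.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-square distribution with integer df >= 1."""
    z = x / 2.0
    # Base cases: df = 1 uses erfc, df = 2 uses exp; then step df up by 2.
    a = 0.5 if df % 2 else 1.0
    q = math.erfc(math.sqrt(z)) if df % 2 else math.exp(-z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, df_diff):
    """Return (statistic, p-value) for nested models differing by df_diff parameters."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df_diff)


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2.0 * n_params - 2.0 * loglik


# Hypothetical fits: the fuller model adds two parameters.
stat, p_value = likelihood_ratio_test(-100.0, -97.0, 2)
print(stat, round(p_value, 4), aic(-100.0, 3), aic(-97.0, 5))
```

When the extra parameters are variance components tested at the boundary of their parameter space, the plain chi-square reference distribution tends to be conservative, so this sketch fits fixed-effect comparisons best.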

Stage I statistical analyses were performed using the R Programming Language (Version 2.12.2; R Development Core Team) and Stage II analyses were performed using SAS (Version 9.3).
