Methods

NBEC sample procurement
Brush biopsy samples of normal bronchial epithelium were obtained for research studies at the time of diagnostic bronchoscopy according to previously described methods [9,11]. Normal bronchial epithelium in lung not involved with cancer was brushed prior to biopsy of the suspected cancerous area. Samples were collected in a manner satisfying all requirements of the Institutional Review Board for the Medical University of Ohio. Each BC diagnosis and subtype identification was determined by histopathological examination in the Department of Pathology at the Medical University of Ohio. NBEC samples from a total of 49 individuals, including 24 non-BC individuals and 25 BC individuals, were evaluated in this study. The demographic characteristics of these individuals are presented in Table 1.

Transcript abundance measurement
Total RNA samples extracted from NBEC were reverse transcribed using M-MLV reverse transcriptase and oligo dT primers as previously described [9,11]. Standardized RT (StaRT)-PCR was used for transcript abundance measurement in these studies. With StaRT-PCR, an internal standard for each gene within a standardized mixture of internal standards (SMIS) is included in each PCR reaction. After amplification, products were electrophoresed on an Agilent 2100 Bioanalyzer using DNA Chips with DNA 1000 Kit reagents for visualization according to the manufacturer's protocol (Agilent Technologies Deutschland GmbH, Waldbronn, Germany).
The StaRT-PCR technology is licensed to Gene Express, Inc. (Toledo, OH). Many of the reagents are available commercially and were obtained through Gene Express, Inc. for this study. StaRT-PCR reagents for each of the measured genes that were not commercially available, including primers and SMIS, were prepared according to previously described methods [11,12]. Sequence information for the primers is provided in Table 2.
Including an internal standard within a SMIS in each measurement controls for all known sources of variation during PCR, including inhibitors in samples, and generates virtually-multiplexed transcript abundance data that are directly comparable across multiple experiments and institutions [13]. The performance characteristics of StaRT-PCR are superior to those of other commercially available quantitative PCR technologies in the areas critical to this study. With respect to these studies, the key property of a quantitative PCR method is not whether the PCR products are measured kinetically or at endpoint, but whether each measurement includes internal standards. The overall performance characteristics of StaRT-PCR, including extensive validation of the method in independent laboratories, have been presented in several recent articles and chapters [13-15]. For each gene measured in this study, the StaRT-PCR reagents had a lower detection threshold of less than 10 molecules, a linear dynamic range of more than six orders of magnitude (less than 10 to over 10^7 molecules), and a signal-to-analyte response of 100%. In addition, the presence of an internal standard controls for inter-sample variation in the presence of PCR inhibitors (which often are gene-specific) and ensures no false negatives (if the PCR fails, the internal standard PCR product is not observed and there are no data to report). False positives are eliminated through use of a control PCR reaction containing no cDNA.
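The reasoning behind the internal standard can be illustrated with a deliberately simplified toy model. It is not the StaRT-PCR quantification procedure itself; it assumes idealized exponential amplification in which the native template and the internal standard share the same per-cycle efficiency within a reaction. The function names, molecule counts, efficiencies, and cycle number below are hypothetical and chosen only for illustration.

```python
# Toy model (assumption): product = input * (1 + efficiency) ** cycles, with the
# native template and the internal standard amplifying at the same efficiency
# inside a given reaction. All numbers here are illustrative, not measured data.

def amplify(initial_molecules: float, efficiency: float, cycles: int) -> float:
    """Idealized exponential PCR amplification."""
    return initial_molecules * (1.0 + efficiency) ** cycles


def estimate_native_molecules(native_product: float,
                              standard_product: float,
                              standard_input: float) -> float:
    """Scale the native/standard product ratio by the known standard input."""
    return native_product / standard_product * standard_input


standard_input = 600.0   # known internal standard molecules (hypothetical)
true_native = 4200.0     # native template molecules to be recovered (hypothetical)
cycles = 35

estimates = []
for efficiency in (0.95, 0.60):  # uninhibited vs. inhibited reaction
    native_product = amplify(true_native, efficiency, cycles)
    standard_product = amplify(standard_input, efficiency, cycles)
    estimates.append(
        estimate_native_molecules(native_product, standard_product, standard_input)
    )
```

Under these assumptions the absolute amount of product differs greatly between the two reactions, but the ratio of native product to internal standard product, and therefore the estimate, does not.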

Statistical analysis
More than 6,000 transcript abundance measurements were conducted in multiple experiments over two years to assess the six transcription factors and sixteen antioxidant and DNA repair genes in NBEC samples from 49 individuals (24 non-BC individuals and 25 BC individuals).
Correlation of each of the six transcription factors with each of the antioxidant or DNA repair genes was determined by Pearson's correlation following logarithmic transformation. The transformation was necessary because of the wide biological variation in expression of each gene among the individuals. The significance level was defined as p < 0.01 following Bonferroni adjustment for multiple comparisons, specifically comparison of each of the six transcription factors to each of the antioxidant or DNA repair genes. Pairs of correlation coefficients were compared for significant differences by Fisher's Z-transformation test [16].
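The calculations named above can be sketched as follows, using only the Python standard library. This is a minimal illustration, not the software used in the study. Several points are assumptions: the base-10 logarithm, the number of comparisons passed to the Bonferroni helper (6 x 16 = 96 is one reading of the text), the treatment of the two correlation coefficients as coming from independent groups, and all function names.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_pearson_r(x, y):
    """Pearson correlation after log10 transformation (log base is an assumption)."""
    return pearson_r([math.log10(v) for v in x], [math.log10(v) for v in y])


def bonferroni_threshold(alpha, n_comparisons):
    """Per-test p-value cutoff so that the family-wise level is alpha."""
    return alpha / n_comparisons


def fisher_z_compare(r1, n1, r2, n2):
    """Two-sided Fisher Z test for two correlations from independent groups."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Illustrative values only; these are not data from the study.
tf_values = [120.0, 450.0, 900.0, 3000.0, 15000.0]
gene_values = [1500.0, 3800.0, 11000.0, 26000.0, 170000.0]
r_log = log_pearson_r(tf_values, gene_values)

cutoff = bonferroni_threshold(0.01, 6 * 16)  # assumed 96 comparisons

# Hypothetical coefficients; group sizes follow the 24 and 25 individuals.
z_stat, p_diff = fisher_z_compare(0.8, 24, 0.1, 25)
```

Working on the log scale keeps a few individuals with very high expression from dominating the coefficient, which is the motivation the text gives for the transformation.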
The relationship between the virtually-multiplexed transcript abundance data for each gene and age was assessed by Pearson's correlation, the relationship with gender by t-test, and the relationship with smoking history by ANOVA followed by Duncan's test.

Transcription factor recognition site analysis
The El Dorado (Build 35) program from the Genomatix software package was used to locate the correlated genes within the genome and define 1101 base pairs of the promoter region (1000 base pairs upstream of the transcription start site and 100 base pairs into the transcribed region) for each gene (Genomatix Software GmbH, Munich, Germany, [17]). The 1101 base pair sequences obtained from the El Dorado program were then used as the target sequences for putative transcription factor recognition site identification using the MatInspector Version 4.2 program, which yielded sites for 11 transcription factors (Genomatix Software GmbH, Munich, Germany, [17]). The parameters used were the standard (0.75) core similarity and the optimized matrix similarity [18]. StaRT-PCR reagents were optimized for ten of these transcription factors: CEBPB, CEBPE, CEBPG, E2F1, E2F3, E2F4, E2F5, E2F6, EVI1, and PAX5. Four transcription factors were expressed at low and invariant levels among multiple NBEC samples and were therefore excluded from the study. The remaining six, CEBPB, CEBPG, E2F1, E2F3, E2F6, and EVI1, were evaluated for correlation with an expanded group of ten antioxidant and six DNA repair genes.
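The promoter window arithmetic can be sketched in a few lines. This is not the El Dorado interface. It is a hypothetical helper that assumes 1-based inclusive genomic coordinates and assumes that the 1101 base pairs comprise 1000 upstream bases, the transcription start site itself, and 100 downstream bases, which is one reading that accounts for the stated length.

```python
def promoter_window(tss: int, strand: str,
                    upstream: int = 1000, downstream: int = 100):
    """Return (start, end) of the promoter window in 1-based inclusive coordinates.

    Hypothetical helper: 'upstream' is measured against the direction of
    transcription, so the window is mirrored for genes on the minus strand.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def window_length(start: int, end: int) -> int:
    """Length of an inclusive interval."""
    return end - start + 1


# Illustrative coordinates only; not taken from any gene in the study.
plus_window = promoter_window(5000, "+")
minus_window = promoter_window(5000, "-")
plus_length = window_length(*plus_window)
minus_length = window_length(*minus_window)
```

Under this reading both strands give a window of 1000 + 1 + 100 = 1101 bases.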