2. Methods

2.1. Exon Array (Experiment 1)
Anterior cingulate cortex (ACC) samples from 9 bipolar disorder patients and 11 healthy controls were used for the initial microarray analysis on an exon array (Affymetrix HuEx 1.0 ST, Santa Clara, CA, USA). The demographics are shown in Table 1. Approximately 100 mg of dissected frozen tissue was homogenized using Trizol Reagent (Invitrogen, Carlsbad, CA, USA) following the standard Trizol RNA isolation procedure: 1 mL of Trizol was added to the frozen brain tissue, and the mixture was homogenized on ice twice for 30 s at 7500 rpm using a Tissue Tearor (Biospec Products, Inc., Bartlesville, OK, USA). The mixture was then incubated at room temperature for 5 min, 200 µL of chloroform was added, the tube was shaken by hand for 30 s, and the mixture was incubated for a further 2–3 min at room temperature before being centrifuged at 12,000× g for 15 min at 4 °C in an Eppendorf Centrifuge 5417R (Eppendorf, Hauppauge, NY, USA). The upper aqueous phase was transferred to a new tube, mixed with 500 µL of isopropyl alcohol, incubated for 15 min at room temperature, and centrifuged at 12,000× g for 10 min at 4 °C. The supernatant was removed, and the pellet was washed with 1 mL of ice-cold 75% ethanol by brief vortexing, then centrifuged at 7500× g for 10 min at 4 °C. The ethanol was decanted, and the RNA pellet was dried at room temperature for 5–10 min in a laboratory hood with the tube lid open; the RNA was then dissolved in 50 µL of DEPC-treated water by gentle mixing on ice. RNA was stored at −80 °C. The resulting total RNA was cleaned of low-molecular-weight fragments by passing it through a Qiagen column and was checked for quality on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) using the RNA integrity number (RIN). The concentrations, measured on a spectrophotometer (Molecular Devices, Sunnyvale, CA, USA), were adjusted to 1 µg/µL.
GeneChip Whole Transcript Sense Assay Protocol:
The Affymetrix Human GeneChip Exon 1.0 ST arrays were run following the manufacturer’s protocol (Affymetrix, Santa Clara, CA, USA). Briefly, 2 µg of purified total RNA underwent ribosomal RNA removal using the RiboMinus Human/Mouse Transcriptome Isolation Kit (Invitrogen). The rRNA-reduced RNA was then reverse transcribed to cDNA using random hexamers tagged with a T7 promoter sequence, followed by second-strand cDNA synthesis using DNA polymerase (GeneChip WT cDNA Synthesis and Amplification Kit, Affymetrix). The resulting double-stranded cDNA was then used for amplification of antisense cRNA using the GeneChip Sample Cleanup Module (Affymetrix). A second-cycle cDNA synthesis was performed using random primers to reverse transcribe the cRNA into sense single-stranded DNA. The DNA was then enzymatically fragmented and labeled using the GeneChip WT Terminal Labeling Kit (Affymetrix). The hybridization cocktail, consisting of the labeled sample, Control Oligonucleotide B2, and 20× Eukaryotic Hybridization Controls, was heated for 5 min at 99 °C, cooled for 5 min at 45 °C, and then centrifuged for 1 min. A volume of 200 µL was loaded onto each Affymetrix Human GeneChip Exon 1.0 ST array, and the arrays were incubated in a 45 °C hybridization oven at 60 rpm for 17 h. The GeneChip Hybridization, Wash and Stain Kit (Affymetrix) was used with the Fluidics Station 450_0001 protocol. Arrays were then scanned on the GeneChip Scanner 3000 7G (Affymetrix). All exon array samples were processed in the same batch by one person.
CEL files from the Affymetrix HuEx 1.0 ST arrays were imported into Partek Genomics Suite 6.6 with background subtraction, and probes containing common SNPs were excluded from the exon array analysis following the method of Gamazon et al. 2010 [30]. The array initially contains 1.1 million probesets (see Supplementary Methods); eliminating probes with common SNPs reduced this count by ~350,000 probesets. All CEL files were analyzed together using robust multi-array average (RMA) normalization [31]. The resulting probe expression values were averaged within each probeset. Each probeset was aligned to a unique RefSeq gene, and we report only findings that are associated with full-length mRNA and covered by at least two probesets. This reduced the analysis to 230,659 probesets representing 11,807 full-length RefSeq genes. The diagnosis-by-probeset interaction was calculated in Partek (Supplementary Methods), and the cut-off for the interaction p-values was determined after Bonferroni correction.
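The coverage filter and the Bonferroni cut-off described above can be illustrated with a minimal sketch. It is not the Partek workflow: the function names, the toy probeset-to-gene mapping, the significance level of 0.05, and the choice of 11,807 genes as the number of tests are all assumptions made for illustration, since the text does not state them.

```python
from collections import Counter


def genes_with_min_probesets(probeset_to_gene, min_probesets=2):
    """Return the genes covered by at least `min_probesets` probesets."""
    counts = Counter(probeset_to_gene.values())
    return {gene for gene, n in counts.items() if n >= min_probesets}


def bonferroni_cutoff(alpha, n_tests):
    """Per-test p-value cut-off that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Toy mapping (hypothetical identifiers): GENE_B has one probeset and is dropped.
toy_mapping = {"ps1": "GENE_A", "ps2": "GENE_A", "ps3": "GENE_B"}
kept_genes = genes_with_min_probesets(toy_mapping)

# Assumed inputs: alpha = 0.05 and one interaction test per RefSeq gene.
cutoff = bonferroni_cutoff(0.05, 11807)
print(sorted(kept_genes), cutoff)
```

Under these assumed inputs the cut-off is roughly 4.2e-06; a different number of tests or significance level would change it proportionally.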

2.2. qPCR (Experiment 2)
The subjects from Experiment 1, plus ~90 additional subjects, were used for qPCR in Experiment 2 (shown in Supplementary Table 1). Total RNA was extracted from five brain regions (dorsolateral prefrontal cortex (DLPFC), amygdala, hippocampus, nucleus accumbens, and cerebellum) for each subject using the method outlined in Section 2.1, and RNA from each region was used to make complementary DNA (cDNA) (Table 2). cDNA was generated using TaqMan reverse-transcription (RT) reagents according to the manufacturer’s protocol (Applied Biosystems, Foster City, CA, USA), and cDNAs were aliquoted and stored at −20 °C. In brief, the cDNA synthesis reaction contained 5 µL of 10× TaqMan RT buffer; 11 µL of 25 mM MgCl2; 10 µL of deoxynucleotide triphosphates (dNTPs); 2.5 µL of Oligo d(T)16 primer; 1 µL of RNase inhibitor; 1.25 µL of Multiscribe reverse transcriptase; and 1 µL of RNA (1 µg/µL). Two separate 50 µL reactions were performed for each RNA and then combined. Each cDNA batch was limited to a maximum of 24 tubes to ensure the best sample quality. The primers were designed using Primer Express software (Applied Biosystems) and purchased from Bioneer, Inc. Factors including melting temperature and guanine-cytosine (GC) content were considered. The HLA-DPA1 forward and reverse primers were designed to hybridize to sequences located in exon 3, near the site of hybridization for the array probe (Probe Set 2950343; Affymetrix, Inc.). The primer set was BLAST-searched against the entire human genomic sequence database for specificity; the primers used for all genes are shown in Supplementary Methods, Part 3. The HLA-DPA1 primers were tested using a set of cDNAs from cerebellum, genomic DNA (gDNA), a no-template control (NTC), and RT-minus controls (two individual DLPFC RNAs without cDNA synthesis). The primer test showed that all cDNAs amplified with a single band, while the NTC and gDNA amplified only after more than 40 cycles. The RT-minus controls differed from the cDNA samples by more than six cycles. These checks indicated that the HLA-DPA1 primers were specific to HLA-DPA1. Similar procedures were used for the HLA-DRB1 and CD74 qPCR analyses.
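The control criteria in the primer test can be written as a simple check. This is only a sketch under stated assumptions: the function name and example Ct values are hypothetical, "amplified greater than 40" is read as a cycle threshold (Ct) above 40, and the six-cycle gap is measured against the latest-amplifying cDNA sample, which is a conservative reading the text does not spell out. The single-band gel criterion is not modeled.

```python
def primer_controls_pass(cdna_cts, ntc_ct, gdna_ct, rt_minus_cts,
                         late_ct=40.0, min_rt_minus_gap=6.0):
    """Check the negative controls against the thresholds quoted in the text.

    Assumptions: NTC and gDNA must have Ct > late_ct, and every RT-minus
    control must amplify more than min_rt_minus_gap cycles after the
    latest cDNA sample.
    """
    controls_late = ntc_ct > late_ct and gdna_ct > late_ct
    latest_cdna = max(cdna_cts)
    rt_minus_ok = all(ct - latest_cdna > min_rt_minus_gap for ct in rt_minus_cts)
    return controls_late and rt_minus_ok


# Hypothetical Ct values, for illustration only.
passes = primer_controls_pass(cdna_cts=[24.1, 25.3], ntc_ct=41.2,
                              gdna_ct=42.0, rt_minus_cts=[33.0, 34.5])
fails = primer_controls_pass(cdna_cts=[24.1, 25.3], ntc_ct=41.2,
                             gdna_ct=42.0, rt_minus_cts=[30.0])
print(passes, fails)
```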
Quantitative PCR (qPCR) was performed on an ABI 7900HT Sequence Detection System (Applied Biosystems) in 384-well plates. The samples were aliquoted by a Biomek3000 robot (Beckman Coulter, Brea, CA, USA) and run in triplicate using one plate per gene. The reaction was performed in a 12.5 µL total volume containing 6.25 µL of 2× SYBR Green Master Mix (Applied Biosystems); 0.25 µL of 10 µM forward primer; 0.25 µL of 10 µM reverse primer; 2 µL of a 1:10 dilution of cDNA template (corresponding to approximately 4 ng RNA); and water to the final volume. The thermal cycling conditions were: 50 °C for 2 min (incubation), 95 °C for 10 min (activation), 45 cycles of 95 °C for 15 s (denaturation) and 60 °C for 1 min (annealing/extension), and a final dissociation step of 95 °C for 15 s, 60 °C for 15 s, and 95 °C for 15 s. The qPCR cycle threshold (Ct) was set in the middle of the exponential phase of amplification, and the Ct of each well was recorded at the end of the reaction. The mean and standard deviation (SD) of the three Ct values were calculated, and the mean was accepted if the triplicate Ct values were within ±1 Ct. Two representative qPCR runs of HLA-DPA1, robotically pipetted in 384-well assay plates, were examined for two brain regions, and three wells were eliminated. The average coefficient of variation was ~0.8% for each plate. Relative quantification was used to measure gene expression. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and succinate dehydrogenase complex subunit A (SDHA) were selected as the housekeeping genes. After correction with the mean of the two housekeeping genes (Ct target − Ct mean of housekeeping), the average delta Ct values for each subject were analyzed by ANCOVA. A repeated-subjects ANCOVA was used with factorial blocks of diagnosis, region, and SNP rs9277341, with age, RIN, and pH included as covariates.
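The triplicate acceptance rule and the housekeeping correction described above can be sketched as follows. The function names and example values are hypothetical, and "within ±1 Ct" is interpreted here as every well lying within 1 Ct of the triplicate mean; the text does not say whether the rule was applied to the mean or to the range, so this is an assumption.

```python
from statistics import mean


def triplicate_mean_ct(cts, tolerance=1.0):
    """Return the mean Ct of a triplicate, or None if the replicate set fails.

    Assumed rule: every well must lie within `tolerance` Ct of the mean.
    """
    m = mean(cts)
    if any(abs(ct - m) > tolerance for ct in cts):
        return None
    return m


def delta_ct(target_ct, gapdh_ct, sdha_ct):
    """Delta Ct = Ct(target) minus the mean Ct of the two housekeeping genes."""
    return target_ct - mean([gapdh_ct, sdha_ct])


# Hypothetical wells: the first triplicate is accepted, the second is not.
accepted = triplicate_mean_ct([25.0, 25.2, 25.4])
rejected = triplicate_mean_ct([25.0, 25.2, 28.0])

# Hypothetical mean Ct values for a target gene, GAPDH, and SDHA.
example_delta = delta_ct(27.0, 20.0, 24.0)
print(accepted, rejected, example_delta)
```

A lower delta Ct corresponds to higher relative expression of the target gene, because fewer cycles are needed to reach the threshold.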
The fold change in gene expression was calculated to determine the direction of differences in mRNA levels between diagnostic groups and control samples in each brain region.
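The text does not state which formula was used for the fold change. One common convention for housekeeping-corrected delta Ct values is the 2^(−ΔΔCt) calculation, sketched below as an assumption rather than as the authors' method; it also assumes roughly 100% amplification efficiency. The function name and values are hypothetical.

```python
from statistics import mean


def fold_change(case_delta_cts, control_delta_cts):
    """Fold change of cases relative to controls using 2 ** (-ddCt).

    ddCt is the mean delta Ct of cases minus the mean delta Ct of controls.
    Values above 1 indicate higher expression in cases; below 1, lower.
    """
    ddct = mean(case_delta_cts) - mean(control_delta_cts)
    return 2.0 ** (-ddct)


# Hypothetical delta Ct values for one gene in one brain region.
lower_in_cases = fold_change([6.0, 6.0], [5.0, 5.0])   # ddCt = +1
higher_in_cases = fold_change([4.0], [5.0])            # ddCt = -1
print(lower_in_cases, higher_in_cases)
```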

2.3. Alternative Splicing of HLA-DPA1 (Experiment 3)
By exon array, a variable expression pattern was observed across exons 2, 3, and 4 in HLA-DPA1 for multiple subjects. This gene was therefore chosen for analysis of alternative splicing by direct sequencing of gel-purified cDNA amplicons to evaluate potential variants.
Representative cDNA samples from subjects in Experiment 2 were PCR amplified from exon 2 through exon 4 to screen for alternative splicing, and the products were separated on agarose gels. Bands smaller than the full-length amplicon were visualized on the gels and sent for Sanger sequencing. The subjects screened are shown in Supplementary Table 1.

2.4. Lymphoblast Cell Line (LCL) qPCR Biomarker (Experiment 4)
In addition to testing brain samples by qPCR (Experiment 2), 87 EBV-transformed lymphoblast cell lines (LCLs) from Costa Rica were tested for expression levels of the three MHC Class II genes via qPCR. Previously transformed cell lines were grown to confluency and harvested for RNA using Trizol. cDNA was synthesized as described above. Demographics are shown in Table 3.