Acknowledgements

We thank Leslie Vosshall and Tyler Cutforth for providing cDNA libraries, staff of the core facilities at the Fred Hutchinson Cancer Research Center and the University of Washington's former Department of Molecular Biotechnology for sequencing assistance, Linda Buck and Michael Schlador for comments on the manuscript and Colin Pritchard for S16 primers. The data in this paper were analyzed in part through use of the Celera Discovery System™ and Celera Genomics' associated databases. This work was supported by NIH grant R01 DC04209.

Figures and Tables

Figure 1
Olfactory receptor genes whose expression in the mouse olfactory epithelium was confirmed in this study. Genes whose expression was confirmed by our cDNA screen are colored blue on a phylogenetic tree of 1,107 intact mouse olfactory receptors. Genes whose expression was confirmed by PCR methods are colored red (genes listed in Additional data file 1 were confirmed by specific PCR of the cDNA library or reverse-transcribed RNA, and genes confirmed using the class I degenerate primer for RT-PCR are AY317681, AY317698, AY317700, AY317767, AY317773, AY317774, AY317797 and AY317923). Other olfactory receptors are colored gray, and a chemokine outgroup is colored black. Class I olfactory receptors are bracketed, and the remaining olfactory receptors are class II.

Figure 2
The cDNA screen suggests different expression levels for different olfactory receptors. Distribution of the number of cDNAs observed (dots) and expected under a Poisson distribution (triangles, line) per olfactory receptor gene, among the 1,176 olfactory receptor cDNAs identified.

Figure 3
Differential expression levels among six olfactory receptor genes determined by quantitative PCR. (a) Expression levels of olfactory receptor genes can vary by almost 300-fold (for example, genes A and D).
Relative expression levels of six selected olfactory receptor genes (A, AY318555; B, AY318107; C, AY318644; D, AY317365; E, AY317773; and F, AY317797) were determined in olfactory epithelium RNA samples from three mice. Expression levels for each gene were first determined relative to a standard curve made using mouse genomic DNA templates, and then values for each mouse were normalized so that a housekeeping gene, ribosomal S16, had a value of 1 (arbitrary units; not shown). Error bars show one standard deviation (six replicate reactions). Genes E and F (AY317773 and AY317797) are class I olfactory receptors. Numbers of cDNAs observed in our screen are shown under each gene name. (b) Expression levels of each gene are similar, with some variation, among the three mice sampled. Graphs show pairwise comparisons between the three mice sampled, with relative expression levels (arbitrary units) in one mouse plotted along the x-axis and in a second on the y-axis.

Figure 4
Olfactory receptors showing different expression levels. The different expression levels of one pair of olfactory receptors are due to different numbers of expressing cells and different transcript levels per cell. RNA in situ hybridization with digoxigenin-labeled probes for (a) gene A (AY318555) and (b) gene D (AY317365) on coronal sections of the olfactory turbinates of an adult mouse, shown at low magnification and inset (boxed) at high magnification. Endoturbinates II and III and ectoturbinate 3 are labeled in (b).

Figure 5
Many olfactory receptor genes show alternate splicing. Distribution of the number of transcriptional isoforms observed for the 82 olfactory receptors for which we have identified at least four cDNAs.

Figure 6
Sixty-two olfactory receptor cDNAs use splice sites within the coding region. The bar at the top represents an alignment of all olfactory receptor proteins, with transmembrane (TM) regions shaded gray and intracellular (IC) and extracellular (EC) loops in white.
Above the bar, the jagged line plots information content [51] for each alignment position, with higher values representing residues conserved across more olfactory receptors. cDNAs with atypical splicing are plotted below, aligned appropriately to the consensus representation. GenBank accessions for each cDNA are shown on the right, and where more than one clone represents the same isoform, both names are given, but a composite sequence is drawn. Multiple isoforms from the same gene are grouped by gray background shading. Thick black lines represent cDNA sequence, and thin lines represent intronic sequence (with diagonal slash marks if not drawn to scale). The uppermost two cDNAs encode potentially functional olfactory receptors. A single cDNA drawn as white boxes (CB173065) is cloned into the vector in the reverse orientation. Introns that result in a frameshift relative to the olfactory receptor consensus are drawn as single dashed lines. The first in-frame methionine in the cDNA is marked with an 'M', and the first stop codon 5' to this methionine (if any) is marked with *. Most sequences are incomplete at the 3' end, as represented by paired dotted lines, although two sequences (CB174400 and CB174364), marked with '(A)n', contain the cDNA's poly(A) tail. The 'X' on sequence CB173500 marks an exon that does not align with genomic sequence near the rest of the gene or anywhere else in Celera's mouse genome sequence, and 'TM4' on sequence CB172879 notes an exon that matches the reverse complement of the fourth transmembrane domain of the next downstream olfactory receptor gene. For the two lowermost cDNAs, exon order in the cDNA clone is inconsistent with the corresponding genomic sequence, as represented by the curved intron lines.
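The information content plotted above the alignment bar in Figure 6 can be illustrated with a short sketch. The legend cites [51] for the measure without restating its formula, so this example assumes a common Shannon-entropy definition for a 20-letter amino-acid alphabet (log2(20) minus the column entropy), with no small-sample correction and with gaps ignored. The function name and the toy alignment columns are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_information(column):
    """Information content (bits) of one protein alignment column.

    Assumed definition: log2(20) minus the Shannon entropy of the
    residue frequencies. Gaps and non-standard symbols are ignored.
    """
    residues = [r for r in column.upper() if r in AMINO_ACIDS]
    if not residues:
        return 0.0  # nothing to score in an all-gap column
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(AMINO_ACIDS)) - entropy

# Toy columns: one fully conserved, one with eight different residues.
conserved = column_information("DDDDDDDD")
variable = column_information("ACDEFGHI")
print(f"conserved: {conserved:.2f} bits, variable: {variable:.2f} bits")
```

Under this assumed definition a fully conserved position reaches the maximum of log2(20) bits, and the value falls as more residue types share the position, which matches the legend's statement that higher values mark residues conserved across more olfactory receptors.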
Table 1
Number of olfactory receptors in old (Release 12) and new (Release 13) Celera mouse genome assemblies

| | Olfactory receptors in Release 12 mouse genome assembly [1] | Olfactory receptors in Release 13 mouse genome assembly |
|---|---|---|
| Total number of olfactory receptor sequences | 1,468 | 1,490 |
| Number of partial sequences (at end or gap in Celera scaffold) | 262/1,468 (18%) | 96/1,490 (6%) |
| Number of full olfactory receptor sequences | 1,206/1,468 (82%) | 1,394/1,490 (94%) |
| Interrupted by repeat sequence | 134/1,206 (11%) | 117/1,394 (8%) |
| Contains frameshift or stop codon | 206/1,206 (17%) | 170/1,394 (12%) |
| Intact ORF | 866/1,206 (72%) | 1,107/1,394 (79%) |
| Intact class I | 104/866 (12%) | 124/1,107 (11%) |
| Intact class II | 762/866 (88%) | 983/1,107 (89%) |

Table 2
Summary of cDNA screen for each library and probe.

| Library | Probe | Number of plaques screened (× 10³) | Number of sequences obtained | Number of real olfactory receptor sequences | True-positive rate | Olfactory receptor clone frequency | Number of olfactory receptor genes represented |
|---|---|---|---|---|---|---|---|
| Embryonic | OR5B_OR3B_40 | 640 | 58 | 37 | 64% | 1/17,300 | 27 |
| Adult | OR5B_OR3B_40 | 2,850 | 1,450 | 1,138 | 78% | 1/2,500 | 394 |
| | P24_P28_40 and TM3deg1_P28_45 | 200 | 23 | 3 | 13% | 1/66,700 | 3 |
| | P26_P27_45 | 700 | 135 | 58 | 43% | 1/12,100 | 35 |
| | P24_P28_45 | 200 | 39 | 22 | 56% | 1/9,100 | 19 |
| | OR5B_OR3B_45 | 150 | 10 | 6 | 60% | 1/250,000 | 5 |
| Total | | 4,740 | 1,715 | 1,264 | 74% | 1/3,800 | 419 |

Probe names comprise the names of the two primers and the annealing temperature used during PCR to generate the probes, separated by underscores.
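The two derived columns of Table 2 appear to follow from the raw counts. The sketch below assumes that the true-positive rate is the number of real olfactory receptor sequences divided by the number of sequences obtained, and that the clone frequency is the number of plaques screened per real olfactory receptor sequence. That reading of the column headers is an assumption, and the function names are hypothetical. The inputs are the embryonic-library row and the total row of the table.

```python
def true_positive_rate(real_or_sequences, sequences_obtained):
    """Fraction of sequenced clones that are real olfactory receptors."""
    return real_or_sequences / sequences_obtained

def plaques_per_or_clone(plaques_screened_thousands, real_or_sequences):
    """Plaques screened per real olfactory receptor clone recovered.

    The plaque count is given in thousands, as in Table 2.
    """
    return plaques_screened_thousands * 1_000 / real_or_sequences

# Embryonic library, probe OR5B_OR3B_40: 640 x 10^3 plaques, 58 sequences, 37 real.
embryonic_rate = true_positive_rate(37, 58)
embryonic_freq = plaques_per_or_clone(640, 37)

# Total row: 4,740 x 10^3 plaques, 1,715 sequences, 1,264 real.
total_rate = true_positive_rate(1264, 1715)
total_freq = plaques_per_or_clone(4740, 1264)

print(f"Embryonic: {embryonic_rate:.0%}, 1 clone per {embryonic_freq:,.0f} plaques")
print(f"Total:     {total_rate:.0%}, 1 clone per {total_freq:,.0f} plaques")
```

Rounded to the nearest percent and the nearest hundred plaques, these two rows agree with the tabulated 64% and 1/17,300 for the embryonic library and 74% and 1/3,800 for the total.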