Discussion

We have identified and sequenced 1,264 odorant receptor cDNAs from 419 olfactory receptor genes, confirming their expression in the olfactory epithelium. We have thus validated the similarity-based prediction of over one-third of the intact olfactory receptor genes annotated in the mouse genome [1,2], thereby vastly increasing the proportion of the family for which experimental evidence of olfactory function is available. We have not found cDNAs for all olfactory receptor genes or an even phylogenetic distribution of cDNAs, probably because the libraries and/or our screen are biased toward certain olfactory receptor subfamilies. Using RT-PCR with both degenerate and specific primers, we have confirmed olfactory expression of a number of additional olfactory receptors, bringing the total number of olfactory receptor genes verified in this study to 436, and ensuring that almost all phylogenetic clades have at least one representative with evidence of olfactory function.

Results of our cDNA library screen suggested that some olfactory receptors are expressed at significantly higher levels than others. We used quantitative PCR to show that expression levels are indeed highly variable, with one olfactory receptor expressed at almost 300 times the level of another. Higher expression levels could be due to increased transcript number per cell and/or a greater number of olfactory neurons 'choosing' those genes. For one pair of genes we tested, expression level differences appear to be due to both factors. It would be interesting to collect data for additional genes to determine how the numbers of expressing cells and transcript levels per cell vary across the olfactory receptor family.
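To make the scale of such differences concrete, the following minimal sketch shows how threshold-cycle (Ct) values from quantitative PCR might be converted into a relative expression ratio. It assumes a standard comparative-Ct calculation with perfect doubling per cycle; the function names, the reference-gene normalization and the Ct values are hypothetical illustrations and are not taken from the methods of this study.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Expression of a target gene relative to a reference gene in one sample.

    Assumes the amplicon increases `efficiency`-fold per cycle
    (2.0 = perfect doubling, an idealization).
    """
    return efficiency ** (ct_reference - ct_target)


def fold_difference(ct_gene_a, ct_gene_b, ct_reference, efficiency=2.0):
    """Ratio of gene A to gene B expression, both normalized to one reference."""
    level_a = relative_expression(ct_gene_a, ct_reference, efficiency)
    level_b = relative_expression(ct_gene_b, ct_reference, efficiency)
    return level_a / level_b


# Hypothetical Ct values: gene A crosses threshold 8 cycles before gene B.
fold = fold_difference(ct_gene_a=22.0, ct_gene_b=30.0, ct_reference=18.0)
print(fold)  # 256.0
```

Under these assumptions, a Ct difference of just over eight cycles corresponds to a roughly 300-fold difference in transcript abundance. A measurement of this kind reflects total transcript level in the tissue and cannot by itself separate the number of expressing cells from the transcript level per cell.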
Data from a number of previous studies also show that different olfactory receptor genes, or even copies of the same olfactory receptor transgene in different genomic locations, are expressed in different numbers of cells [14,18,35], but do not address the issue of transcript level per cell. The fact that some genes are chosen more frequently, and when chosen may be expressed at higher levels per cell, is intriguing given each olfactory neuron's single-allele expression regime. The observation of unequal expression leads to a number of questions. It is known that each olfactory receptor is expressed in one of four zones of the olfactory epithelium [14,15]; do some zones choose from a smaller olfactory receptor sub-repertoire and thus express each olfactory receptor in a larger number of cells? We note that several apparently highly expressed olfactory receptors (gene A, this study, and MOR10 and MOR28 [36]) are expressed in zone 4 of the olfactory epithelium. Does activity-dependent neuronal competition [37] contribute to increased representation of the olfactory receptors that respond to common environmental odorants? Do the favored olfactory receptors have stronger promoter sequences? Are some olfactory receptor mRNAs more stable than others, leading to higher transcript levels per expressing cell? Are the favored olfactory receptors in a more open chromatin conformation or in more accessible genomic locations?

Transcription of apparent 'singleton' olfactory receptor genes (0.5 Mb or more from the nearest other olfactory receptor gene) suggests that there is no absolute requirement for genomic clustering for an olfactory receptor to be transcribed, consistent with observations that small olfactory receptor transgenes can be expressed correctly when integrated outside native olfactory receptor clusters [35].
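The singleton criterion given above (0.5 Mb or more from the nearest other olfactory receptor gene) can be sketched as a simple distance test on gene coordinates. This is an illustration only: the function name, the input format and the example coordinates are hypothetical, and the sketch additionally assumes that a gene with no other olfactory receptor gene on its chromosome counts as a singleton.

```python
from collections import defaultdict

SINGLETON_DISTANCE_BP = 500_000  # 0.5 Mb threshold stated in the text


def find_singletons(genes, min_distance=SINGLETON_DISTANCE_BP):
    """Return names of genes at least `min_distance` bp from every other gene.

    `genes` is an iterable of (name, chromosome, start, end) tuples
    (a hypothetical input format). Genes on different chromosomes are
    never considered neighbors.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start, end in genes:
        by_chrom[chrom].append((name, start, end))

    singletons = set()
    for entries in by_chrom.values():
        for name, start, end in entries:
            # Gap to each other gene on the chromosome (0 if they overlap).
            gaps = [
                max(0, max(start, other_start) - min(end, other_end))
                for other_name, other_start, other_end in entries
                if other_name != name
            ]
            # all([]) is True, so a lone gene on a chromosome is a singleton.
            if all(gap >= min_distance for gap in gaps):
                singletons.add(name)
    return singletons


# Hypothetical coordinates, for illustration only.
example_genes = [
    ("OR_a", "chr1", 1_000_000, 1_001_000),
    ("OR_b", "chr1", 1_020_000, 1_021_000),  # 19 kb from OR_a: clustered
    ("OR_c", "chr1", 2_000_000, 2_001_000),  # 979 kb from OR_b: singleton
    ("OR_d", "chr2", 5_000_000, 5_001_000),  # alone on chr2: singleton
]
print(sorted(find_singletons(example_genes)))  # ['OR_c', 'OR_d']
```

The pairwise comparison is quadratic in the number of genes per chromosome, which is acceptable for a family of this size and avoids assumptions about gene ordering or overlap.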
However, the high pseudogene count among singleton olfactory receptor genes (50%, versus 20% for clustered olfactory receptor genes) suggests that not all genomic locations are favorable for olfactory receptor gene survival, perhaps due to transcriptional constraints. It is also possible that evolutionary factors are responsible for the reduced pseudogene content of clustered olfactory receptors - gene conversion between neighboring olfactory receptors could rescue inactivating mutations in clustered genes, but not in singletons. Before these questions about olfactory receptor gene choice can be answered, it will be important to measure expression levels of a larger number of genes, perhaps using an olfactory receptor gene microarray.

Our study provides at least partial data about the upstream transcript structures of over 300 olfactory receptor genes. These data provide tentative locations of a large set of promoter regions, allowing computational searches for shared sequence motifs that might be involved in the intriguing transcriptional regulation of olfactory receptors. However, given that not all cDNAs are full-length clones, some of these candidates will not be true promoter regions.

The 5' UTR sequences we obtained will also aid in the design of experimental probes, for example, for in situ hybridizations or to immobilize on an olfactory receptor microarray. One of the challenges of such an array will be to design unique probes with which to represent each gene. Often, the coding region of olfactory receptors is highly similar between recently duplicated genes. Many pairs of similar olfactory receptors show more sequence divergence in the UTRs than in the protein-coding region (J.Y., unpublished observations). The UTRs would therefore make a better choice of sequence from which to design unique oligonucleotides to distinguish closely related olfactory receptor genes.
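As a minimal sketch of why divergent UTRs help with probe design, the code below lists the subsequences of length k that occur in exactly one gene's sequence and could therefore serve as candidate gene-specific oligonucleotides. The function name and the toy sequences are hypothetical. The sketch considers exact matches only, whereas a real design would also need to consider near-matches, reverse complements, melting temperature and specificity against the rest of the genome.

```python
from collections import defaultdict


def unique_kmers(sequences, k=25):
    """Map each gene name to the k-mers found in its sequence and no other.

    `sequences` is a dict of gene name -> nucleotide sequence
    (a hypothetical input format). Only exact matches are compared.
    """
    owners = defaultdict(set)
    for name, seq in sequences.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)

    unique = {name: [] for name in sequences}
    for kmer, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].append(kmer)
    return {name: sorted(kmers) for name, kmers in unique.items()}


# Toy sequences: identical coding regions, UTRs that diverge at the 3' end.
coding = {"gene1": "ATGGCTAGCT", "gene2": "ATGGCTAGCT"}
utr = {"gene1": "ACGTACGTTT", "gene2": "ACGTACGAAA"}

print(unique_kmers(coding, k=5))
# {'gene1': [], 'gene2': []}
print(unique_kmers(utr, k=5))
# {'gene1': ['ACGTT', 'CGTTT', 'TACGT'], 'gene2': ['ACGAA', 'CGAAA', 'TACGA']}
```

In this toy case the identical coding regions yield no gene-specific candidates, while the diverged UTRs yield three each.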
The locations of UTRs in genomic sequence are difficult to predict - our study provides 5' UTR sequences of 343 genes and the approximate 3' UTR length for 399 olfactory receptor genes. Probe design must also account for the multiple transcriptional isoforms observed for many olfactory receptors - depending on the question being asked, probes could be designed in shared sequence to determine the total level of all isoforms, or in unique exons to measure the level of each isoform separately.

We find that the majority of the olfactory receptors, like most genes outside the olfactory receptor family [38,39], are transcribed as multiple isoforms, involving alternative splicing of 5' untranslated exons and alternative polyadenylation-site usage. The act of splicing itself may be important for efficient mRNA export from the nucleus [40] or to couple olfactory receptor coding regions with genomically distant promoters. The exact nature of the spliced transcript might be unimportant, such that several isoforms might be produced simply because multiple functional splice sites are available. Alternatively, the multiplicity of transcriptional isoforms might have functional significance, as UTRs may contain signals controlling mRNA stability, localization or degradation [41,42].

Our study shows that about 5% of olfactory receptor transcripts do not fit the current notion of olfactory receptor gene structure. Occasionally, an intron is spliced out of the 3' untranslated region. A number of cDNAs use splice sites within the olfactory receptor's ORF, meaning that their protein product is different from that predicted on the basis of genomic sequence alone. In two such cases, the transcript would encode a functional olfactory receptor, with the initiating methionine and first few amino acids encoded by an upstream exon, as has been observed previously for a subtelomeric human olfactory receptor gene [25].
Such within-ORF splicing might increase protein-coding diversity, although, given the small number of genes involved, splicing is unlikely to significantly affect the functional receptor repertoire. Most of the atypical splice forms we observe appear to encode non-functional transcripts, containing frameshifts or lacking a start codon or other functional residues conserved throughout the olfactory receptor family. These nonfunctional transcripts are probably aberrant by-products of the splicing system [43] that have not yet been degraded by RNA surveillance systems [40,41]. The neurons expressing these aberrant transcripts might also make normal transcripts for the same genes and thus produce a functional olfactory receptor. Alternatively, the unusual transcriptional regulation of olfactory receptors might ensure that only one splice isoform is expressed per cell (unlikely, but possible if an RNA-based feedback mechanism operates), thus condemning cells expressing these aberrant isoforms to be dysfunctional.

We also observe transcripts from a small number of olfactory receptor pseudogenes, as previously described for three human olfactory receptor pseudogenes [26,44]. Although many fewer pseudogenes than intact genes were represented in our cDNA collection, some neurons in the olfactory epithelium evidently express disrupted olfactory receptors and thus might be unable to respond to odorants or to correctly innervate the olfactory bulb. Wang, Axel and coworkers have shown that an artificial transgenic olfactory receptor gene containing two nonsense mutations can support development of an olfactory neuron, but that pseudogene-expressing neurons fail to converge on a glomerulus in the olfactory bulb [45].
By analogy with an olfactory receptor deletion mutant [45], it is likely that most pseudogene-expressing neurons die or switch to express a different olfactory receptor gene, so that pseudogene-expressing neurons persist in adult mice only in small numbers, far fewer than neurons expressing intact olfactory receptors.