Background
The interaction of olfactory (or odorant) receptors with their odorant ligands is the first step in a signal transduction pathway that results in the perception of smell. The olfactory receptor gene family is one of the largest in the mammalian genome, comprising about 1,500 members in the mouse genome [1,2]. Olfactory receptors were originally identified in an elegant experiment based on the hypothesis that they would be seven-transmembrane-domain proteins encoded by a large, diverse gene family whose expression is restricted to the olfactory epithelium [3]. Subsequent studies have shown that some of these receptors do indeed respond to odorants and can confer that responsivity when expressed in heterologous cell types (for example, [4]).
Recent computational investigations have provided nearly complete human [5,6] and mouse [1,2] olfactory receptor gene catalogs. However, the assignment of most of these genes as olfactory receptors is based solely on similarity to one of a relatively small number of experimentally confirmed mouse olfactory receptors or, worse, on similarity to a gene that in turn was defined as an olfactory receptor solely by similarity. Although similarity-based genome annotation is a good initial method to identify genes and predict their function, in some cases it can be misleading, as genes of similar sequence can carry out different functions and be expressed in different tissues (for example, the sugar transporter gene family [7]).
A small subset of olfactory receptors appears to be expressed in non-olfactory tissues, principally the testis [8], but also in taste tissues [9], prostate [10], erythroid cells [11], notochord [12] and perhaps other tissues. Expression in the testis has led some investigators to suggest that a subset of olfactory receptors may function as spermatid chemoreceptors [8]. Recent studies of one human testis-expressed olfactory receptor indicate that it does indeed function in sperm chemotaxis [13].
Because experimental evidence for the olfactory function of most genes in the family is scarce, and because extra-olfactory roles have been suggested, we embarked on an olfactory receptor expressed sequence tag (EST) project to confirm olfactory epithelial expression of hundreds of mouse odorant receptor genes.
Within the olfactory epithelium, individual olfactory receptor genes show an intriguing expression pattern. Each olfactory receptor is expressed in a subset of cells in one of four zones of the epithelium [14,15]. Furthermore, each olfactory neuron expresses only one allele [16] of a single olfactory receptor gene [17,18], and the remaining approximately 1,499 genes are transcriptionally inactive. Although the mechanism ensuring singular expression is unknown, many hypotheses have been proposed [14,16,19]. In one model, somatic DNA recombination would bring one olfactory receptor gene into a transcriptionally active genomic configuration, as observed for the yeast mating type locus [20] and the mammalian immunoglobulin genes [21]. A second model invokes a combinatorial code of transcription factor binding sites unique to each gene. This model is unlikely, however, as even olfactory receptor transgenes with identical upstream regions are expressed in different neurons [18]. In a third model, there would be a limiting quantity of transcription factors - the cell might contain a single transcriptional 'machine' that is capable of accommodating the promoter of only one olfactory receptor gene, similar to the expression site body used by African trypanosomes to ensure expression of only one set of variant surface glycoprotein genes [22]. Finally, in a fourth model, transcriptional activity at one stochastically chosen olfactory receptor allele might send negative feedback to repress activity of all other olfactory receptors and/or positive feedback to enhance its own expression.
In the last three models, some or all olfactory receptor genes might share transcription factor binding motifs, and in the first model, olfactory receptor genes might share a common recombination signal. To perform computational and experimental searches for such signals, it is important to have a better idea of the transcriptional start sites of a large number of olfactory receptor genes. Our olfactory receptor EST collection provides 5' untranslated region (UTR) sequences for many genes and, therefore, a large dataset of candidate promoter regions.
Olfactory receptor genes have an intronless coding region, simplifying both computational and experimental olfactory receptor identification. For a small number of olfactory receptors, gene structure has been determined. Additional 5' untranslated exons lie upstream of the coding region and can be alternatively spliced [19,23-26]. The 3' untranslated region is typically intronless. Exceptions to this stereotyped structure have been described for some human olfactory receptors, but are thought to be rare [25-27]. cDNA identification and RACE data have been used to determine gene structure for about 30 genes (see, for example, [19,23]). However, computational prediction of the location of 5' upstream exons and the extent of the 3' UTR from genomic sequence has been extremely difficult. A combination of splice site predictions and similarity to other olfactory receptors has allowed some investigators to predict 5' exon locations for around 15 genes [25,28]. Experimental validation shows that some, but not all, predictions are accurate [24,25]. Our study vastly increases the number of olfactory receptors for which gene structure is known.
In this report, we describe the isolation and analysis of over 1,200 cDNAs representing 419 odorant receptor genes. We screened a mouse olfactory epithelium library with degenerate olfactory receptor probes and obtained 5' end sequences (ESTs) from purified cDNAs. These clones confirm olfactory expression of over 400 olfactory receptors, provide their gene structure, demonstrate that not all olfactory receptors are expressed at the same level and show that most olfactory receptor genes have multiple transcriptional isoforms.