# Term Identifier Odorant NN Odorant receptor expressed sequence tags demonstrate olfactory expression of over 400 genes, extensive alternate splicing and unequal expression levels sentence receptor NN expressed VBN sequence NN tags NNS demonstrate VBP olfactory JJ expression NN of IN over IN 400 CD genes NNS , , extensive JJ alternate JJ splicing NN and CC unequal JJ levels NNS "" sentence Background sentence Background NN " The olfactory receptor gene family is one of the largest in the mammalian genome." sentence The DT gene NN family NN is VBZ one CD the DT largest JJS in IN mammalian JJ genome NN . . Previous computational analyses have identified approximately 1,500 mouse olfactory receptors, but experimental evidence confirming olfactory function is available for very few olfactory receptors. sentence Previous JJ computational JJ analyses NNS have VBP identified VBN approximately RB 1,500 CD mouse NN receptors NNS but CC experimental JJ evidence NN confirming VBG function NN available JJ for IN very RB few JJ We therefore screened a mouse olfactory epithelium cDNA library to obtain olfactory receptor expressed sequence tags, providing evidence of olfactory function for many additional olfactory receptors, as well as identifying gene structure and putative promoter regions. sentence We PRP therefore RB screened VBD a DT epithelium NN cDNA NN library NN to TO obtain VB providing VBG many JJ additional JJ as RB well RB as IN identifying VBG structure NN putative JJ promoter NN regions NNS " Results" sentence Results NNS " We identified more than 1,200 odorant receptor cDNAs representing more than 400 genes." sentence identified VBD more JJR than IN 1,200 CD odorant JJ cDNAs NNS representing VBG Using real-time PCR to confirm expression level differences suggested by our screen, we find that transcript levels in the olfactory epithelium can differ between olfactory receptors by up to 300-fold. 
sentence Using VBG real JJ - HYPH time NN PCR NN confirm VB level NN differences NNS suggested VBN by IN our PRP$ screen NN we PRP find VBP that IN transcript NN can MD differ VB between IN up RB to IN 300-fold RB Differences for one gene pair are apparently due to both unequal numbers of expressing cells and unequal transcript levels per expressing cell. sentence Differences NNS pair NN are VBP apparently RB due IN both CC numbers NNS expressing VBG cells NNS per IN cell NN At least two-thirds of olfactory receptors exhibit multiple transcriptional variants, with alternative isoforms of both 5' and 3' untranslated regions. sentence At RB least RBS two CD thirds NNS exhibit VBP multiple JJ transcriptional JJ variants NNS with IN alternative JJ isoforms NNS 5 CD ' SYM 3 CD untranslated JJ Some transcripts (5%) utilize splice sites within the coding region, contrary to the stereotyped olfactory receptor gene structure. sentence Some DT transcripts NNS ( -LRB- % NN ) -RRB- utilize VBP splice NN sites NNS within IN coding VBG region NN contrary JJ stereotyped VBN Most atypical transcripts encode nonfunctional olfactory receptors, but can occasionally increase receptor diversity. sentence Most JJS atypical JJ encode VBP nonfunctional JJ occasionally RB increase VB diversity NN " Conclusions" sentence Conclusions NNS " Our cDNA collection confirms olfactory function of over one-third of the intact mouse olfactory receptors." sentence Our PRP$ collection NN confirms VBZ third NN intact JJ Most of these genes were previously annotated as olfactory receptors based solely on sequence similarity. sentence these DT were VBD previously RB annotated VBN based VBN solely RB on IN similarity NN Our finding that different olfactory receptors have different expression levels is intriguing given the one-neuron, one-gene expression regime of olfactory receptors. 
sentence finding NN different JJ intriguing JJ given VBN neuron NN regime NN We provide 5' untranslated region sequences and candidate promoter regions for more than 300 olfactory receptors, valuable resources for computational regulatory motif searches and for designing olfactory receptor microarrays and other experimental probes. sentence provide VBP sequences NNS candidate NN 300 CD valuable JJ resources NNS regulatory JJ motif NN searches NNS designing VBG microarrays NNS other JJ probes NNS interaction NN or CC odorant NN their PRP$ ligands NNS first JJ step NN signal NN transduction NN pathway NN that WDT results VBZ perception NN smell NN The olfactory receptor gene family is one of the largest in the mammalian genome, comprising about 1,500 members in the mouse genome [1,2]. sentence comprising VBG about IN members NNS [ -LRB- 1 CD , , 2 CD ] -RRB- Olfactory receptors were originally identified in an elegant experiment based on the hypothesis that they would be seven-transmembrane-domain proteins encoded by a large, diverse gene family whose expression is restricted to the olfactory epithelium [3]. sentence Olfactory JJ originally RB an DT elegant JJ experiment NN hypothesis NN they PRP would MD be VB seven CD transmembrane NN domain NN proteins NN encoded VBN large JJ diverse JJ whose WP$ restricted VBN Subsequent studies have shown that some of these receptors do indeed respond to odorants and can confer that responsivity when expressed in heterologous cell types (for example [4]). sentence Subsequent JJ studies NNS shown VBN some DT do VBP indeed RB respond VB odorants NNS confer VB that DT responsivity NN when WRB heterologous JJ types NNS example NN 4 CD Recent computational investigations have provided the almost complete human [5,6] and mouse [1,2] olfactory receptor-gene catalogs. 
sentence Recent JJ investigations NNS provided VBN almost RB complete JJ human JJ 6 CD catalogs NNS However, the assignment of most of these genes as olfactory receptors is based solely on similarity to one of a relatively small number of experimentally confirmed mouse olfactory receptors or, worse, on similarity to a gene that in turn was defined as an olfactory receptor solely by similarity. sentence However RB assignment NN most JJS relatively RB small JJ number NN experimentally RB confirmed VBN worse RBR turn NN was VBD defined VBN While similarity-based genome annotation is a good initial method to identify genes and predict their function, in some cases it can be misleading, as genes of similar sequence can carry out different functions and be expressed in different tissues (for example, the sugar transporter gene family [7]). sentence While IN annotation NN good JJ initial JJ method NN identify VB predict VBP cases NNS it PRP misleading JJ similar JJ carry VB out RP functions NNS tissues NNS sugar NN transporter NN 7 CD " A small subset of olfactory receptors appears to be expressed in non-olfactory tissues, principally the testis [8], but also taste tissues [9], prostate [10], erythroid cells [11], notochord [12] and perhaps other tissues." sentence A DT subset NN appears VBZ non-olfactory JJ principally RB testis NN 8 CD also RB taste NN 9 CD prostate NN 10 CD erythroid JJ 11 CD notochord NN 12 CD perhaps RB Expression in the testis has led some investigators to suggest that a subset of olfactory receptors may function as spermatid chemoreceptors [8]. sentence Expression NN has VBZ led VBN investigators NNS suggest VB may MD function VB spermatid JJ chemoreceptors NNS Recent studies of one human testis-expressed olfactory receptor indicate that it does indeed function in sperm chemotaxis [13]. 
sentence indicate VBP does VBZ sperm NN chemotaxis NN 13 CD Due to the paucity of experimental evidence of the olfactory function of most genes in the family and suggestions of extra-olfactory roles, we embarked on an olfactory receptor expressed sequence tag (EST) project to confirm olfactory epithelial expression of hundreds of mouse odorant receptor genes. sentence Due IN paucity NN suggestions NNS extra-olfactory JJ roles NNS embarked VBD tag NN EST NN project NN epithelial JJ hundreds NNS " Within the olfactory epithelium, individual olfactory receptor genes show an intriguing expression pattern." sentence Within IN individual JJ show VBP pattern NN Each olfactory receptor is expressed in a subset of cells in one of four zones of the epithelium [14,15]. sentence Each DT four CD zones NNS 14 CD 15 CD Furthermore, each olfactory neuron expresses only one allele [16] of a single olfactory receptor gene [17,18], and the remaining approximately 1,499 genes are transcriptionally inactive. sentence Furthermore RB each DT expresses VBZ only RB allele NN 16 CD single JJ 17 CD 18 CD remaining NN 1,499 CD transcriptionally RB inactive JJ While the mechanism ensuring singular expression is unknown, many hypotheses have been proposed [14,16,19]. sentence mechanism NN ensuring VBG singular JJ unknown JJ hypotheses NNS been VBN proposed VBN 19 CD In one model, somatic DNA recombination would bring one olfactory receptor gene into a transcriptionally active genomic configuration, as observed for the yeast mating type locus [20] and the mammalian immunoglobulin genes [21]. sentence In IN model NN somatic JJ DNA NN recombination NN bring VB into IN active JJ genomic JJ configuration NN observed VBN yeast NN mating NN type NN locus NN 20 CD immunoglobulin NN 21 CD Alternatively, a second model invokes a combinatorial code of transcription factor binding sites unique to each gene. 
sentence Alternatively RB second JJ invokes VBZ combinatorial JJ code NN transcription NN factor NN binding NN unique JJ This is unlikely, however, as even olfactory receptor transgenes with identical upstream regions are expressed in different neurons [18]. sentence This DT unlikely JJ however RB even RB transgenes NNS identical JJ upstream JJ neurons NNS In a third model, there would be a limiting quantity of transcription factors - the cell might contain a single transcriptional 'machine' that is capable of accommodating the promoter of only one olfactory receptor gene, similar to the expression site body used by African trypanosomes to ensure singular expression of only one set of variant surface glycoprotein genes [22]. sentence third JJ there EX limiting VBG quantity NN factors NNS - : might MD contain VB ' `` machine NN ' '' capable JJ accommodating VBG site NN body NN used VBN African JJ trypanosomes NNPS ensure VB set NN variant JJ surface NN glycoprotein NN 22 CD Finally, in a fourth model, transcriptional activity at one stochastically chosen olfactory receptor allele might send negative feedback to repress activity of all other olfactory receptors and/or positive feedback to enhance its own expression. sentence Finally RB fourth JJ activity NN at IN stochastically RB chosen VBN send VB negative JJ feedback NN repress VB all DT / HYPH positive JJ enhance VB its PRP$ own JJ In the latter three models, some or all olfactory receptor genes might share transcription factor binding motifs, and in the first model, olfactory receptor genes might share a common recombination signal. sentence latter JJ three CD models NNS share VB motifs NNS common JJ In order to perform computational and experimental searches for such signals, it is important to have a better idea of the transcriptional start site of a large number of olfactory receptor genes. 
sentence order NN perform VB such JJ signals NNS important JJ have VB better JJR idea NN start NN Our olfactory receptor EST collection provides 5' untranslated region (UTR) sequences for many genes and, therefore, a large dataset of candidate promoter regions. sentence provides VBZ UTR NN dataset NN " Olfactory receptor genes have an intronless coding region, simplifying both computational and experimental olfactory receptor identification." sentence intronless JJ simplifying VBG identification NN For a small number of olfactory receptors, gene structure has been determined. sentence For IN determined VBN Additional 5' untranslated exons lie upstream of the coding region and can be alternatively spliced [19,23-26]. sentence Additional JJ exons NNS lie VBP upstream RB alternatively RB spliced VBN 23 CD - SYM 26 CD The 3' untranslated region is typically intronless. sentence typically RB Exceptions to this stereotyped structure have been described for some human olfactory receptors, but are thought to be rare [25-27]. sentence Exceptions NNS this DT stereotyped JJ described VBN thought VBN rare JJ 25 CD 27 CD cDNA identification and RACE data have been used to determine gene structure for about 30 genes, see, for example, [19,23]. sentence RACE NN data NNS determine VB about RB 30 CD see VB However, computational prediction of the location of 5' upstream exons and the extent of the 3' UTR from genomic sequence has been extremely difficult. sentence prediction NN location NN extent NN from IN extremely RB difficult JJ A combination of splice site predictions and similarity to other olfactory receptors has allowed some investigators to predict 5' exon locations for around 15 genes [25,28]. sentence combination NN predictions NNS allowed VBN predict VB exon NN locations NNS around RB 28 CD Experimental validation shows that some, but not all, predictions are accurate [24,25]. 
sentence Experimental JJ validation NN shows VBZ not RB accurate JJ 24 CD The total number of olfactory receptors for which gene structure is known is vastly increased by our study. sentence total JJ which WDT known VBN vastly RB increased VBN study NN " In this report, we describe the isolation and analysis of over 1,200 cDNAs representing 419 odorant receptor genes." sentence report NN describe VBP isolation NN analysis NN 419 CD We screened a mouse olfactory epithelium library with degenerate olfactory receptor probes and obtained 5' end sequences (ESTs) from purified cDNAs. sentence degenerate JJ obtained VBN end NN ESTs NNS purified VBN These clones confirm olfactory expression of over 400 olfactory receptors, provide their gene structure, demonstrate that not all olfactory receptors are expressed at the same level and show that most olfactory receptor genes have multiple transcriptional isoforms. sentence These DT clones NNS confirm VBP same JJ " We have isolated 1,264 olfactory receptor cDNA clones, which together confirm the olfactory epithelial expression of 419 annotated olfactory receptor genes." sentence isolated VBN 1,264 CD together RB We used low-stringency hybridization with degenerate olfactory receptor DNA probes to screen around 4,100,000 plaque-forming units (pfu) of an adult mouse olfactory epithelium cDNA library and around 640,000 pfu of an embryonic olfactory epithelium library. sentence used VBD low JJ stringency NN hybridization NN screen VB 4,100,000 CD plaque NN forming VBG units NNS pfu NNS adult JJ 640,000 CD embryonic JJ We obtained sequences from 1,715 hybridization-positive cDNAs following secondary screens to isolate single clones. sentence obtained VBD 1,715 CD following VBG secondary JJ screens NNS isolate VB Of these clones, 1,264 yielded olfactory receptor-containing sequences. 
sentence Of IN yielded VBD containing VBG The 26% false-positive rate is a consequence of using low-stringency hybridization to obtain maximal sensitivity. sentence false JJ rate NN consequence NN using VBG maximal JJ sensitivity NN Continuing the screen would have resulted in cDNAs from additional olfactory receptors, but we reached a point of limiting returns: our final batch of 45 olfactory receptor-positive sequences represented 33 different olfactory receptors, of which only four had not been encountered previously in our screen. sentence Continuing VBG resulted VBN reached VBD point NN returns NNS : : final JJ batch NN 45 CD represented VBD 33 CD had VBD encountered VBN " Sequence analysis shows that the libraries are of high quality." sentence Sequence NN libraries NNS high JJ quality NN Firstly, directional cloning was successful: only eight out of 1,430 cDNA sequences with any protein homology matched that protein on the reverse strand. sentence Firstly RB directional JJ cloning NN successful JJ eight CD out IN 1,430 CD any DT protein NN homology NN matched VBD reverse JJ strand NN Secondly, genomic contamination is rare: when the 83 olfactory receptor-containing sequences that had a 5' UTR of 400 bp or longer were aligned to genomic sequence, 80 spliced across at least one intron, leaving a maximum of three clones (3.6%) that potentially represent genomic contamination of the libraries. sentence Secondly RB contamination NN 83 CD bp NNS longer JJR aligned VBN 80 CD across IN at RB intron NN leaving VBG maximum NN 3.6 CD potentially RB represent VBP Thirdly, most clones are of a reasonable length: although we did not determine whether clones are full-length, 881 of 1,264 (70%) olfactory receptor cDNAs contain the gene's start codon and at least some 5' UTR sequence. 
sentence Thirdly RB reasonable JJ length NN although IN did VBD whether IN full JJ 881 CD 70 CD contain VBP 's POS codon NN " In order to match cDNAs to their genomic counterparts, first we updated our catalog of mouse olfactory receptor genes [1] based on Celera's most recent genome assembly (Release 13) [29]." sentence match VB counterparts NNS first RB updated VBD catalog NN Celera NNP most RBS recent JJ assembly NN Release NN 29 CD Previous reports of the mouse olfactory receptor repertoire [1,2] were based on the Release 12 assembly. sentence reports NNS repertoire NN Release 13 consists of fewer, longer scaffold sequences containing fewer, smaller gaps than Release 12. sentence consists VBZ fewer JJR scaffold NN smaller JJR gaps NNS Using the BLAST-based methods detailed previously [1], we identified 1,490 olfactory receptor sequences in the new assembly, including 1,107 intact olfactory receptor genes (compared to 866 intact olfactory receptors in the old assembly) reflecting the reduced sequence error rate and increased coverage of the new assembly (Table 1). sentence BLAST NN methods NNS detailed VBN 1,490 CD new JJ including VBG 1,107 CD compared VBN 866 CD old JJ reflecting VBG reduced VBN error NN coverage NN Table NN We created a local database of genomic sequences including all olfactory receptor loci and 0.5 Mb flanking sequences (if available) and compared each cDNA sequence to this 'olfactory subgenome' database using sim4 [30]. sentence created VBD local JJ database NN loci NNS 0.5 CD Mb NN flanking NN if IN subgenome NN sim4 NN " cDNAs were assigned to individual genes based on their best match to an olfactory receptor coding region or its upstream region (see Materials and methods)." 
sentence assigned VBN best JJS match NN coding NN Materials NNS Of the 1,264 olfactory receptor cDNAs, 1,176 matched a total of 419 olfactory receptor genes; the remaining cDNAs either matched an olfactory receptor below our 96% nucleotide identity threshold or had ambiguous matches encompassing more than one olfactory receptor gene region (see below). sentence 1,176 CD total NN ; : remaining VBG either CC below IN 96 CD nucleotide NN identity NN threshold NN ambiguous JJ matches NNS encompassing VBG below RB class NN I CD primer NN broadens VBZ phylogenetic JJ distribution NN " Previous analyses of the mammalian olfactory receptor family define two major phylogenetic clades, referred to as class I and II olfactory receptors, and suggest that class I olfactory receptors are more similar to fish olfactory receptors than are class IIs [5]." sentence define VBZ major JJ clades NNS referred VBN II CD suggest VBP more RBR fish NN IIs NNS Figure 1 illustrates the phylogenetic diversity of our cDNA collection, showing that we have confirmed expression of at least one olfactory receptor gene in each major clade of the class II olfactory receptor genes, or 391 out of 983 (40%) of all intact class II olfactory receptor genes where full-length genomic sequence data are available (blue branches). sentence Figure NN illustrates VBZ showing VBG clade NN 391 CD 983 CD 40 CD where WRB blue JJ branches NNS The screen thus appears relatively unbiased in its coverage of class II olfactory receptors. sentence thus RB unbiased JJ However, our random screen provided cDNAs for only two out of 124 intact, full-length class I olfactory receptors. sentence random JJ provided VBD 124 CD In an attempt to broaden the phylogenetic coverage of our hybridization screen, we used additional degenerate probes on the adult library and screened an embryonic library (Table 2). sentence attempt NN broaden VB These experiments did not increase the diversity of clones identified (not shown). 
sentence experiments NNS " This severe class I underrepresentation could be due to experimental bias - a consequence of using degenerate primers to create our hybridization probe." sentence severe JJ underrepresentation NN could MD bias NN primers NNS create VB probe NN Alternatively, class I genes might be expressed at extremely low levels in the olfactory epithelium. sentence In order to determine whether class I olfactory receptors are expressed in the olfactory epithelium, we designed a reverse-strand degenerate primer to recognize a motif in transmembrane domain 7 (PP{V/M/A/T}{F/L/I/M}NP) enriched among class I olfactory receptor sequences. sentence designed VBD recognize VB PP NN { -LRB- V NN M NN A NN T NN } RBR F NN L NN I NN NP NN enriched VBN among IN Most of the motif is shared among all olfactory receptors, but the first proline residue (at the primer's 3' end) is found in 121 out of 124 (98%) intact class I genes compared to only 37 out of 983 (4%) intact class II genes. sentence shared VBN proline NN residue NN found VBN 121 CD 98 CD 37 CD When combined with another olfactory receptor degenerate primer, P26 [17], this primer preferentially amplifies class I olfactory receptors from mouse genomic DNA: of 33 sequenced, cloned PCR products, 17 represented seven different class I olfactory receptors, six represented three different class II olfactory receptors, and ten represented five different non-olfactory receptor contaminants. sentence When WRB combined VBN another DT P26 NN preferentially RB amplifies VBZ sequenced JJ cloned JJ products NNS six CD ten CD five CD contaminants NNS Degenerate PCR, cloning, and sequencing from reverse-transcribed olfactory epithelium RNA showed that at least seven class I olfactory receptors are expressed, as well as one additional class II gene (colored red in Figure 1). 
sentence Degenerate JJ sequencing NN transcribed VBN RNA NN showed VBD colored VBN red JJ However, no products could be obtained from the adult or the fetal olfactory epithelium cDNA libraries using the class I primer, suggesting that the libraries contain very low levels of class I olfactory receptors. sentence no DT fetal JJ suggesting VBG We also confirmed expression of nine additional olfactory receptors (three class I and six class II, colored red in Figure 1) from subclades that were poorly represented in our cDNA screen using gene-specific primer pairs to amplify cDNA library or reverse-transcribed RNA templates. sentence confirmed VBD nine CD subclades NNS poorly RB represented VBN specific JJ pairs NNS amplify VB templates NNS " For two of the class I genes we had shown to be expressed, we determined relative transcript levels using quantitative RT-PCR (see below)." sentence determined VBD relative JJ quantitative JJ RT NN Expression levels were similar to those observed for genes that were represented in our cDNA collection, suggesting that class I olfactory receptors are not under-represented in the olfactory epithelium, and that the dearth of class I cDNAs in our screen is likely to be due to bias in the libraries and/or hybridization probes. sentence those DT under RB dearth NN likely JJ higher JJR others NNS " Our cDNA screen suggests that some olfactory receptor genes are expressed at higher levels than others." sentence suggests VBZ If all olfactory receptor genes were expressed at equal levels, and our screen and library were unbiased in their coverage of the class II olfactory receptors, the number of cDNAs detected per class II olfactory receptor should follow a Poisson distribution, calculated based on the assumption that all 983 intact class II olfactory receptors have an equal chance of being represented in the screen, but that class I olfactory receptors and pseudogenes cannot be found (Figure 2). 
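The Poisson expectation described above can be checked numerically. The following is a minimal sketch, not the authors' actual calculation: it assumes the 1,176 assigned cDNAs are spread evenly over the 983 intact class II genes and treats genes as independent, which is an approximation. The function and variable names are illustrative. Under these assumptions the chance of seeing any gene with at least eight cDNAs comes out close to the "approximately one in 28" reported in the text.

```python
import math

def poisson_tail(lam, k_min, k_max=100):
    """P(X >= k_min) for X ~ Poisson(lam), summing terms k_min..k_max directly."""
    term = math.exp(-lam) * lam ** k_min / math.factorial(k_min)
    total = 0.0
    for k in range(k_min, k_max + 1):
        total += term
        term *= lam / (k + 1)  # step from P(X = k) to P(X = k + 1)
    return total

N_CDNAS = 1176  # cDNAs assigned to single olfactory receptor genes (from the text)
N_GENES = 983   # intact class II genes, assumed equally likely to be sampled

lam = N_CDNAS / N_GENES                  # expected cDNAs per gene
p_gene = poisson_tail(lam, 8)            # one given gene has >= 8 cDNAs
p_any = 1.0 - (1.0 - p_gene) ** N_GENES  # at least one gene has >= 8 cDNAs
                                         # (independence assumed, an approximation)

print(f"mean cDNAs per gene: {lam:.3f}")
print(f"P(any gene with >= 8 cDNAs): about 1 in {1.0 / p_any:.0f}")
```

Summing the upper tail directly, rather than subtracting the lower terms from 1, avoids losing precision when the tail probability is very small.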
sentence If IN equal JJ detected VBN should MD follow VB Poisson NNP calculated VBN assumption NN chance NN being VBG pseudogenes NNS We calculate a low probability (approximately one in 28) that we would observe any gene with at least eight matching cDNAs in the set of 1,176 cDNAs we assigned to single olfactory receptor sequences. sentence calculate VBP probability NN observe VB matching VBG assigned VBD However, for 17 olfactory receptors, we found ten or more matching cDNAs, suggesting that they might be expressed at higher levels than other olfactory receptor genes (Figure 2). sentence found VBD matching NN The two genes for which we found most cDNAs (AY318726/MOR28 and AY318727/MOR10) are genomically adjacent and in the well-studied olfactory receptor cluster next to the T-cell receptor α/δ locus [18,31]. sentence AY318726 NN MOR28 NN AY318727 NN MOR10 NN genomically RB adjacent JJ studied VBN cluster NN next JJ α NN δ NN 31 CD " Quantitative RT-PCR of six olfactory receptors confirms that expression levels do indeed vary considerably between genes." sentence Quantitative JJ vary VB considerably RB We used quantitative (real-time) PCR to measure olfactory epithelium transcript levels of six olfactory receptor genes and the ribosomal S16 gene in three mice of the same inbred strain (Figure 3). sentence measure VB ribosomal JJ S16 NN mice NNS inbred JJ strain NN These genes include two olfactory receptors with more than 20 matching cDNAs, two with one or two matching cDNAs and two class I olfactory receptors with no matching cDNAs. sentence include VBP matching JJ In these assays, we measure transcript level per genomic copy of the gene by comparing how well a gene-specific primer pair amplifies reverse-transcribed RNA, relative to a standard curve of amplification of mouse genomic DNA. 
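The standard-curve comparison described above can be sketched as follows. This is an illustration only: the dilution series and threshold-cycle (Ct) values are hypothetical numbers chosen to give a clean line, not measurements from the study, and the helper names are assumptions. The sketch fits Ct against log10 of genomic template copies, then reads a reverse-transcribed RNA sample off that line as genomic-copy equivalents.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical genomic DNA standard curve: template copies and observed Ct.
genomic_copies = [10, 100, 1000, 10000]
genomic_ct = [33.0, 29.7, 26.4, 23.1]

slope, intercept = fit_line([math.log10(c) for c in genomic_copies], genomic_ct)

def copies_from_ct(ct):
    """Invert the standard curve: Ct -> genomic-copy equivalents."""
    return 10 ** ((ct - intercept) / slope)

cdna_ct = 28.05  # hypothetical Ct for a reverse-transcribed RNA sample
equivalent_copies = copies_from_ct(cdna_ct)
print(f"slope = {slope:.2f}, genomic-copy equivalents = {equivalent_copies:.0f}")
```

Because the same primer pair amplifies both the genomic standard and the cDNA sample, differences in primer efficiency between genes cancel out of the comparison.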
sentence assays NNS measure VBP copy NN comparing VBG how WRB standard JJ curve NN amplification NN We find that expression levels can vary by almost 300-fold between genes (for example, genes A and D, Figure 3). sentence 300-fold JJ D NN However, cDNA numbers are not a good indicator of expression level, a discrepancy that is likely to be due to bias in the screen (we used degenerate primers to make the probes, which will favor some genes over others) and in the libraries (oligo-dT priming will favor genes with shorter 3' UTRs). sentence indicator NN discrepancy NN make VB will MD favor VB oligo NN dT NN priming NN shorter JJR UTRs NNS For example, we observe large expression differences in all three mice between two genes for which similar numbers of cDNAs were found (genes A and B, Figure 3), and conversely, similar expression levels between two genes with a ten-fold difference in number of cDNAs found (genes B and C, Figure 3). sentence observe VBP all PDT B NN conversely RB ten-fold JJ difference NN C NN Expression levels are mostly consistent between different mice: we find similar expression-level differences between olfactory receptor genes in all three mice examined (that is, the rank order of the six genes is similar among the three mice), although there is variation in expression level of some genes between mice (for example, gene E, Figure 3). sentence mostly RB consistent JJ examined VBN that RB is RB rank NN variation NN E NN " In situ hybridization (Figure 4) shows that increased numbers of expressing cells account for some, but not all, of the difference in transcript levels between two of the genes tested by real-time PCR (genes A and D in Figure 3)." sentence In FW situ FW account VBP tested VBN We hybridized alternate coronal serial sections spanning an entire olfactory epithelium of a young mouse (P6) with probes for gene A and gene D. 
sentence hybridized VBD coronal JJ serial JJ sections NNS spanning VBG entire JJ young JJ P6 NN Southern blot and BLAST analyses show that both probes are likely to hybridize to their intended target genes and no others (not shown). sentence Southern NNP blot NN both DT hybridize VB intended VBN target NN Gene A is expressed in zone 4 of the epithelium according to the nomenclature of Sullivan et al. [32] (Figure 4a). sentence Gene NN zone NN according VBG nomenclature NN Sullivan NNP et FW al. FW 32 CD 4a CD The expression pattern of gene D does not correspond to any of the four 'classical' olfactory epithelial zones [14,15,32]: positive cells are found in regions of endoturbinates II and III and ectoturbinate 3, resembling the expression pattern seen previously for the OR37 subfamily and ORZ6 olfactory receptors [33,34] (Figure 4b). sentence correspond VB classical JJ endoturbinates NNS III CD ectoturbinate NN resembling VBG seen VBN OR37 NN subfamily NN ORZ6 NN 34 CD 4b CD Counting the total number of positive cells in alternate sections across the entire epithelium, we find that gene A is expressed in 2,905 cells, about 12 times more cells than gene D, which is expressed in a total of 249 cells. sentence Counting VBG 2,905 CD times NNS 249 CD This 12-fold difference in numbers of expressing cells does not account for the almost 300-fold difference in RNA levels observed by real-time PCR, implying that the transcript level per expressing cell for gene A is about 25 times higher than transcript level in each expressing cell for gene D. sentence 12-fold JJ account VB implying VBG We note that hybridization intensities per positive neuron appear stronger for gene A than gene D after comparable exposure times, in accordance with the idea that transcript levels are higher per cell. 
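The reasoning above splits the overall RNA difference into a cell-number factor and a per-cell factor. A short worked sketch of that arithmetic, using the cell counts from the text and treating the "almost 300-fold" RNA difference as exactly 300 for illustration:

```python
cells_gene_a = 2905   # expressing cells counted for gene A (from the text)
cells_gene_d = 249    # expressing cells counted for gene D (from the text)
rna_fold = 300.0      # "almost 300-fold" RNA difference, rounded up (assumption)

cell_fold = cells_gene_a / cells_gene_d  # difference in number of expressing cells
per_cell_fold = rna_fold / cell_fold     # difference left over per expressing cell

print(f"cell-number fold difference: {cell_fold:.1f}")
print(f"implied per-cell fold difference: {per_cell_fold:.1f}")
```

The two factors multiply back to the overall difference, which is why a 12-fold difference in cell number alone cannot explain the RNA measurements.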
sentence note VBP intensities NNS appear VBP stronger JJR after IN comparable JJ exposure NN accordance NN Thus, we suggest that expression in more cells and in higher levels per cell together account for the almost 300-fold higher olfactory epithelial RNA levels of gene A relative to gene D (Figure 3). sentence Thus RB Most RBS several JJ " Our cDNA collection reveals that at least two thirds of the olfactory receptors sampled show alternative splicing of their 5' untranslated exons." sentence reveals VBZ sampled VBN Using a custom script to process sim4 alignments of cDNA and genomic sequences, we find two to eight different splice forms for 85 (45%) of the 191 genes for which we have had some opportunity to observe alternate splicing (that is, a minimum of two cDNAs, at least one of which is spliced), and 55 (67%) of the 82 genes for which we have four or more cDNAs (and thus a higher chance of observing any alternate splicing) (Figure 5). sentence custom NN script NN process VB alignments NNS forms NNS 85 CD 191 CD had VBN opportunity NN minimum NN 55 CD 67 CD 82 CD observing VBG These alternative splice events are almost all restricted to the 5' UTR and include exon skipping and alternate splice-donor and -acceptor use. sentence events NNS all RB skipping NN donor NN acceptor NN use NN " At least half of the olfactory receptors represented in our cDNA collection utilize more than one polyadenylation site, resulting in alternative 3' UTR isoforms." sentence half NN polyadenylation NN resulting VBG We have crudely estimated 3' UTR size for 1,169 cDNA clones by combining approximate insert size information with 5' sequence data. sentence crudely RB estimated VBN size NN 1,169 CD combining VBG approximate JJ insert NN information NN More than one 3' UTR isoform is predicted for 43 of the 77 (56%) genes for which there are at least four cDNAs with 3' UTR size information. 
sentence More JJR isoform NN predicted VBN 43 CD 77 CD 56 CD We confirmed the alternative polyadenylation isoforms of four out of five selected genes by sequencing the 3' end of 14 cDNA clones. sentence selected VBN sequencing VBG These 14 sequences also revealed one cDNA where the poly(A) tail was added 27 bp before the stop codon, and another where an intron was spliced out of the 3' UTR, contrary to the conventional stereotype of olfactory receptor gene structure. sentence revealed VBD poly NN tail NN added VBN bp NN before IN stop NN conventional JJ stereotype NN unusual JJ " We identified 62 cDNAs (5% of all olfactory receptor clones) from 38 intact olfactory receptors and one olfactory receptor pseudogene where a splice site within the protein-coding region is used." sentence 62 CD 38 CD pseudogene NN For two genes (top two cDNAs, Figure 6), the predicted protein appears to be an intact olfactory receptor with three or ten amino acids, including the initiating methionine, contributed by an upstream exon. sentence top JJ amino NN acids NNS initiating VBG methionine NN contributed VBN A similar gene structure was described previously for a human olfactory receptor [25]. sentence One of these two mouse genes has no start codon in its otherwise intact main coding exon. sentence One CD otherwise RB main JJ The unusual splicing thus rescues what would otherwise be a dysfunctional gene. sentence rescues VBZ what WP dysfunctional JJ In most cases (60 out of 62 cDNAs), the unusual transcript appears to be an aberrant splice form - the transcript would probably not encode a functional protein because the splice introduces a frameshift or removes conserved functional residues (Figure 6). sentence 60 CD aberrant JJ form NN probably RB encode VB functional JJ because IN introduces VBZ frameshift NN removes VBZ conserved VBN residues NNS For two clones (bottom two cDNAs, Figure 6), exon order in the cDNA clone is inconsistent with the corresponding genomic sequence. 
sentence bottom JJ clone NN inconsistent JJ corresponding JJ It is difficult to imagine what kind of cloning artefact resulted in these severely scrambled cDNAs: we suggest that they derive from real but rare transcripts. sentence It PRP imagine VB what WDT kind NN cloning VBG artefact NN resulted VBD severely RB scrambled VBN derive VBP However, their low frequency in our cDNA collection suggests that splicing contrary to genomic organization does not contribute significantly to the olfactory receptor transcript repertoire. sentence frequency NN organization NN contribute VB significantly RB For 21 of the 34 genes for which unusually spliced cDNAs were found, we also observe an alternative ('normal') isoform that does not use splice sites within the coding region. sentence unusually RB normal JJ use VB (For the remaining 13 of the 34 genes showing odd splicing, we have identified only one cDNA so have not determined whether normal isoforms are present.) sentence odd JJ so CC present JJ " We were intrigued both by previous reports of splicing of human olfactory receptors near the major histocompatibility complex (MHC) cluster, where several genes splice over long distances to a common upstream exon [26,27] and by the idea that olfactory receptor transcriptional control could be achieved by DNA recombination mechanisms, perhaps with the result that transcripts would contain some sequence from another locus." sentence intrigued VBN previous JJ near IN histocompatibility NN complex NN MHC NN splice VBP long JJ distances NNS control NN achieved VBN mechanisms NNS result NN We therefore verified that the entire sequence of each olfactory receptor EST matches the corresponding gene's genomic 'territory' (defined for this purpose as from 1 kb after the preceding gene to 1 kb after the gene's stop codon).
sentence verified VBD matches VBZ corresponding VBG territory NN purpose NN kb NN preceding VBG We found no cDNAs where introns encompassed other olfactory receptor genes, as reported for olfactory receptors in the human MHC region [26,27]. sentence introns NNS encompassed VBD reported VBN Six cDNAs do extend further than a single gene's 'territory' and appear not to be artifacts of the sequencing or analysis process. sentence Six CD extend VB further RBR artifacts NNS process NN In each of these cases, the clones use splice sites within the 3' UTR and thus extend further than the arbitrary 1 kb downstream of the stop codon. sentence use VBP extend VBP arbitrary JJ downstream JJ Five of these six cDNAs also use splice-donor sites within the coding region and encode disrupted olfactory receptors (Figure 6). sentence Five CD disrupted VBN In the sixth cDNA, a 2.6-kb intron is spliced out of the 3' UTR, leaving the coding region intact. sentence sixth JJ 2.6 CD " If olfactory receptor transcriptional control is achieved by DNA recombination, the beginning of each transcript might derive from a donated promoter region, with the rest of the transcript coming from the native ORF-containing locus." sentence beginning NN derive VB donated VBN rest NN coming VBG native JJ ORF NN In order to examine the recombination hypothesis, we analyzed 115 cDNA clones for which sim4 failed to align 20 bp or more to the corresponding genomic locus. sentence examine VB analyzed VBD 115 CD failed VBD align VB In most cases, the missing sequence was explained by gaps in the genomic sequence or by matches that fell below our percent identity-based cutoff for reporting matches. sentence missing JJ explained VBN fell VBD percent NN cutoff NN reporting VBG For three cDNAs (from three different olfactory receptors), we found that the missing piece of sequence matched elsewhere in the genome. 
sentence piece NN matched VBN elsewhere RB Comparison with the public mouse genome assembly confirmed the distant matches. sentence Comparison NN public JJ distant JJ With such a small number of cDNAs exhibiting a possible sign of DNA recombination (a sign that could also be interpreted as chimeric cDNA clones), we conclude that such rearrangement is unlikely to occur. sentence With IN such PDT exhibiting VBG possible JJ sign NN interpreted VBN chimeric JJ conclude VBP rearrangement NN occur VB However, the possibility remains that DNA recombination is responsible for olfactory receptor transcriptional regulation, with the donated region contributing only promoter sequences but no part of the transcript. sentence possibility NN remains VBZ responsible JJ regulation NN contributing VBG part NN Both CC unclustered JJ " We were interested in whether olfactory receptors need to be part of a cluster in the genome in order to be transcribed, or if the clustered genomic organization of olfactory receptors is simply a consequence of the fact that local duplication is the major mechanism for expanding the gene family [1]." sentence interested JJ need VBP clustered JJ simply RB fact NN duplication NN expanding VBG 'Singleton' olfactory receptors (defined as full-length olfactory receptors without another olfactory receptor within 0.5 Mb) are more often pseudogenes than are olfactory receptors in clusters (8 out of 16 versus 271 out of 1,358; χ² = 8.8, P < 0.005). sentence Singleton NN without IN Mb NNS often RB clusters NNS versus CC 271 CD 1,358 CD χ2 NN = SYM 8.8 CD P NN < SYM 0.005 CD Of the eight intact singleton olfactory receptors, two have matching cDNAs in our collection, a proportion similar to that found for olfactory receptors in clusters, showing that clustering is not an absolute requirement for olfactory receptor expression.
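The singleton-versus-cluster comparison quoted above can be checked with a small sketch. It assumes an uncorrected Pearson chi-square test on the 2 x 2 table of pseudogene and intact counts; the text does not say which variant of the test the authors ran, and the function name is illustrative only.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Counts from the text: 8 of 16 singletons and 271 of 1,358 clustered
# receptors are pseudogenes; the remainder in each group are intact.
chi2 = pearson_chi2_2x2(8, 16 - 8, 271, 1358 - 271)
print(f"chi-square = {chi2:.2f}")
```

Under this assumption the statistic comes out close to the reported value of 8.8.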
sentence singleton NN proportion NN clustering VBG absolute JJ requirement NN However, it is possible these two expressed singleton genes are part of 'extended' olfactory receptor clusters - their nearest olfactory receptor neighbors are 1.7 Mb and 2.6 Mb away, respectively. sentence extended VBN nearest JJS neighbors NNS 1.7 CD away RB respectively RB " We also find that some olfactory receptor pseudogenes are expressed, albeit with a lower probability than intact olfactory receptors." sentence albeit IN lower JJR Considering the 1,392 olfactory receptor gene sequences for which reliable full-length data are available, 15 out of 285 (5%) apparent pseudogenes are represented in our cDNA collection, compared to 393 out of 1,107 (36%) intact olfactory receptors. sentence Considering VBG 1,392 CD reliable JJ 285 CD apparent JJ 393 CD 36 CD However, three of these 15 'expressed pseudogenes' are intact genes in the public mouse genome sequence. sentence expressed JJ The defects in Celera's version of these genes may be due to sequencing errors or true polymorphism. sentence defects NNS version NN errors NNS true JJ polymorphism NN Publicly available mouse sequence confirms that 11 of the 12 remaining expressed pseudogenes are indeed pseudogenes. sentence Publicly RB No public sequence matches the 12th 'expressed pseudogene' with 99% identity or more. sentence No DT 12th JJ 99 CD sequenced VBN We have thus validated the similarity-based prediction of over one-third of the intact olfactory receptor genes annotated in the mouse genome [1,2], thereby vastly increasing the proportion of the family for which experimental evidence of olfactory function is available. sentence validated VBN thereby RB increasing VBG We have not found cDNAs for all olfactory receptor genes or an even phylogenetic distribution of cDNAs, probably because the libraries and/or our screen are biased toward certain olfactory receptor subfamilies. 
sentence biased JJ toward IN certain JJ subfamilies NNS Using RT-PCR with both degenerate and specific primers, we have confirmed olfactory expression of a number of additional olfactory receptors, bringing the total number of olfactory receptor genes verified in this study to 436, and ensuring that almost all phylogenetic clades have at least one representative with evidence of olfactory function. sentence bringing VBG verified VBN 436 CD representative NN " Results of our cDNA library screen suggested that some olfactory receptors are expressed at significantly higher levels than others." sentence suggested VBD We used quantitative PCR to show that expression levels are indeed highly variable, with one olfactory receptor expressed at almost 300 times the level of another. sentence show VB highly RB variable JJ Higher expression levels could be due to increased transcript number per cell and/or a greater number of olfactory neurons 'choosing' those genes. sentence Higher JJR greater JJR choosing VBG For one pair of genes we tested, expression level differences appear to be due to both factors. sentence tested VBD It would be interesting to collect data for additional genes to determine how the numbers of expressing cells and transcript levels per cell vary across the olfactory receptor family. sentence interesting JJ collect VB vary VBP Data from a number of previous studies also show that different olfactory receptor genes, or even copies of the same olfactory receptor transgene in different genomic locations are expressed in different numbers of cells [14,18,35], but do not address the issue of transcript level per cell. sentence Data NNS copies NNS transgene NN 35 CD address VB issue NN The fact that some genes are chosen more frequently, and when chosen may be expressed at higher levels per cell, is intriguing given each olfactory neuron's single-allele expression regime. 
sentence frequently RB The observation of unequal expression leads to a number of questions. sentence observation NN leads VBZ questions NNS It is known that each olfactory receptor is expressed in one of four zones of the olfactory epithelium [14,15]; do some zones choose from a smaller olfactory receptor sub-repertoire and thus express each olfactory receptor in a larger number of cells? sentence do VB choose VBP sub-repertoire NN express VB larger JJR ? . We note that several apparently highly expressed olfactory receptors (gene A, this study, and MOR10 and MOR28 [36]) are expressed in zone 4 of the olfactory epithelium. sentence Does activity-dependent neuronal competition [37] contribute to increased representation of the olfactory receptors that respond to common environmental odorants? sentence Does VBZ dependent JJ neuronal JJ competition NN representation NN respond VBP environmental JJ Do the favored olfactory receptors have stronger promoter sequences? sentence Do VBP favored JJ Are some olfactory receptor mRNAs more stable than others, leading to higher transcript levels per expressing cell? sentence Are VBP mRNAs NNS stable JJ leading VBG Are the favored olfactory receptors in more open chromatin conformation or more accessible genomic locations? sentence open JJ chromatin NN conformation NN accessible JJ Transcription of apparent 'singleton' olfactory receptor genes (0.5 Mb or more from the nearest other olfactory receptor gene) suggests that there is no absolute requirement for genomic clustering for an olfactory receptor to be transcribed, consistent with observations that small olfactory receptor transgenes can be expressed correctly when integrated outside native olfactory receptor clusters [35]. 
sentence Transcription NN clustering NN observations NNS correctly RB integrated VBN outside IN However, the high pseudogene count among singleton olfactory receptor genes (50%, versus 20% for clustered olfactory receptor genes) suggests that not all genomic locations are favorable for olfactory receptor gene survival, perhaps due to transcriptional constraints. sentence count NN 50 CD clustered VBN favorable JJ survival NN constraints NNS It is also possible that evolutionary factors may be responsible for reduced pseudogene content of clustered olfactory receptors - gene conversion between neighboring olfactory receptors could rescue inactivating mutations in clustered genes, but not singletons. sentence evolutionary JJ reduced JJ content NN conversion NN neighboring VBG rescue VB inactivating VBG mutations NNS singletons NNS Before these questions about olfactory receptor gene choice can be answered, it will be important to measure expression levels of a larger number of genes, perhaps using an olfactory receptor gene microarray. sentence Before IN choice NN answered VBN microarray NN " Our study provides at least partial data about the upstream transcript structures of over 300 olfactory receptor genes." sentence partial JJ structures NNS These data provide tentative locations of a large set of promoter regions, allowing computational searches for shared sequence motifs that might be involved in the intriguing transcriptional regulation of olfactory receptors. sentence tentative JJ allowing VBG involved VBN However, given that not all cDNAs are full-length clones, some of these candidates will not be true promoter regions. sentence candidates NNS The 5' UTR sequences we obtained will also aid in the design of experimental probes, for example, for in situ hybridizations or to immobilize on an olfactory receptor microarray. 
sentence aid VB design NN in FW hybridizations NNS immobilize VB One of the challenges of such an array will be to design unique probes with which to represent each gene. sentence challenges NNS array NN design VB represent VB Often, the coding region of olfactory receptors is highly similar between recently duplicated genes. sentence Often RB recently RB duplicated VBN Many pairs of similar olfactory receptors show more sequence divergence in the UTRs than the protein-coding region (J.Y., unpublished observations). sentence Many JJ divergence NN J.Y. NNP unpublished JJ The UTRs would therefore make a better choice of sequence from which to design unique oligonucleotides to distinguish closely related olfactory receptor genes. sentence oligonucleotides NNS distinguish VB closely RB related JJ Locations of these regions in genomic sequence are difficult to predict - our study provides 5' UTR sequences of 343 genes and the approximate 3' UTR length for 399 olfactory receptor genes. sentence Locations NNS 343 CD 399 CD Probe design must also account for the multiple transcriptional isoforms observed for many olfactory receptors - depending on the question being asked, probes could be designed in shared sequence to determine the total level of all isoforms, or in unique exons to measure the level of each isoform separately. sentence Probe NN must MD depending VBG question NN asked VBD designed VBN separately RB " We find that the majority of the olfactory receptors, like most non-olfactory receptor genes [38,39], are transcribed as multiple isoforms, involving alternative splicing of 5' untranslated exons and alternate polyadenylation-site usage." sentence majority NN like IN 39 CD involving VBG usage NN The act of splicing itself may be important for efficient mRNA export from the nucleus [40] or to couple olfactory receptor coding regions with genomically distant promoters. 
sentence act NN splicing VBG itself PRP efficient JJ mRNA NN export NN nucleus NN couple VB promoters NNS The exact nature of the spliced transcript might be unimportant, such that several isoforms might be produced simply because multiple functional splice sites are available. sentence exact JJ nature NN unimportant JJ produced VBN Alternatively, the multiplicity of transcriptional isoforms might have functional significance, as UTRs may contain signals controlling mRNA stability, localization or degradation [41,42]. sentence multiplicity NN significance NN controlling VBG stability NN localization NN degradation NN 41 CD 42 CD " Our study shows that about 5% of olfactory receptor transcripts do not fit the current notion of olfactory receptor gene structure." sentence fit VB current JJ notion NN Occasionally, an intron is spliced out of the 3' untranslated region. sentence Occasionally RB A number of cDNAs use splice sites within the olfactory receptor's ORF, meaning that their protein product is different from that predicted on the basis of genomic sequence alone. sentence meaning VBG product NN basis NN alone RB In two such cases, the transcript would encode a functional olfactory receptor, with the initiating methionine and first few amino acids encoded by an upstream exon, as has been observed previously for a subtelomeric human olfactory receptor gene [25]. sentence subtelomeric JJ Such within-ORF splicing might increase protein-coding diversity, although, given the small number of genes involved, splicing is unlikely to significantly affect the functional receptor repertoire. sentence Such JJ affect VB Most of the atypical splice forms we observe appear to encode non-functional transcripts, containing frameshifts or lacking a start codon or other functional residues conserved throughout the olfactory receptor family. 
sentence non-functional JJ frameshifts NNS lacking VBG throughout IN These nonfunctional transcripts are probably aberrant by-products of the splicing system [43] that have not yet been degraded by RNA surveillance systems [40,41]. sentence by NN system NN yet RB degraded VBN surveillance NN systems NNS The neurons expressing these aberrant transcripts might also make normal transcripts for the same genes and thus produce a functional olfactory receptor. sentence produce VB Alternatively, the unusual transcriptional regulation of olfactory receptors might ensure that only one splice isoform is expressed per cell (unlikely, but possible if an RNA-based feedback mechanism operates), thus condemning cells expressing these aberrant isoforms to be dysfunctional. sentence operates VBZ condemning VBG " We also observe transcripts from a small number of olfactory receptor pseudogenes, as previously described for three human olfactory receptor pseudogenes [26,44]." sentence 44 CD Although many fewer pseudogenes than intact genes were represented in our cDNA collection, some neurons in the olfactory epithelium evidently express disrupted olfactory receptors and thus might be unable to respond to odorants or to correctly innervate the olfactory bulb. sentence Although IN evidently RB express VBP disrupted JJ unable JJ innervate VB bulb NN Wang, Axel and coworkers have shown that an artificial transgenic olfactory receptor gene containing two nonsense mutations can support development of an olfactory neuron, but that pseudogene-expressing neurons fail to converge on a glomerulus in the olfactory bulb [45]. 
sentence Wang NNP Axel NNP coworkers NNS artificial JJ transgenic JJ nonsense JJ support VB development NN fail VBP converge VB glomerulus NN By analogy with an olfactory receptor deletion mutant [45], it is likely that most pseudogene-expressing neurons die or switch to express a different olfactory receptor gene, leaving a small number of pseudogene-expressing neurons in adult mice, but at greatly reduced levels compared to neurons expressing intact olfactory receptors. sentence By IN analogy NN deletion NN mutant NN die VBP switch VBP greatly RB resource NN We have thus established over 400 annotated olfactory receptor genes as having olfactory function. sentence established VBN annotated JJ having VBG The sequences we generated demonstrate that the majority of the olfactory receptor gene family has multiple transcriptional isoforms. sentence generated VBD Most olfactory receptor transcripts encode functional receptor proteins, with rare exceptions. sentence exceptions NNS We show that individual olfactory receptor genes can have vastly different expression levels, an intriguing finding in light of the unusual one-neuron one-gene transcriptional regime of the olfactory epithelium. sentence light NN Our results and the sequences we provide will facilitate future global studies of the mechanisms and dynamics of olfactory receptor gene expression. sentence results NNS facilitate VB future JJ global JJ dynamics NNS Identification NN " An adult mouse cDNA library made from the olfactory epithelium of a single animal was provided by Leslie Vosshall (Rockefeller University, New York, NY, USA), and an embryonic library (made from the olfactory epithelia of several E16.5-E18.5 embryos) was provided by Tyler Cutforth (Columbia University, New York, NY, USA)." 
sentence An DT made VBN animal NN Leslie NNP Vosshall NNP Rockefeller NNP University NNP New NNP York NNP NY NNP USA NNP epithelia NNS E16.5 NN E18.5 NN embryos NNS Tyler NNP Cutforth NNP Columbia NNP Both libraries were oligo-dT primed and directionally cloned into the lambdaZAP-XR vector (Stratagene, La Jolla, CA, USA). sentence Both DT primed VBN directionally RB cloned VBN lambdaZAP NN XR NN vector NN Stratagene NNP La NNP Jolla NNP CA NNP The adult library has a complexity of 6.5 × 10^6 primary clones, and the embryonic library has a complexity of 1.65 × 10^6. sentence complexity NN 6.5 CD × SYM 106 CD primary JJ 1.65 CD Libraries were amplified to give titers of 5 × 10^9 pfu/ml (adult) or 2 × 10^10 pfu/ml (embryonic). sentence Libraries NNS amplified VBN give VB titers NNS 109 CD / SYM ml NN 1010 CD Hybridization probes were made by degenerate PCR of mouse genomic DNA, in a fashion similar to those described previously [1], with primer pairs and annealing temperatures given in Table 2. sentence Hybridization NN fashion NN annealing VBG temperatures NNS Low-stringency hybridization conditions were as described [1]. sentence Low JJ conditions NNS Clonally-pure plaques were obtained through secondary screens using the same probe as the corresponding primary screen. sentence Clonally RB pure JJ plaques NNS through IN PCR with vector primers (M13F/R) was performed to prepare sequencing templates. sentence M13F NN R NN performed VBN prepare VB cDNA size estimates were obtained by agarose gel electrophoresis, and inserts were sequenced from the 5' end using the M13R primer and big-dye terminator chemistry according to ABI's protocols (Applied Biosystems, Foster City, CA, USA).
sentence estimates NNS agarose NN gel NN electrophoresis NN inserts NNS M13R NN big JJ dye NN terminator NN chemistry NN ABI NNP protocols NNS Applied NNP Biosystems NNP Foster NNP City NNP In order to obtain 3' sequence, selected phage clones were converted to plasmid stocks following a scaled-down version of Stratagene's in vivo excision protocol. sentence phage NN converted VBN plasmid NN stocks NNS scaled VBN down RP vivo FW excision NN protocol NN Plasmid DNA gave better 3'-end sequence than PCR products, which often suffered from polymerase stuttering through the poly(A) tract. sentence Plasmid NN gave VBD suffered VBD polymerase NN stuttering VBG tract NN " cDNA sequences and associated information are available through dbEST (Genbank accessions CB172832-CB174569) and our olfactory receptor database [46]." sentence associated VBN dbEST NN Genbank NNP accessions NNS CB172832 NN CB174569 NN 46 CD The updated olfactory receptor gene catalog is available through Genbank (accessions AY317244-AY318733). sentence updated JJ AY317244 NN AY318733 NN Throughout the manuscript, genes are referred to by their Genbank accession numbers. sentence Throughout IN manuscript NN accession NN " cDNA sequences were base-called and quality-trimmed using phred (trim_cutoff = 0.05) [47], and vector sequences were removed using cross_match [48]." sentence base NN called VBN trimmed VBN phred NN trim_cutoff NN 0.05 CD 47 CD removed VBN cross_match NN 48 CD Any sequences of less than 50 bp after trimming were discarded. sentence Any DT less JJR trimming NN discarded VBN 3' UTR lengths were estimated by combining approximate insert sizes determined by PCR with 5' sequence data where possible (if the 5' sequence did not extend into the coding region we could not estimate 3' UTR size). 
sentence lengths NNS sizes VBZ estimate VB We counted cDNAs from a given gene as showing alternative polyadenylation site usage if 3' UTR length estimates varied by at least 400 bp - smaller variation could be real, but may not be distinguishable from error in our size estimates. sentence counted VBD varied VBD distinguishable JJ " To assign cDNAs to their corresponding olfactory receptor genes, we first defined a genomic 'territory' for each gene, with the following attributes: strand, start position (100 kb upstream of the start codon or 1 kb after the previous gene upstream on the same strand, whichever is closer) and end position (1 kb downstream of the stop codon)." sentence To TO assign VB defined VBD attributes NNS position NN 100 CD kb NNS whichever WDT closer JJR downstream RB Trimmed sequences were compared with genomic sequences using sim4 [30] (settings P = 1 to remove polyA tails and N = 1 to perform an intensive search for small exons). sentence Trimmed JJ settings NNS remove VB polyA NN tails NNS N NN intensive JJ search NN The sim4 algorithm uses splice-site consensus sequences to refine alignments. sentence algorithm NN uses VBZ consensus NN refine VB Only matches of 96% or greater nucleotide identity were considered. sentence Only RB considered VBN RepeatMasked sequences [49] were also compared to genomic sequences; cDNA:genomic sequence pairings not found in both masked and unmasked alignments were rejected. sentence RepeatMasked JJ 49 CD : : pairings NNS masked JJ unmasked JJ rejected VBN Coordinates from the unmasked alignment were used for further analysis. sentence Coordinates NNS alignment NN further JJ Any cDNA sequence matching entirely within a territory was assigned to that gene. sentence entirely RB If a cDNA matched more than one gene territory, the best match was chosen (that is, the one with highest 'score', where score is the total of all exons' lengths multiplied by their respective percent identities). 
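The best-match rule described above (score = sum over exons of length multiplied by percent identity) can be sketched as follows. The gene names and exon values are hypothetical, and representing each aligned exon as a (length, percent identity) pair is an assumption made for illustration; it is not the authors' script or sim4's output format.

```python
def alignment_score(exons):
    """Sum of exon length (bp) x percent identity over all aligned exons."""
    return sum(length * identity for length, identity in exons)


def best_territory(candidates):
    """Return the gene whose alignment has the highest score.

    candidates maps a gene name to a list of (length_bp, percent_identity)
    tuples, one per aligned exon.
    """
    return max(candidates, key=lambda gene: alignment_score(candidates[gene]))


# Hypothetical cDNA that aligns within two gene territories.
candidates = {
    "geneA": [(300, 99.0), (120, 98.0)],
    "geneB": [(300, 97.0), (100, 96.5)],
}
print(best_territory(candidates))
```

Because the score weights identity by exon length, a long, well-matched coding exon dominates short upstream exons when two territories compete.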
sentence highest JJS score NN ' POS multiplied VBN respective JJ identities NNS We found 27 cDNAs that spanned a larger genomic range than one gene territory and flagged them for more careful analysis. sentence spanned VBD range NN flagged VBD them PRP careful JJ Of these, six cDNAs showed unusual splicing within the 3' UTR, but the remaining 'territory violators' were found to be artifacts of the analysis process which fell into three types. sentence violators NNS These included: cDNAs where the insert appeared to be cloned in the reverse orientation (six cDNAs); sequences from recently duplicated gene pairs, where sim4 assigned coding region and upstream exons to different members of the pair, although exons could equally well have been aligned closer to one another (six cDNAs); and artifacts due to use of sim4's N = 1 parameter (nine cDNAs). sentence included VBD appeared VBD orientation NN equally RB closer RBR parameter NN This parameter instructs the program to make extra effort to match small upstream exons, allowing a greater total length of EST sequence to be matched. sentence instructs VBZ program NN extra JJ effort NN However, occasionally the N = 1 parameter caused the program to assign very small sequences (1-4 bp) to distant upstream exons, when they probably match nearer to the corresponding coding sequence. sentence caused VBD match VBP nearer RBR " The expected distribution shown in Figure 2 was calculated using the equation $P(x) = e^{-\mu}\mu^{x}/x!$, where P(x) is the Poisson probability of observing x cDNAs per gene, and μ is the mean number of cDNAs observed per gene (μ = 1,176/983: 1,176 cDNAs matching olfactory receptor genes in our dataset and 983 intact class II olfactory receptors)." sentence expected JJ equation NN x NN e NN μμx NN ! SYM μ NN mean JJ 176 CD In our analysis of expressed pseudogenes, we ignored two olfactory receptor pseudogenes found very near the ends of genomic sequences and thus likely to be error-prone.
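The Poisson expectation described above can be sketched in a few lines. The mean uses the counts quoted in the text (1,176 cDNAs over 983 intact class II receptors); the constant and function names are illustrative, and scaling the probability by the number of genes to get an expected gene count is an assumption about how the expected distribution was drawn.

```python
import math

N_CDNAS = 1176          # cDNAs matching olfactory receptor genes (from the text)
N_GENES = 983           # intact class II olfactory receptors (from the text)
MU = N_CDNAS / N_GENES  # mean number of cDNAs per gene


def poisson_pmf(x, mu=MU):
    """Poisson probability of observing x cDNAs for one gene."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


# Expected number of genes with 0..10 matching cDNAs if all genes
# were sampled at equal levels.
expected_genes = {x: N_GENES * poisson_pmf(x) for x in range(11)}
for x, n in expected_genes.items():
    print(f"{x:2d} cDNAs: {n:7.2f} genes expected")
```

Comparing these expected counts with the observed counts per gene is what exposes the excess of heavily sampled genes.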
sentence ignored VBD ends NNS prone JJ A protein sequence alignment of intact mouse olfactory receptors was generated using CLUSTALW [50], edited by hand, and used to produce the phylogenetic tree shown in Figure 1 using PAUP's neighbor-joining algorithm (v4.0b6 Version 4, Sinauer Associates, Sunderland, MA). sentence generated VBN CLUSTALW NN edited VBN hand NN tree NN PAUP NN neighbor NN joining VBG v4.0b6 NN Version NN Sinauer NNP Associates NNPS Sunderland NNP MA NNP The tree was colored using a custom script. sentence Information content (the measure of sequence conservation shown in Figure 6) was calculated for each position in the alignment using alpro [51]. sentence Information NN measure NN conservation NN alpro NN 51 CD " To determine the number of transcriptional isoforms for each gene, we examined the sim4 output for every matching cDNA in decreasing order of number of exons." sentence examined VBD output NN every DT decreasing JJ The first cDNA was counted as one splice form, and for each subsequent cDNA, we determined whether exon structure was mutually exclusive to isoforms already counted. sentence counted VBN subsequent JJ mutually RB exclusive JJ already RB We were conservative in our definition of mutually exclusive, and thus our count represents the minimum number of isoforms represented in the cDNA collection. sentence conservative JJ definition NN represents VBZ minimum JJ " The olfactory epithelia were dissected from three adult female C57BL/6 mice, including tissues attached to the skull and septum." sentence dissected VBN female JJ C57BL NN attached VBN skull NN septum NN RNA was isolated using the Qiagen RNeasy midi kit (Qiagen, Valencia, CA, USA), including a DNase treatment step. 
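The isoform-counting procedure described above can be sketched as a greedy pass over cDNAs sorted by exon number. The text does not define "mutually exclusive"; the sketch assumes two cDNAs are mutually exclusive when an intron of one removes sequence that is exonic in the other. Exons are hypothetical half-open (start, end) genomic intervals, and all function names are illustrative rather than taken from the authors' script.

```python
def introns(exons):
    """Gaps between consecutive exons of one cDNA (exons sorted by position)."""
    return [(a_end, b_start)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


def overlaps(a, b):
    """True if two half-open intervals share at least one base."""
    return max(a[0], b[0]) < min(a[1], b[1])


def mutually_exclusive(x, y):
    """True if an intron of one cDNA overlaps an exon of the other."""
    return (any(overlaps(i, e) for i in introns(x) for e in y)
            or any(overlaps(i, e) for i in introns(y) for e in x))


def count_isoforms(cdnas):
    """Minimum number of splice forms needed to explain a gene's cDNAs."""
    counted = []
    for exons in sorted(cdnas, key=len, reverse=True):
        # Count a new form only if it conflicts with every form counted so far.
        if all(mutually_exclusive(exons, c) for c in counted):
            counted.append(exons)
    return len(counted)


full = [(0, 100), (500, 600), (1000, 2000)]   # three-exon transcript
skipped = [(0, 100), (1000, 2000)]            # middle exon skipped
partial = [(550, 600), (1000, 1500)]          # 5'-truncated, compatible with `full`
print(count_isoforms([full, skipped, partial]))
```

Requiring a conflict with every previously counted form keeps the estimate conservative, so truncated clones that are compatible with a longer clone are never counted as separate isoforms.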
sentence Qiagen NNP RNeasy NNP midi NN kit NN Valencia NNP DNase NN treatment NN First-strand cDNA was produced from 2.5 μg of RNA in a volume of 50 μl using random hexamers and Invitrogen's Superscript II reverse transcriptase (Invitrogen, Carlsbad, CA, USA), according to the manufacturer's recommendations. sentence First JJ 2.5 CD μg NNS volume NN μl NNS hexamers NNS Invitrogen NNP Superscript NN transcriptase NN Carlsbad NNP manufacturer NN recommendations NNS One-twenty-fifth of the resulting cDNA was used as template in subsequent PCR reactions. sentence twenty CD fifth NN template NN reactions NNS PCR amplification biased towards class I olfactory receptors was performed using degenerate primers P26 [17] and classI_R1 (5'-GGRTTIADIRYIGGNGG-3') with an annealing temperature of 44°C. sentence towards IN classI_R1 NN GGRTTIADIRYIGGNGG NN annealing JJ temperature NN °C NNS The product was cloned (TA cloning kit, Invitrogen), and individual clones were sequenced. sentence TA NN Specific PCR primers used to confirm expression of individual olfactory receptor genes are given in Additional data file 1. sentence Specific JJ file NN Each PCR product was sequenced to confirm that the expected gene and no others had been amplified. sentence expected VBN Control reactions on a template made by omitting reverse transcriptase gave no product, indicating that the RNA preparation was uncontaminated by genomic DNA. sentence Control NN omitting VBG indicating VBG preparation NN uncontaminated JJ " Relative transcript levels were estimated using real-time PCR according to Applied Biosystems' protocols, with magnesium concentration, primer pair and fluorescent probe given in Additional data file 2." sentence Relative JJ magnesium NN concentration NN fluorescent JJ The increase in fluorescence during thermocycling was measured on an ABI PRISM 7900HT. 
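As a rough illustration of what the degenerate primer classI_R1 represents, the sketch below counts how many distinct sequences its IUPAC ambiguity codes encode. Treating "I" as inosine (a single permissive residue that adds no degeneracy) is an assumption; the text gives only the primer sequence.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        if base == "I":  # assumed inosine: one residue, pairs permissively
            continue
        total *= len(IUPAC[base])
    return total


CLASS_I_R1 = "GGRTTIADIRYIGGNGG"  # primer sequence as given in the text
print(degeneracy(CLASS_I_R1))
```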
sentence increase NN fluorescence NN during IN thermocycling NN measured VBN ABI NN PRISM NN 7900HT NN Standard curves were constructed for each primer pair using triplicate samples of mouse genomic DNA of nine known concentrations (range 0.01-100 ng, or about 3-30,000 copies of the haploid genome). sentence Standard JJ curves NNS constructed VBN triplicate NN samples NNS concentrations NNS 0.01 CD ng NNS 30,000 CD haploid NN Relative expression level of each gene was determined by comparing the mean Ct (cycle where fluorescence reaches an arbitrary threshold value) obtained with six replicate samples of reverse-transcribed RNA to the standard curve for the corresponding primers. sentence Ct NN cycle NN reaches VBZ value NN replicate NN transcribed JJ Relative RNA levels of a housekeeping gene, ribosomal S16, were measured as previously described [52]. sentence housekeeping NN 52 CD Control reactions on template prepared by omitting reverse transcriptase amplified at a relative level of 0.03 ± 0.01 ng or less in each case. sentence prepared VBN 0.03 CD ± SYM case NN Expression measurements of the seven genes were normalized for each mouse so that S16 levels were equal to 1 (arbitrary units). sentence measurements NNS normalized VBN so IN " Coronal sections were cut from the olfactory epithelia of an adult mouse (Figure 4) and a young (P6) C57BL/6 mouse." sentence Coronal JJ cut VBN RNA in situ hybridization was carried out as described previously [15,53] with digoxigenin-labeled antisense riboprobes specific for the 3' UTRs of genes AY318555 (0.5 kb) and AY317365 (0.5 kb). sentence carried VBN 53 CD digoxigenin NN labeled VBN antisense JJ riboprobes NNS AY318555 NN AY317365 NN Riboprobe sequences were generated by PCR using primer pairs 5'-TCTTCCAAACCTGGACCCCCC-3' and 5'-ATCTCTCCAGCACCTTACTTG-3' for AY318555 and primer pairs 5'-TAAGATGTAAGTGATAATTTAGATTACAGG-3' and 5'-TTTCTGCCTCAGCTATGACAG-3' for AY317365. 
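The standard-curve quantification described above can be sketched as a least-squares fit of Ct against log10 of template amount, followed by interpolation of each sample's mean Ct and normalization to S16. All numbers below are hypothetical illustrations, not measured data, and the function names are not from the authors' analysis.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve: template amount (ng genomic DNA equivalents)."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standard curve: genomic DNA amounts (ng) and their mean Ct values.
standards_ng = [0.01, 0.1, 1.0, 10.0, 100.0]
standards_ct = [30.0 - 3.32 * math.log10(q) for q in standards_ng]
slope, intercept = fit_line([math.log10(q) for q in standards_ng], standards_ct)

# Hypothetical mean Ct values for one receptor gene and for S16 in one mouse.
gene_ng = quantity_from_ct(28.0, slope, intercept)
s16_ng = quantity_from_ct(22.0, slope, intercept)
relative_level = gene_ng / s16_ng  # S16 set to 1 (arbitrary units)
print(f"relative level = {relative_level:.4f}")
```

In practice each primer pair would get its own standard curve, as the text describes; the sketch reuses one curve for both genes only to stay short.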
sentence Riboprobe NN TCTTCCAAACCTGGACCCCCC NN ATCTCTCCAGCACCTTACTTG NN TAAGATGTAAGTGATAATTTAGATTACAGG NN TTTCTGCCTCAGCTATGACAG NN Hybridization was carried out in 50% formamide at 65°C, and slides were washed at high stringency (65°C, 0.2 × SSC). sentence formamide NN 65 CD slides NNS washed VBN 0.2 CD SSC NN The probes each hybridize to only one band on a Southern blot, indicating that each probe only detects one olfactory receptor gene. sentence hybridize VBP band NN detects VBZ BLAST analyses show that the AY318555 probe is unique in Celera's mouse genome assembly (Release 13), and that the AY317365 probe is similar to only one other genomic region. sentence This potential cross-hybridizing region is over 10 Mb from the nearest olfactory receptor coding region and is thus highly unlikely to be included in any olfactory receptor transcript. sentence potential JJ cross-hybridizing JJ included VBN Low-power images are composed of three overlapping micrographs (40×) assembled in Adobe Photoshop 7.0. sentence power NN images NNS composed VBN overlapping VBG micrographs NNS assembled VBN Adobe NNP Photoshop NN 7.0 CD list NN The experimental conditions used for real-time PCR can be found in Additional data file 2. sentence