Cloning of mouse mr-s

To identify novel mouse genes preferentially expressed in the developing retina, we screened the National Center for Biotechnology Information (NCBI) UniGene database using Digital Differential Display (DDD) and found EST fragments that are frequently present in mouse retinal cDNA libraries. One clone among these cDNAs encodes a protein containing a SAM domain related to that of the polyhomeotic protein. A PCR fragment corresponding to this mouse clone was used to screen a mouse P0-P3 retinal cDNA library to obtain a full-length cDNA clone. Sequence analysis showed that this cDNA represents a novel gene encoding a SAM domain-containing protein. We refer to this protein as mr-s (major retinal SAM domain protein).

As shown in Fig. 1A, a translation initiation codon is present in the same open reading frame as the SAM domain. This initiation site shows similarity to the consensus sequence proposed by Kozak [32], including the presence of the highly conserved purine at position -3. The stop codon of the predicted mr-s protein is also indicated in Fig. 1A.

The amino acid sequence of the SAM domain of mr-s protein (Fig. 1A, boxed sequence) exhibits homology with the SAM domains of EphB2, EphA4, MPH1, TEL and Smaug (Fig. 1B). By phylogenetic analysis, the SAM domain of mr-s is most closely related to that of Mph1/Rae28, a mouse homolog of polyhomeotic (ph) (Fig. 1C). Mouse mr-s protein is conserved in rat, human, chick and zebrafish, whose mr-s proteins display 91%, 70%, 36% and 26% identity with mouse mr-s, respectively (Fig. 1D). The SAM domains of rat, human, chick and zebrafish mr-s proteins are highly conserved and display 96%, 90%, 76% and 72% identity, respectively, with the SAM domain of mouse mr-s protein.

The chromosomal localizations of the mouse and human mr-s genes were determined by searching the mouse and human genome databases (NCBI), respectively. Mouse mr-s maps to chromosome 4E2, and human MR-S maps to chromosome 1p36.33.
Leber congenital amaurosis (LCA) is the most common cause of inherited childhood blindness. Human MR-S maps in the vicinity of LCA9, which was recently identified as a new locus for LCA [33].

Figure 1. mr-s nucleotide and amino acid sequences. (A) mr-s nucleotide and amino acid sequences. Boxed amino acids are the SAM domain sequence, and the dashed box indicates a putative nuclear localization signal. The underline indicates a putative polyadenylation signal. (B) Alignment of SAM domain sequences for SAM domain-containing proteins. The five alpha helices are marked H1-H5. Conserved amino acid residues are shown with a dark shadow and functionally similar residues are shown with a light shadow. The sites that were targeted for mutagenesis are indicated by arrows. (C) Phylogenetic tree of SAM domain-containing proteins. Amino acid sequences were analyzed by the neighbor-joining method in MacVector 7.2. Branch lengths reflect the mean number of substitutions per site. (D) Schematic comparison of the amino acid sequences for mouse, rat, human, chick and zebrafish mr-s proteins. The percent similarity of the SAM domains and other regions to the corresponding regions of the mouse protein is shown. Overall sequence similarity with the mouse protein is shown on the right.
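
For readers who wish to reproduce identity values of the kind reported for the mr-s orthologs, the following is a minimal sketch of one common way to compute percent identity from a pairwise alignment. It is not the procedure used in this study, which does not state the software or the denominator used for these values. The function name `percent_identity`, the gap character, and the choice to count only columns without gaps are assumptions made for illustration, and the example sequences are toy strings rather than mr-s sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: only columns where neither sequence has a gap are counted.
    Other conventions (e.g. dividing by the full alignment length) give
    different values for the same alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where the two residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned peptides (hypothetical, not taken from mr-s).
    print(round(percent_identity("MKT-LLV", "MKSALLV"), 1))
```

Because the result depends on the alignment and on how gapped columns are treated, values computed this way may differ slightly from those shown in Fig. 1D.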