> top > docs > PubMed@Masakazu SAGA:23580252

PubMed@Masakazu SAGA:23580252 JSONTXT

Shotgun Proteomic Analysis on the Diapause and Non-Diapause Eggs of Domesticated Silkworm Bombyx mori To clarify the molecular mechanisms of silkworm diapause, it is necessary to investigate the molecular basis at protein level. Here, the spectra of peptides digested from silkworm diapause and non-diapause eggs were obtained from liquid chromatography tandem mass spectrometry (LC-MS/MS) and were analyzed by bioinformatics methods. A total of 501 and 562 proteins were identified from the diapause and non-diapause eggs respectively, of which 309 proteins were shared commonly. Among these common-expressed proteins, three main storage proteins (vitellogenin precursor, egg-specific protein and low molecular lipoprotein 30 K precursor), nine heat shock proteins (HSP19.9, 20.1, 20.4, 20.8, 21.4, 23.7, 70, 90-kDa heat shock protein and heat shock cognate protein), 37 metabolic enzymes, 22 ribosomal proteins were identified. There were 192 and 253 unique proteins identified in the diapause and non-diapause eggs respectively, of which 24 and 48 had functional annotations, these unique proteins indicated that the metabolism, translation of the mRNA and synthesis of proteins were potentially more highly represented in the non-dipause eggs than that in the diapause eggs. The relative mRNA levels of four identified proteins in the two kinds of eggs were also compared using quantitative reverse transcription PCR (qRT-PCR) and showed some inconsistencies with protein expression. GO signatures of 486 out of the 502 and 545 out of the 562 proteins identified in the diapause and non-diapause eggs respectively were available. In addition, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed the Metabolism, Translation and Transcription pathway were potentially more active in the non-dipause eggs at this stage. Introduction Diapause is a special physiological state of arrested development by many insects to avoid unfavorable environments such as low temperature, drought or food shortage and to synchronize their life cycles to these changes. Diapause is widespread among insects and can occur at any stage of the life cycle, i.e., adult, pupa, larva or egg, and each species enters diapause at fixed stages. Diapause is endogenously controlled, and this dormancy typically begins well before conditions become too harsh to support normal growth and development. In the silkworm Bombyx mori, the development of diapause-destined embryos is arrested during the G2 cell cycle stage immediately after formation of the cephalic lobe and telson and sequential segmentation of the mesoderm. In Bombyx mori, embryonic diapause is determined by a diapause hormone that is secreted by the suboesophageal ganglion (SG) of the mother moth during the pupal period to act on her developing ovaries and is responsible for induction of embryonic diapause of the silkworm, Bombyx mori . The termination of diapause requires a low temperature of 5°C for 2–3 months, various stimuli can also artificially prevent or terminate diapause, such as HCl, physical stress, corona discharge, or higher oxygen pressure. Once diapause terminates, the embryos are competent to resume development at 25°C and cells enter the M phase; Cell division in the embryos then resumes. However, the molecular mechanisms involved in generating, maintaining, and breaking diapause have not yet been fully elucidated. Currently, only a few studies have elaborated the molecular mechanisms that regulated diapause. Most of these studies have mentioned the molecular regulation of diapause in the larvae, pupae and adults stage. However, the molecular regulation of embryonic diapause is still unclear, perhaps because of the difficulty of extracting the limited amounts of RNA from a developmental stage that has relatively little tissue and large amounts of protein and lipid. The gene expression profiles linked with diapause have been researched in many insects, such as northern malt fly species, flesh fly, Chinese oak silkworm, Allonemobius socius, Drosophila melanogaster, Colorado potato beetle. Nevertheless, gene expression profile alone is not sufficient to explain gene functions, and mRNA levels incompletely correlate with the protein levels because of the alternative splicing and dynamics of gene translation. Therefore, with the gradual completion of genome sequencing of insect model organisms, proteomics has become the focal point in entomological research including the diapause mechanism. For a long time, two-dimensional gel electrophoresis (2-DE) combined with mass spectrometry (MS) has been frequently used in insect proteomic research. Another relevant approach in proteomics for large-scale characterization of proteome profiles is shotgun proteomics, which is based on in-gel or gel free digestion of protein mixtures followed by liquid chromatography tandem mass spectrometry (LC-MS/MS) separation, MS detection and database searching, and provides an extremely sensitive and high-throughput approach to determine the proteome components in a complex biological sample. This approach has been widely used in many organisms, such as Bombyx mori , Homo sapiens and Orientia tsutsugamushi . Following the rapid development of proteomics and bioinformatics approaches, the credibility of results is greatly promoted. In this study, we utilized the shotgun LC–MS/MS approach combined with bioinformatics analysis to illuminate the differences among protein identification profiles of the diapause and non-diapause eggs of the silkworm and to find valuable clues regarding the molecular regulation of diapause at the early stage of embryogenesis in the silkworm. Materials and Methods Silkworm Rearing and Sample Collection Bivoltine silkworm strain 932 was protected at 20°C with half on a short-day photoperiod (8-h-light:16-h-dark for the generation of non-diapause eggs) and half on a long-day photoperiod (16-h-light:8-h-dark for the generation of diapause eggs) separately during the hatching period of silkworm eggs. Larvae were reared on fresh mulberry leaves under an environment of 24 to 25°C and 85% relative humidity. Pupae were kept at 25°C. After female moths emerged, they copulated with males (usually within 5 h) and then laid the eggs at 25°C. Diapause and non-diapause eggs which laid by 30 female moths seperately within 1 h were collected to obtain synchronously developing egg batches. All the diapause and non-diapause eggs were collected 24 h after oviposition and divided into 0.1 g each sample (corresponding to about 200 eggs, three copies). All samples were stored at −80°C until use. Sample Preparation and SDS-PAGE Separation Samples of diapause eggs (D) and non-diapause eggs (ND) were mechanically homogenized on ice for 10 min in 10 µL lyses buffer (comprising 8 M urea, 2 M thiourea, 2% CHAPS, 20 mM Tris–HCl, 30 mM DTT) per 1 mg tissue. The samples were then sonicated in an ice-bath for five circles and each circle contained a 30 s sonication with a 30 s interval. The samples were then centrifuged at 12,000×g at 4°C for 15 min. The supernatants were then collected and were centrifuged again and the resultant supernatants were stored at −20°C for further study. The concentrations of protein samples were determined by the Bradford methods.The samples were boiled for 2 min and centrifuged at 12,000×g for 10 min before they were subjected to SDS-PAGE, using a 5% stacking gel and a 12.5% resolving gel. For each sample, a total of 300 µg protein was separated using SDS-PAGE on three lanes with 100 µg each lane. The gels were stained with Coomassie Brilliant Blue (CBB) R250 (Sigma, USA) after electrophoresis. In-gel Digestion Each gel lane was manually cut into 8 bands according to the deepness of Coomassie staining (Fig. 1). The gel bands were sliced into 1×1 mm pieces and subjected to in-gel tryptic digestion which was essentially carried out as described by Shevchenko et al.. Briefly, the gel pieces were rinsed thrice using Milli-Q water (Millipore) and destained twice with 25 mM NH4HCO3 in 50% acetonitrile (ACN, Amersham) at 37°C until the color depigmented completely. The dried gels were incubated with 50 mM Tris[2-carboxyethyl]phosphine (TCEP, Sigma) in 25 mM NH4HCO3 at 56°C for 1 h to reductively cleave the disulfide bonds of proteins and then the resulting sulfhydryl functional groups were alkylated by 100 mM iodoacetamide (IAA, Amersham) in 25 mM NH4HCO3 at room temperature in the dark for 0.5 h. Gel pieces were washed twice with 25 mM ammonium bicarbonate in 50% acetonitrile solution, dehydrated twice with 100% acetonitrile, and dried in a vacuum centrifuge. Subsequently, the proteins were digested with 20 ng/µL modified trypsin (Sigma) for 20 h at 37°C.The resulting peptides were extracted twice from the gel pieces with 5% trifluoroacetic acid (TFA, Fluka, Milwaukee, WI, USA) in 50% ACN solution. The pooled extracts were evaporated in a vacuum centrifuge (Labconco, Kansas, MO), and resuspended with 0.1% methanoic acid (Sigma). After trypsin digested, 8 slices were combined to 4 groups for each sample (slice 1 and 2, slice 3 and 4, slice 5 and 6, slice 7 and 8) prior to the LC separation and MS detection. SDS-PAGE patterns of the diapause (D) and non-diapause (ND) eggs protein samples respectively. The samples were separated by 12.5% resolving gel in triplicate. The three batches for both ND and D were pooled into two sets of eight prior to in-gel digest. Then eight slices were combined to four groups for each sample (comprising slices 1 and 2, slices 3 and 4, slices 5 and 6, slices 7 and 8 for both ND and D) prior to the LC separation and MS detection. Shotgun LC-MS/MS Analysis All digested peptide mixtures were separated by reverse phase (RP) HPLC followed by tandem MS analysis. RP-HPLC was performed on a surveyor LC system (Thermo Finnigan, San Jose, CA). Samples were loaded into a trap column (Zorbax 300SB-C18 peptide traps, 300 µm × 65 mm, Agilent Technologies, Wilmington, DE) first at a 3 µL/min flow rate for peptides enrichment and desalting. After flow-splitting down to about 1.5 µL/min, peptides were transferred to the analytical column (RP-C18, 150 µm × 150 mm, Column Technology, Inc., Fremont, CA) for separation with a 195 min linear gradient from 95% buffer A (0.1% methanoic acid in water) to 50% buffer B (84% ACN, 0.1% methanoic acid in water) at a flow rate of 250 nL/min. The analytical column was regenerated for 20 min with buffer A before loading the next sample. A Finnigan LTQ linear ion trap mass spectrometer equipped with a nanospray source was used for the MS/MS experiment in the positive ion mode. The temperature of the ion transfer capillary was set at 170°C. The spray voltage was 3.0 kV and normalized collision energy was set at 35.0% for MS/MS. The MS analysis was performed with one full MS scan (m/z 400–1800 with a resolution R = 60,000 at m/z 400) followed by 10 MS/MS scans on the 10 most intense ions from the MS spectrum with the dynamic exclusion settings: repeat count 2, repeat duration 30 s, exclusion duration 90 s. Data were acquired in data dependent mode using Xcalibur software (version 2.0.7, Thermo Fisher Scientific). Database Search Database search was carried out against the in-house database, with the protein sequences downloaded from NCBInr protein database (http://www.ncbi.nlm.nih.gov/) including the domesticated silkworm (B.mori,2224 entries), the wild silkworm (B. mandarina,13 entries) and the fruit fly (D. melanogaster, 27777 entries), along with predicted B. mori protein sequence data (14,623 entries). _The four raw datasets of ND and D samples were searched against the in-house database on a local server using Turbo SEQUEST software (Bioworks version 3.2, Thermo Finnigan). The mass tolerances of precursor ion and fragmentation ion were set to 1.5 Da and 1.0 Da, respectively. The trypsin-cleavage was at both ends of the protein and two missing cleavage sites were allowed. Only b and y fragment ions were taken into account. Static (carbamidomethyl) modification on cysteine, and variable modifications (oxidation) on methionine were set for all searches. The results for each dataset were stored as.out format files. All the.out files were filtered by Buildsummary software. The protein identification criteria that we used were based on Delta CN (≥0.1) and Xcorr (one charge ≥1.9, two charges ≥2.2, three charges ≥3.75). Quantitative RT-PCR Total RNA was extracted from diapause and non-diapause eggs samples using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions and then reverse-transcribed into cDNA with random primer and M-MLV reverse transcriptase (Promega, Madison, WI, USA). The qRT-PCR was performed in 25 µL reactions with 100 ng reverse transcription product, 200 nM each of the forward and reverse primer, and the SYBR® Premix Ex Taq™ (Takara, Tokyo, Japan). The cDNA was amplified in a Rotor-Gene 3000 real-time PCR system (Corbett Research, Sydney, Australia) according to the following program: initial denaturation at 95°C for 10 min, and 40 cycles for amplification at 95°C for 30 s, 56°C for 30 s and 72°C for 30 s followed by an additional steps for melting curve with the increase of temperature from 72 to 95°C at 0.5°C/6 s and 30°C for 30 s. The expression levels were calculated using Rotor-Gene software (version 6.0.19) on the basis of the difference of Ct value by normalizing with the reference gene (BmactinA3, accession No.X04507). The diapause-unique gene BGIBMGA001595 (Aliphatic nitrilase, BmnitA)and the non-diapause-unique genes BGIBMGA002594 (adenylate kinase 2, BmADK2), BGIBMGA011412 (Isocitrate dehydrogenase, BmIDH), BGIBMGA002521 (gamma-glutamyltransferase, BmGGT) were chosen for the qRT-PCR. The qRT-PCR was performed using the following primers: BmnitA forward, TCGGGAAACATCGCAAGAAC and reverse, GCCGTATCTGGTCGCAAATAC, BmADK2 forward, ATCACGCTCAAACGGTTCCTT and reverse, AGACGTCATCAGCGGCTTTC, BmIDH forward, CGTTGCGACCAGACATAAGGA and reverse, TTCACGGATTCGTGTTCCAA, BmGGT forward, GACAGCCTCAAACCCAATCAG and reverse, GCGGCCATAAAGCCATCTC. Bioinformatics Analysis Protein sequences were searched against InterPro member databases using the InterProScan software to identify protein signatures. The compiled RAW outputs were subjected to gene ontology (GO) category analysis using the Web Gene Ontology Annotation Plot (WEGO). The two groups of datasets were simultaneously subjected to online analysis (http://wego.genomics.org.cn/cgi-bin/wego/index.pl) to conveniently compare them in one graph. The P-values were calculated using Pearson Chi-Square test where available. The proteins identified were classified into cellular component, molecular function and biological process. The pathways related to the identified proteins were predicted by searching against KEGG reference pathway database (http://www.genome.jp/kegg/tool/search_pathway.html) with the available protein sequences. Results and Discussion Proteome Profiles of the D Compared with ND Samples The shotgun proteomics strategy, based on proteolytic digestion of complex protein mixtures, peptides LC separation and tandem MS sequencing, has been widely utilized. However, database searching remains the bottleneck for many shotgun proteomics experiments. So in our research, the 4 raw data of each sample sets from LC-MS/MS were subjected to in-house database search using SEQUEST. A total of 501 and 562 proteins were identified from the D and ND respectively (Fig. 2), with 8091 peptides including 1560 unique peptides and 8125 peptides including 1826 unique peptides respectively. The number of common-expressed proteins among the two samples was 309, which was 61.68% and 54.98% of the proteins in the D and ND respectively. Moreover, there were 135 common-expressed proteins with functional annotations (Table S1), the others were predicted proteins from the silkworm genome database. There were 192 and 253 unique proteins identified in the D and ND respectively, of which 24 and 48 had functional annotations (Table 1 and 2). Venn diagram shows the numbers of identified proteins from diapause and non-diapause eggs of the silkworms. Each number with no overlap of circles shows the number of proteins uniquely observed in that sample, while overlapping circle shows the numbers of identified proteins common to two of the analyzes. Table 1 the unique expressed proteins of D with functional annotation. Gene ID/GI number Theor.pI/Mw(kDa) No. of peptides No. of unique peptides Cover percent (%) Protein description Unique peptide sequences BGIBMGA013139 gi|153791691| 7.36/58913.37 1 1 3.28 AMP-activated proteinkinase K.IADFGLSNM*MMDGEFLR.T BGIBMGA012151 gi|114053021| 6.62/42674.16 1 1 4.52 ARP1 actin-related protein1-like protein A R.AQYVLPDGTQLEIGPAR.F BGIBMGA007039 gi|112983070| 6.06/12248.78 2 1 11.43 BCP inhibitor K.FADYTPEEQQSR.L BGIBMGA006850 gi|112983188| 6/55968.01 2 2 2.22 calcineurin A K.NAEHFSHNSVR.G gi|160358387| 8.82/59918.81 1 1 2.13 cytochrome P450 CYP6AE9 R.EVMEDYVLLDK.I gi|160358391| 8.76/59701.51 1 1 2.52 cytochrome P450 CYP6AE7 K.TMGYLFFESTTRR.G BGIBMGA010671 gi|114053117| 5.38/25082.82 2 2 11.47 eukaryotic translation initiation factor 3 subunit k K.NNLANLLGGIDDVTLK.H R.ISGFHDSIR.K BGIBMGA000619 gi|169234934| 4.77/23880.86 2 2 11.57 FK506-binding protein K.LTIPASLGYGER.G K.LVEEIFQHEDKDK.N BGIBMGA004040 gi|112983048| 8.51/14345.72 2 1 8.87 hypothetical protein LOC692620 K.LPDLWEELAIK.E BGIBMGA005301 gi|148298789| 4.25/28434.35 2 2 13.97 hypothetical protein LOC778506 K.IDDSVKPTEVAAATEEK.K K.KDDIAPEDSDIAKPETVPEVK.T gi|114051870| 6.76/28608.9 1 1 8.33 lutathione transferase o1 -.MTYFHSVNAGVIPPPALTDK.L BGIBMGA014404 gi|112983100| 4.96/45937.83 1 1 2.38 masquerade-like serine proteinase homolog R.AGEWDTQNTK.E BGIBMGA00145 gi|112982906| 6.53/44709.45 1 1 3.71 MAP kinse-ERK kinase R.IPESILGTITSAVLK.G BGIBMGA00052 gi|112983290| 5.8/87764.03 1 1 1.68 neutral endopeptidase 24.11 R.VIDLDQASLGLSR.E gi|182509212| 8.8/51474.1 1 1 2.24 olfactory receptor-like receptor R.LLLNLLLANR.R BGIBMGA008164 gi|112983896| 6.33/50013.91 2 2 6.42 putative paralytic peptide-binding protein K.VGDQQLFLIENR.E R.IPVVSVDDSDTSSYIR.D BGIBMGA004657 gi|114052160| 6.44/27143 1 1 4.47 proteasome subunit alpha type 6-A R.GTDAAVVAAQR.K BGIBMGA000867 gi|112982669| 10.46/28069.61 1 1 4.21 ribosomal protein S2 K.TYAYLTPDLWR.D BGIBMGA010178 gi|112982956| 11.37/35506.38 1 1 3.59 splicing factor arginine/serine-rich 6 R.VYVGGLPFGVR.E gi|169234928| 5.06/369030.66 1 1 0.40 titin1 K.PEEANTLRALVDK.I BGIBMGA000624 gi|148298772| 5.24/380464.73 1 1 0.36 titin2 K.PEGPVIM*REISR.E BGIBMGA011615 gi|112983266| 8.9/176184.13 2 1 1.03 topoisomerase II K.FKM*EKLEDDIVLLMSR.R BGIBMGA001584 gi|160333857| 4.81/29389.15 1 1 5.16 tropomyosin isoform 3 K.LSEASQAADESER.I gi|112984130| 6.96/33312.27 2 1 4.93 yellow6 R.QNLIPYNFIDDVIR.T Table 2 the unique expressed proteins of ND with functional annotation. Gene ID/GI number Theor.pI/Mw(kDa) No. of peptides No. of unique peptides Cover percent (%) Protein description Unique peptide sequences gi|114051770| 5.99/43014.71 4 3 11.17 26S proteasome non-ATPase regulatory subunit 13 K.FAVIDVSDFLTK.K R.ALAGAGAVLTAAR.V R.GDNLIQLYNNFLTTFESK.I BGIBMGA001490 gi|112983564| 4.75/44677.53 3 3 8.21 45 kDa immunophilin FKBP45 K.ALSGGVQIEDLK.L K.TGDSIAFLTNGK.C K.VVMVYYEGR.L BGIBMGA012298gi|114053253| 7.12/52872.82 2 1 3.31 6-phosphogluconate dehydrogenase K.NPQLTNLLLDPFFSER.I BGIBMGA008749 gi|114052224| 8.37/19958.11 1 1 6.32 adenylate cyclase R.FYADSAATLVR.Y BGIBMGA011922 gi|114052488| 6.47/40033.17 1 1 3.99 alcohol dehydrogenase R.IIGVDINPDKFEVAK.K BGIBMGA007656 gi|148298845| 11.42/20174.53 1 1 7.95 arginine/serine-rich splicing factor 7 R.NPPGFAFVEFEDPR.D BGIBMGA008670 gi|114052262| 8.31/27604.56 3 2 13.58 ATP synthase K.KENVLLQLEAAYR.E K.LAAWLDKEVEATENEWNEGR.N BGIBMGA000622 gi|168823429| 5.93/544514.58 1 1 0.23 Bm kettin K.TDIIDESVTIK.P BGIBMGA011218 gi|114052306| 4.77/84822.51 4 3 5.95 Carboxylesterase K.TETPEQAPSVK.E K.VLETSNNLFGLVIEK.E R.TANIPILVGTTSLEYACER.K BGIBMGA006916 gi|134254470| 8.54/61666 1 1 5.36 cytochrome P450 18a1 K.LPPGPWGPPVVGYLPFLGVRHKTFLQLAR.N BGIBMGA003391 gi|114052917| 9.94/12748.96 1 1 8.40 DnaJ domain-containing protein K.FDSETLANSK.Y BGIBMGA014398 gi|114052402| 5.49/49150.93 1 1 2.06 eukaryotic translation termination factor 1 K.SQEGSQFVR.G BGIBMGA007889 gi|114051800| 5.71/36908.18 2 1 8.51 eukaryotic translation initiation factor 3 subunit 2 beta K.IHSVKEHTHQITDM*QLSRDGTMFVTASK.D BGIBMGA002526 gi|112983824| 5.34/74917.21 2 1 1.54 ecdysteroid-inducible angiotensin-converting enzyme-related gene product K.NSIIEKPTDR.E BGIBMGA008780 gi|112982960| 6.75/26033.48 3 2 6.11 ferritin K.ALASLYLK.R R.IFFIHR.E BGIBMGA011438 gi|112983348| 8.72/22373.68 1 1 6.03 glutathione peroxidase R.HGPNTDPLDLVK.S BGIBMGA005781 gi|169646838| 5.15/25392.59 1 1 8.07 heat shock protein 25.4 K.IQNLPWDVNSEGSWVYEK.D BGIBMGA008736 gi|112982982| 5.12/44820.95 3 3 7.80 hemolin K.EAPAEVLFR.E K.SDFGVASTR.A R.TLATQGEDVTIPCK.A BGIBMGA004112 gi|114051598| 5.34/69314.08 1 1 2.81 leukotriene A4 hydrolase R.YLEDLIGGPEVFDDFLR.S gi|114052571| 6.43/23644.23 2 1 11.82 lysophospholipase R.HTASLIFLHGLGDTGHGWASTIAGIR.G BGIBMGA006419 gi|153792270| 6.12/67258.39 3 2 5.39 malate dehydrogenase K.YCTFNDDIQGTAAVAVAGLLASLR.L R.FAQDHAPVR.T BGIBMGA004838 gi|114052779| 10.08/48824.97 1 1 3.29 mitochondrial ribosomal protein S5 K.LWKSVTSVSNAGAK.K BGIBMGA002937 gi|182509194| 6.51/124059.24 1 1 1.82 nitric-oxide synthase like protein K.QFVSCTVKANKDLGDASAER.S gi|112984438| 5.55/203642.64 7 6 4.16 ovarian serine protease K.GPYEQIYK.V K.ISLLHLK.S K.KPLFAVSNTEGFLHR.N K.LTIEQIEQEANHICHYLGFSSAR.T K.VYVEGNYR.C R.TLYTDFDETPLFLR.E BGIBMGA008602 gi|112984330| 6.26/262354.41 1 1 0.64 p260 R.QPYCGLYGLVKKLSK.Q BGIBMGA001292 gi|114052687| 5.44/26024.4 1 1 5.75 progesterone receptor membrane component 2 -.MDSPPEVENVQAK.H BGIBMGA003028 gi|114052034| 8.29/27871.9 1 1 5.62 proteasome beta-subunit K.LTVEDPVTLEYITR.Y BGIBMGA013898 gi|114050993| 4.98/26874.64 2 1 5.76 proteasome zeta subunit K.AIGSGSEGAQQSLK.E BGIBMGA005994 gi|114052713| 5.92/37494.59 1 1 3.94 proteasome 26S non-ATPase subunit 7 R.DIKDTTVGSLSQR.I BGIBMGA007169 gi|112983398| 6.33/21828.97 1 1 5.73 ras oncoprotein K.LVVVGAGGVGK.S BGIBMGA001800 gi|112982800| 11.66/48110.93 1 1 2.57 ribosomal protein L4 K.IPELPLVVADK.V BGIBMGA002811 gi|160333861| 9.98/25492.78 1 1 13.24 ribosomal protein L10 R.ATVDDFPLCVHLVSDEYEQLSSEALEAGR.I BGIBMGA010867 gi|112984078| 10.27/29623.84 6 4 11.41 ribosomal protein S4 K.LGGVYAPR.P K.LRECLPLVIFLR.N K.YALTGNEVLK.I R.ECLPLVIFLR.N gi|112982661| 10.97/29058.07 1 1 6.72 ribosomal protein S6 R.KGAQEIPGLTDGNVPRR.L gi|114052054| 5.9/19293.48 1 1 7.69 rotamase Pin1 R.TKEEALDILQEYR.R BGIBMGA004340 gi|114052803| 8.6/11739.45 1 1 15.32 salivary cysteine-rich peptide K.SCSEPAAYGGGTGSSSK.H BGIBMGA007079 gi|114052783| 7.56/51220.85 1 1 4.09 serine hydroxymethyltransferase K.LLNSNLWEADPELFDIIVK.E BGIBMGA010212 gi|114051043| 5.28/51672.02 1 1 3.06 serine protease inhibitor serpin R.EVRIKFSTIIDSLK.K BGIBMGA006043 gi|114052122| 7.74/33484.08 1 1 4.12 stathmin R.LTLEQQTAEVYK.A BGIBMGA011499 gi|112984508| 9.03/50600.85 2 1 2.78 translation initiation factor 2 gamma subunit R.TTQSNLHQQDLSK.L BGIBMGA010283 gi|148298831| 7/88104.06 1 1 1.02 tuftelin interacting protein 11 K.M*FPEDILK.E BGIBMGA005794 gi|151301088| 7.17/22241.48 1 1 5.10 ubiquinone biosynthesis protein COQ7-like protein R.TLM*QDPNVDK.E gi|114052396| 6.83/20898.87 3 2 10.38 ubiquitin-conjugating enzyme E2M K.DLNELNLPK.T K.EAADELQSNR.R BGIBMGA005628 gi|114050831| 7.52/52953.59 1 1 4.86 uridine 5'-monophosphate synthase K.KDDTCLIIEDVITSGSSILETVK.D BGIBMGA013190 gi|114053067| 5.95/20271.21 2 1 5.59 vacuolar protein sorting 29 K.TLASDVHVVR.G BGIBMGA013964 gi|114050729| 8.46/44007.51 1 1 4.15 vacuolar ATPase subunit C R.YGLPVNFQAVVM*VPAR.K BGIBMGA010247 gi|114052088| 8.98/26119.27 1 1 5.75 vacuolar ATP synthase subunit E R.LELIAQQLLPEIR.N BGIBMGA004036 gi|112983588| 5.83/65983.72 1 1 1.83 Vlg protein K.VAVAYGGTAVR.H Theoretical Two-dimensional Distribution of the Identified Proteins The theoretical isoelectric point (pI) and molecular weight (Mw) of the identified proteins were calculated using the Compute pI/Mw tool (http://cn.expasy.org/tools/pi_tool.html) according to the predicted amino acid sequences. It showed that 87.30% of the total proteins were distributed in the range of pI 4–7 and 8–10 (Fig. 3a). About 60.87% of the proteins were distributing in the range of 15–60 kDa (Fig. 3b). In the two kinds of eggs, less than 7.62% of proteins showed pI 7–8. Furthermore, 25 and 29 proteins with higher pI (more than 10), usually difficult to be separated by 2-DE, were also identified from D and ND respectively. It also revealed that the protein distributions were nearly identical in the D and ND samples. Theoretical pI and Mw distribution of the identified proteins. (a) Distribution of pI. (b) Distribution of Mw. The theoretical isoelectric point (pI) and molecular weight (Mw) of the proteins were calculated using the Compute pI/Mw tool (http://cn.expasy.org/tools/pi_tool.html) according to the predicted amino acid sequences. Profiling of Common-expressed Proteins in the D and ND Samples On the list of common-expressed proteins of D and ND with annotations, many functional proteins were detected (Table S1). Among these common-expressed proteins we identified, vitellogenin precursor (gi:112983746), egg-specific protein (gi:187281695) and low molecular lipoprotein 30 K precursor (gi:156119320, gi:162461355 and gi:112984502) which comprise the three main storage proteins in the silkworm eggs. The relative concentration of a protein identified by MS in a biological sample is directly related to the number of identified peptides, neglecting possible effects such as the enzymatic digestion constraint, the detection mass range of the mass spectrometer and differential post-translational modifications. Therefore the number of identified unique peptides assembled into a protein may reflect the protein’s relative abundance. So according to the amount of identified unique peptides, vitellogenin precursor, egg-specific protein, low molecular lipoprotein 30 K precursor, and some predicted proteins such as BGIBMGA013342-PA, BGIBMGA004585-PA and BGIBMGA004399-PA are highly represented in the samples. In these predicted proteins, the BGIBMGA013342-PA and BGIBMGA004585-PA were involved in lipid transport (GO: 0006869) with the molecular Function of lipid transporter activity (GO: 0005319), while the BGIBMGA004399-PA was identified to be a low molecular weight lipoprotein which located at the extracellular region (GO: 0005576).These highly-represented proteins were important at the early stage of the embryogenesis. Moreover, many heat shock proteins were identified, such as heat shock protein HSP19.9, HSP20.1, HSP20.4, HSP20.8, HSP23.7, HSP70, HSP90 and heat shock cognate protein. These proteins, HSP19.9, HSP20.1, HSP20.4, HSP20.8, HSP23.7 belong to a family of small heat shock protein (sHSP) which mainly function as molecular chaperones to protect proteins from being denatured during extreme conditions, especially under high temperature stress. The sHSP family is functionally more diverse than other HSPs. Moreover sHSPs can also develop a protection function under other stress conditions, such as cold, drought, oxidation, hypertonic stress, UV, and heavy metals, and even under high population densities of organisms.The HSPs including sHSPs also play an important role in normal development. However, insect orthologous sHSPs may not be associated with response to environmental stresses and may be involved in basic metabolic processes. Moreover, silkworm sHSPs may play an important role in the development of the germocyte and have functions in immune defense mechanisms. HSP70, heat shock cognate protein,HSP90 and sHSPs are associated with diapause in a number of species.But in this study, some HSPs were highly expressed both in diapause and non-diapause eggs, including HSP90, HSP20.8, HSP20.4 and HSP70. A possible explanation is that these HSPs may play an important function in the initial embryos development regardless of diapause or non-diapause. Furthermore, among the common-expressed proteins with functional annotations, 37 metabolic enzymes were identified (11.97% of the common-expressed proteins with functional annotations). The metabolic enzymes include ADP/ATP translocase, fibroinase, glucose-6-phosphate isomerase, glutamate dehydrogenase, isocitrate dehydrogenase, salivary secreted ribonuclease, transketolase etc. The identification of these essential metabolic enzymes indicated that during embryogenesis stage metabolism was active in the eggs. Many ribosomal proteins (L7, L7A, L9, L10A, L11, L13, L13A, L15, L17, L18, L19, L23A, L38, S3, S5, S7, S8, S9, S15A, S27, S28, and S30) were found in D and ND samples. This abundance of ribosomal proteins on the first day after oviposition might be closely related to protein synthesis during embryogenesis. In addition, 62 proteins that may also play important roles in the egg development were identified commonly, such as 14-3-3 epsilon protein, antennal binding protein, calreticulin, cellular retinoic acid binding protein, chemosensory protein 11, exuperantia, perilipin, profiling, transferring and etc. 14-3-3 epsilon protein (Bm14-3-3ε) is a member of the 14-3-3 protein family. The 14-3-3 proteins along with partner proteins(for example CDK11, PFTK1, GSK3beta, Chk1) are involved in the regulation of several crucial cellular processes including metabolism, signal transduction, cell development, differentiation, apoptosis, transcription, stress responses and malignant transformation. In this study, we identified Bm14-3-3ε in the D and ND samples. This result indicated that Bm14-3-3ε may play a role in regulating embryonic development of silkworm. Another important protein was translationally controlled tumor protein (TCTP), a highly conserved protein upregulated in various tumours. Hsu et al reported that reducing Drosophila TCTP (dTCTP) levels will reduces its cell size, cell number and organ size. Further more, calreticulin located in the endoplasmic reticulum, has been implicated in many diverse functions, including: regulation of intracellular Ca2+ homeostasis, chaperone activity, steroid-mediated gene regulation, and cell adhesion. The highest level of mRNA expression of calreticulin was exhibited in the fat body of Bombyx mori . Our result showed that calreticulin was identified during the initial embryogenesis both in D and ND samples, which means that calreticulin is essential during embryonic development. Profiling of D-Unique Proteins The characteristic proteins with functional annotations unique to the D samples are shown in Table 1. Among the D-unique proteins, BCP inhibitor (BCP1, Bombyx cysteine proteinase inhibitor) is inhibitory towards the processing of the enzymatically inactive proform of BCP (pro-BCP) to the activated mature BCP but has no effects on trypsin and pepsin, and is highly expressed in the metamorphosis stage of B.mori . In our study, BCPI which was exclusively identified in the D samples may be involved in regulating the development of diapause-destined embryos to arrest during the G2 cell cycle stage. Calcineurin A and FK506-binding protein were other proteins which were uniquely identified in the D sample. Calcineurin A with four EF-hand type calcium-binding structures is localized in the cytoplasm of the pheromone-producing cells and participates in the intracellular signal transduction of PBAN (Pheromone biosynthesis activating neuropeptide) in B. mori . The FK506-binding protein (FKBP) belongs to a ubiquitous family of proteins which participates in a variety of pathways, including protein folding, down-regulation of T-cell activation and inhibition of cell-cycle progression. Both Calcineurin A and FK506-binding protein identified in the diapause eggs are related with the Ca2+ release, suggesting that in the diapause eggs the regulation of the Ca2+ might be more active than in the non-diapause eggs. Other proteins identified in the diapause eggs were cytochrome P450 CYP6AE9, cytochrome P450 CYP6AE7, titin1, titin2, topoisomerase II etc (Table 1). In the diapause eggs, the cephalic lobe and telson and sequential segmentation of the mesoderm are formed before the embryos enters into diapause. In another words, the development of the diapause eggs are on one hand moving toward embryonic development, while on the other hand preparing to enter diapause after oviposition. Thus, these D-unique proteins may be adapted to this physiological need. Profiling of ND-Unique Proteins Compared with the unique proteins of the D samples, the characteristic proteins unique to the ND samples are shown in Table 2. There were more metabolic enzymes identified in the ND samples than in the D samples, including 6-phosphogluconate dehydrogenase, adenylate cyclase, alcohol dehydrogenase, ATP synthase(subunit B), carboxylesterase, glutathione peroxidase, malate dehydrogenase, ovarian serine protease, serine hydroxylmethyltransferase etc (Table 2). In the non-diapause eggs, the development of the embryos was consistent and without interruption, so the metabolism of the non-diapause eggs was potentially more active than the diapause eggs. In addition, we also identified eukaryotic translation initiation factor 3 subunit 2 beta, translation initiation factor 2 gamma subunit, eukaryotic translation termination factor 1 and more ribosomal proteins (L4, L10, S4, S6) in the non-diapause eggs. These components are important in regulating translation and protein synthesis, indicating that translation of the mRNA and synthesis of proteins was potentially more active in the non-dipause eggs than that in the diapause eggs. Proteasome beta-subunit, proteasome zeta subunit and proteasome 26S non-ATPase subunit 7 were also identified only in the non-dipause eggs, which play important roles in degrading cytosolic and nuclear proteins previously labeled with ubiquitin molecules. Other proteins identified in the non-diapause eggs were ecdysteroid-inducible angiotensin-converting enzyme-related gene product whose transcription was directly induced by 20-hydroxyecdysone (20E), ubiquitin-conjugating enzyme E2M, stathmin and a ras oncoprotein, all of which are associated with embryonic development. Quantitative RT-PCR Analysis The dipause-unique gene BmnitA and the non-diapause-unique genes BmADK2, BmIDH and BmGGT were selected to detect their relative mRNA levels by qRT-PCR analysis. Relative gene expression levels were showed in Fig. 4. In the diapause eggs, BmnitA, BmADK2 and BmGGT mRNA relative expression levels were significantly increased compared with those in non-diapause eggs according to the development stage, while BmIDH mRNA relative expression levels were remained constant compared with those in non-diapause eggs according to the development stage, which indicated that BmIDH is a possible non-diapause-unique gene. Interestingly at the initial embryogenesis, the mRNA relative expressions of these four genes were all significantly higher in the non-diapause eggs than in the diapause eggs. In the non-diapause eggs, BmIDH mRNA relative expression was significantly higher more than 21-fold than that in diapause eggs, BmnitA mRNA relative expression was significantly higher more than 21-fold than that in diapause eggs, BmADK2 mRNA relative expression was significantly higher more than 1795-fold than that in diapause eggs, BmGGT mRNA relative expression was significantly higher more than 57-fold than that in diapause eggs. However, the mRNA levels were not always consistent with their protein expressions. In the mRNA level, the expression of these four genes can be detected whether in diapause or non-dipause eggs. Gene expression level differences of the identified proteins in D and ND eggs of 1, 3, 5, 7 days after oviposition. The mRNA levels of the target genes were normalized by subtracting Ct value of the reference gene, BmactinA3 (No.X04507). D means diapause and N means non-diapause.The (2-ΔΔCT values represents relative gene expression level. Values are means ± SD (standard deviations). Gene Ontology Analysis of the Functional Categories Gene ontology (GO) is now widely used to describe the function of genes and gene products in a standardized format (http://www.geneontology.org). To understand the functions of the proteins we identified, the protein sequences were queried against the InterPro databases and the resultant proteins were functionally categorized based on universal GO annotation terms using the online GO tool WEGO. GO signatures of 486 out of the 501 and 545 out of the 562 proteins identified in the diapause and non-diapause eggs respectively were available. They were classified into Cellular Component, Molecular Function and Biological Process according to the GO hierarchy using WEGO (Fig. 5). Gene ontology categories for the identified proteins of D and ND. The identified proteins were classified into cellular component, molecular function and biological process by WEGO according to the GO terms. The number of genes is the number of times the GO term is used to annotate genes in the cluster. The left-hand shows its proportion in total genes of related genes with GO terms. In the Cellular Component category, proteins mapping to cell, cell part, macromolecular complex and organelle related GO terms were the most abundant, mapping to membrane-enclosed lumen were the fewest. In the subcategory of cell part, 114 and 131 proteins of D and ND respectively were ascribed to intracellular. In the subcategory of organelle, 30 and 35 proteins were separately assigned to membrane-bounded organelle, and 39 and 48 proteins were separately assigned to non-membrane-bounded organelle. According to the Molecular Function category, most proteins were addressed to binding (194/231 in the D and ND samples respectively) and catalytic activity (177/215 in the D and ND samples respectively), especially the nucleotide, nucleoside, chromatin, ion and nucleic acid binding and hydrolase, oxidoreductase, transferase activities. The groups with much fewer terms (only one protein annotated) include small protein activating enzyme activity, chromatin binding, odorant binding, metal cluster binding, enzyme activator activity, phosphatase regulator activity, transcription repressor activity, transcription initiation factor activity proteins. Moreover, one protein with deaminase activity, two proteins with cyclase activity and one protein with thioredoxin-disulfide reductase activity were annotated functional proteins only in the non-dipause eggs. Considering the Biological Process category, most of proteins were involved in the metabolic and cellular process. Much fewer proteins were involved in the reproduction, reproductive, biological adhesion and developmental process. In the metabolic process subcategories, most proteins were related to primary metabolic, cellular metabolic and macromolecule metabolic process. In the cellular process subcategories, most proteins were associated with the cellular metabolic process followed by the cell communication process. Besides, only one protein was functionally annotated to each of secondary metabolic process, cellular component disassembly, translational initiation, membrane docking and secretion by cell in the diapause eggs. GO analysis on the identified proteins presented an overall view on the functional categories of the diapause and non-diapause egg proteomes. In addition, proteins with GO annotation in the diapause eggs were not significantly different (p<0.05) to those in the non-diapause eggs, indicating that the expressed proteins during the early stages of embryonic development have many functional similarities between diapause and non-diapause eggs. KEGG Pathway Analysis KEGG is a large resource contains information for various model organisms about Molecular Interactions, Reaction Networks, Cellular Processes and Human Diseases. In the present study, a total of 689 of 754 proteins were subjected to query against the KEGG reference pathway database and 192 non-redundant pathways were indicated (Table S2) that most of them were related to Metabolism and Organism Systems. Other pathways such as the Genetic Information Processing, Environmental Information Processing, Cellular Processes and Human Diseases were also detected. 42 and 29 KEGG pathways were involve only in the diapause eggs unique proteins and non-diapause eggs unique proteins, respectively (Table 3, 4). Table 3 KEGG pathways of diapause unique proteins. Annotation Total Genes Metabolism Lipid Metabolism; Steroid hormone biosynthesis 3 Lipid Metabolism; Linoleic acid metabolism 3 Xenobiotics Biodegradation and Metabolism; Caprolactam degradation 20 Lipid Metabolism; Biosynthesis of unsaturated fatty acids 14 Environmental Information Processing Signal Transduction; ErbB signaling pathway 20 Signal Transduction; Phosphatidylinositol signaling system 20 Signal Transduction; mTOR signaling pathway 20 Signal Transduction; VEGF signaling pathway 40 Cellular Processes Cell Growth and Death; Cell cycle - Caulobacter 20 Transport and Catabolism; Regulation of autophagy 20 Cell Communication; Focal adhesion 21 Cell Motility; Regulation of actin cytoskeleton 21 Genetic Information Processing Folding, Sorting and Degradation; Sulfur relay system 20 Organismal Systems Development; Dorso-ventral axis formation 20 Development; Axon guidance 21 Development; Osteoclast differentiation 40 Immune System; Complement and coagulation cascades 9 Immune System; Toll-like receptor signaling pathway 20 Environmental Adaptation; Plant-pathogen interaction 20 Immune System; Natural killer cell mediated cytotoxicity 40 Immune System; T cell receptor signaling pathway 40 Immune System; B cell receptor signaling pathway 40 Immune System; Fc epsilon RI signaling pathway 20 Immune System; Leukocyte transendothelial migration 1 Endocrine System; Adipocytokine signaling pathway 39 Human Diseases Pathogenic Escherichia coli infection 1 Infectious Diseases; Shigellosis 1 Infectious Diseases; Hepatitis C 19 Infectious Diseases; Measles 19 Cancers; Pathways in cancer 20 Cancers; Colorectal cancer 20 Cancers; Renal cell carcinoma 20 Cancers; Pancreatic cancer 20 Cancers; Endometrial cancer 20 Cancers; Glioma 20 Cancers; Prostate cancer 20 Cancers; Thyroid cancer 20 Cancers; Melanoma 20 Cancers; Bladder cancer 20 Cancers; Chronic myeloid leukemia 20 Cancers; Acute myeloid leukemia 20 Cancers; Non-small cell lung cancer 20 KEGG pathways of non-diapause unique proteins. Annotation Total Genes Metabolism Lipid Metabolism; Fatty acid biosynthesis 18 Metabolism of Cofactors and Vitamins; Ubiquinone and other terpenoid-quinone biosynthesis 19 Amino Acid Metabolism; Phenylalanine metabolism 6 Metabolism of Other Amino Acids; Selenocompound metabolism 40 Metabolism of Other Amino Acids; Cyanoamino acid metabolism 35 Lipid Metabolism; Arachidonic acid metabolism 51 Carbohydrate Metabolism; Glyoxylate and dicarboxylate metabolism 20 Metabolism of Terpenoids and Polyketides; Terpenoid backbone biosynthesis 20 Biosynthesis of Other Secondary Metabolites; Indole alkaloid biosynthesis 6 Biosynthesis of Other Secondary Metabolites; Isoquinoline alkaloid biosynthesis 6 Biosynthesis of Other Secondary Metabolites; Betalain biosynthesis 6 Metabolism of Terpenoids and Polyketides; Insect hormone biosynthesis 37 Genetic Information Processing Translation; RNA transport 20 Translation; mRNA surveillance pathway 20 Transcription; Spliceosome 20 Environmental Information Processing Signaling Molecules and Interaction; Cytokine-cytokine receptor interaction 2 Signal Transduction; Hedgehog signaling pathway 20 Cellular Processes Cell Growth and Death; Meiosis - yeast 20 Transport and Catabolism; Endocytosis 19 Organismal Systems Immune System; RIG-I-like receptor signaling pathway 20 Sensory System; Olfactory transduction 20 Sensory System; Taste transduction 20 Sensory System; Phototransduction - fly 20 Excretory System; Aldosterone-regulated sodium reabsorption 19 Excretory System; Endocrine and other factor-regulated calcium reabsorption 58 Digestive System; Carbohydrate digestion and absorption 19 Human Diseases Infectious Diseases; Bacterial invasion of epithelial cells 19 Infectious Diseases; Toxoplasmosis 1 Infectious Diseases; Amoebiasis 20 The metabolism pathways in non-diapause eggs were more highly represented than which in diapause eggs. The amino acid metabolism, carbohydrate metabolism and biosynthesis of other secondary metabolites was more highly represented in the non-diapause eggs, meanwhile the lipid metabolism pathway such as steroid hormone biosynthesis, linoleic acid metabolism and biosynthesis of unsaturated fatty acids was more highly represented in the diapause eggs. The translation and transcription pathway such as RNA transport, mRNA surveillance pathway and spliceosome were only detected in the non-diapause eggs. The sensory system (olfactory transduction, taste transduction and phototransduction), excretory system and digestive system were also detected in the non-diapause eggs, which indicated that the embryonic development was in progress. ErbB signaling pathway, phosphatidylinositol signaling system, mTOR signaling pathway and VEGF signaling pathway which belonged to signal transduction pathway were more active in the diapause eggs. ErbB signaling regulates diverse cellular functions, such as proliferation, migration, differentiation and survival/death, and participates in various developmental processes during both invertebrate and vertebrate early embryogenesis. Interestingly, development and immune system pathways were also more represented in the diapause eggs, maybe the trigger of immune system pathway was the self-protection of diapause eggs during the long diapause stage. Conclusions This study provides a catalogue and an initial analysis of the proteomes of the diapause and non-diapause eggs during embryogenesis of the silkworm by shotgun proteomic analysis. Unique proteins identified in the two kinds of eggs and common proteins shared by each other were identified and analyzed. The relative mRNA levels of four identified proteins in the two kinds of eggs were also compared using qRT-PCR and showed some inconsistencies with protein expression. GO analysis of these proteins also provided us with a global view of their functions. In addition, KEGG pathway analysis showed the Metabolism, Translation and Transcription pathway were potentially more highly represented in the non-dipause eggs at this stage. These results will also help further research on finding the diapause-associated proteins during egg development of silkworm. However, the shotgun proteomic analysis has some shortcomings, for example, it is too complex for protein assembly and it mainly depends on bioinformatic methods. With the development of genomics and bioinformatics, the shotgun LC-MS/MS will be a promising strategy in proteomics research. Supporting Information

projects that include this document

Unselected / annnotation Selected / annnotation