PMC:5118444 / 11017-14524


    2_test

    {"project":"2_test","denotations":[{"id":"27920798-22959562-33402009","span":{"begin":1568,"end":1572},"obj":"22959562"},{"id":"27920798-24128763-33402010","span":{"begin":1574,"end":1578},"obj":"24128763"},{"id":"27920798-19289445-33484680","span":{"begin":2378,"end":2382},"obj":"19289445"},{"id":"27920798-21697122-33484681","span":{"begin":2438,"end":2442},"obj":"21697122"}],"text":"Tissue specificity analysis\nTissue specificity of genes overlapped by CNVRs was assessed using two types of expression data, microarray and RNA sequencing, encompassing 22 different tissues (Table 2). Raw data sets for experiments GSE41637, GSE55435, GSE71153, GSE73699, GSE73261, and GSE73159 were downloaded from NCBI's Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo), and the raw data for experiment ERP005899 was downloaded from EMBL-EBI's European Nucleotide Archive (http://www.ebi.ac.uk/ena).\nTable 2 Gene expression data sets.\nStudy Tissue Data type Number of samples\nGSE73699 Mesenteric fat Microarray 15\nGSE73261 Spleen* Microarray 16\nGSE73159 Duodenum Microarray 16\nJejunum Microarray 16\nIleum Microarray 16\nGSE41637 Brain RNAseq 3\nColon RNAseq 3\nHeart RNAseq 3\nKidney* RNAseq 3\nLiver* RNAseq 3\nLung* RNAseq 3\nSkeletal muscle RNAseq 3\nSpleen* RNAseq 3\nTestes RNAseq 2\nGSE55435 Hypothalamus* RNAseq 8\nPituitary gland RNAseq 7\nUterus RNAseq 8\nEndometrium RNAseq 6\nOvary RNAseq 8\nSubcutaneous fat RNAseq 8\nLiver* RNAseq 8\nLongissimus dorsi muscle RNAseq 8\nGSE71153 Rumen RNAseq 16\nERP005899 Adipose RNAseq 7~14 pooled\nDuodenum* RNAseq 7~14 pooled\nHypothalamus* RNAseq 7~14 pooled\nKidney* RNAseq 7~14 pooled\nLung* RNAseq 7~14 pooled\nTissues marked with *were present in multiple studies. The microarray data (experiments GSE73699, GSE73261, and GSE73159) was processed as follows. Individual CEL files were processed using the UPC function from the SCAN.UPC package in R (Piccolo et al., 2012, 2013). UPC is a quantitative approach for normalizing gene expression data that produces standardized expression values that estimate whether a gene is “active” in a given sample. The program outputs for each gene in a given sample a universal expression code (UPC), a number between 0 and 1 where larger values suggest a greater likelihood that the gene is expressed in the sample. The UPC function was run using the default parameters, and for each tissue a gene was considered to be expressed in the tissue if it had a UPC \u003e 0.5 in at least one sample.\nThe RNA sequencing data (experiments GSE41637, GSE55435, GSE71153, and ERP005899) was processed as follows. Raw sequence reads in individual fastq files were first mapped to the UMD 3.1 genome assembly using Tophat (Version 2.0.1; Trapnell et al., 2009). The Cufflinks software (Version 2.2; Roberts et al., 2011) was then used to compute the fragments per kilobase of transcript per million mapped reads (FPKM) for paired-end reads and the analogous reads per kilobase of transcript per million mapped reads (RPKM) for single-end reads. Both software packages were run using the default parameters, and for each tissue a gene was considered expressed in the tissue if it had FPKM or RPKM \u003e 1.0 in at least one sample. Note that some tissues, including duodenum, hypothalamus, kidney, liver, lung, and spleen, were included in two of the experiments. For these tissues, a gene was considered expressed if it passed the expression criterion in at least one of the two experiments. Genes belonging to both the set of expressed genes and our CNV gene set were classified as expressed CNV genes, while genes that were expressed but not overlapped by CNVs were classified as expressed neutral genes. The P-values from a one-tailed Wilcoxon rank-sum test were used to test the hypothesis that expressed CNV genes in cattle are expressed in fewer tissues than expressed neutral genes."}
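
The expression-calling rule described in the annotated text (a gene counts as expressed in a tissue if it exceeds the threshold in at least one sample, and passing in either experiment is enough for tissues covered twice) can be sketched roughly as follows. This is only an illustration under assumptions: the nested-dictionary layout, the function names, and the toy values are hypothetical and do not come from the paper; only the thresholds (UPC > 0.5, FPKM or RPKM > 1.0) are taken from the text.

```python
from typing import Dict, List, Set, Tuple

# Thresholds quoted in the text.
UPC_THRESHOLD = 0.5    # microarray, UPC value
FPKM_THRESHOLD = 1.0   # RNA-seq, FPKM or RPKM

# Hypothetical layout: tissue -> gene -> per-sample expression values.
Experiment = Dict[str, Dict[str, List[float]]]


def expressed_genes(samples_by_gene: Dict[str, List[float]], threshold: float) -> Set[str]:
    """Genes whose value exceeds the threshold in at least one sample."""
    return {
        gene
        for gene, values in samples_by_gene.items()
        if any(v > threshold for v in values)
    }


def tissues_per_gene(experiments: List[Tuple[Experiment, float]]) -> Dict[str, Set[str]]:
    """Map each gene to the set of tissues in which it is called expressed.

    Collecting tissues in a set means a tissue present in two experiments
    is counted once, and passing in either experiment is sufficient.
    """
    breadth: Dict[str, Set[str]] = {}
    for experiment, threshold in experiments:
        for tissue, samples_by_gene in experiment.items():
            for gene in expressed_genes(samples_by_gene, threshold):
                breadth.setdefault(gene, set()).add(tissue)
    return breadth


# Toy data (made up for illustration only).
microarray: Experiment = {
    "spleen": {"GENE_A": [0.2, 0.7], "GENE_B": [0.1, 0.3]},
}
rnaseq: Experiment = {
    "spleen": {"GENE_A": [0.0, 0.4], "GENE_B": [1.5, 0.2]},
    "liver": {"GENE_A": [2.0], "GENE_B": [0.9]},
}

breadth = tissues_per_gene([(microarray, UPC_THRESHOLD), (rnaseq, FPKM_THRESHOLD)])
```

The number of tissues per gene, `len(breadth[gene])`, is the quantity that the rank-sum test at the end of the text compares between expressed CNV genes and expressed neutral genes.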

    MyTest

    {"project":"MyTest","denotations":[{"id":"27920798-22959562-33402009","span":{"begin":1568,"end":1572},"obj":"22959562"},{"id":"27920798-24128763-33402010","span":{"begin":1574,"end":1578},"obj":"24128763"},{"id":"27920798-19289445-33484680","span":{"begin":2378,"end":2382},"obj":"19289445"},{"id":"27920798-21697122-33484681","span":{"begin":2438,"end":2442},"obj":"21697122"}],"namespaces":[{"prefix":"_base","uri":"https://www.uniprot.org/uniprot/testbase"},{"prefix":"UniProtKB","uri":"https://www.uniprot.org/uniprot/"},{"prefix":"uniprot","uri":"https://www.uniprot.org/uniprotkb/"}],"text":"Tissue specificity analysis\nTissue specificity of genes overlapped by CNVRs was assessed using two types of expression data, microarray and RNA sequencing, encompassing 22 different tissues (Table 2). Raw data sets for experiments GSE41637, GSE55435, GSE71153, GSE73699, GSE73261, and GSE73159 were downloaded from NCBI's Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo), and the raw data for experiment ERP005899 was downloaded from EMBL-EBI's European Nucleotide Archive (http://www.ebi.ac.uk/ena).\nTable 2 Gene expression data sets.\nStudy Tissue Data type Number of samples\nGSE73699 Mesenteric fat Microarray 15\nGSE73261 Spleen* Microarray 16\nGSE73159 Duodenum Microarray 16\nJejunum Microarray 16\nIleum Microarray 16\nGSE41637 Brain RNAseq 3\nColon RNAseq 3\nHeart RNAseq 3\nKidney* RNAseq 3\nLiver* RNAseq 3\nLung* RNAseq 3\nSkeletal muscle RNAseq 3\nSpleen* RNAseq 3\nTestes RNAseq 2\nGSE55435 Hypothalamus* RNAseq 8\nPituitary gland RNAseq 7\nUterus RNAseq 8\nEndometrium RNAseq 6\nOvary RNAseq 8\nSubcutaneous fat RNAseq 8\nLiver* RNAseq 8\nLongissimus dorsi muscle RNAseq 8\nGSE71153 Rumen RNAseq 16\nERP005899 Adipose RNAseq 7~14 pooled\nDuodenum* RNAseq 7~14 pooled\nHypothalamus* RNAseq 7~14 pooled\nKidney* RNAseq 7~14 pooled\nLung* RNAseq 7~14 pooled\nTissues marked with *were present in multiple studies. The microarray data (experiments GSE73699, GSE73261, and GSE73159) was processed as follows. Individual CEL files were processed using the UPC function from the SCAN.UPC package in R (Piccolo et al., 2012, 2013). UPC is a quantitative approach for normalizing gene expression data that produces standardized expression values that estimate whether a gene is “active” in a given sample. The program outputs for each gene in a given sample a universal expression code (UPC), a number between 0 and 1 where larger values suggest a greater likelihood that the gene is expressed in the sample. The UPC function was run using the default parameters, and for each tissue a gene was considered to be expressed in the tissue if it had a UPC \u003e 0.5 in at least one sample.\nThe RNA sequencing data (experiments GSE41637, GSE55435, GSE71153, and ERP005899) was processed as follows. Raw sequence reads in individual fastq files were first mapped to the UMD 3.1 genome assembly using Tophat (Version 2.0.1; Trapnell et al., 2009). The Cufflinks software (Version 2.2; Roberts et al., 2011) was then used to compute the fragments per kilobase of transcript per million mapped reads (FPKM) for paired-end reads and the analogous reads per kilobase of transcript per million mapped reads (RPKM) for single-end reads. Both software packages were run using the default parameters, and for each tissue a gene was considered expressed in the tissue if it had FPKM or RPKM \u003e 1.0 in at least one sample. Note that some tissues, including duodenum, hypothalamus, kidney, liver, lung, and spleen, were included in two of the experiments. For these tissues, a gene was considered expressed if it passed the expression criterion in at least one of the two experiments. Genes belonging to both the set of expressed genes and our CNV gene set were classified as expressed CNV genes, while genes that were expressed but not overlapped by CNVs were classified as expressed neutral genes. The P-values from a one-tailed Wilcoxon rank-sum test were used to test the hypothesis that expressed CNV genes in cattle are expressed in fewer tissues than expressed neutral genes."}
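
The closing comparison in the annotated text, a one-tailed Wilcoxon rank-sum test of whether expressed CNV genes are expressed in fewer tissues than expressed neutral genes, might look roughly like the sketch below. The text does not say which software or which variant of the test was used, so the choices here are assumptions: a normal approximation with midranks and a tie correction (tissue counts are small integers, so ties are common), no continuity correction, and made-up tissue counts. The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def midranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_less(x: Sequence[float], y: Sequence[float]) -> float:
    """One-tailed p-value for the alternative that x tends to be smaller than y.

    Assumption: normal approximation with tie correction, which is
    reasonable for large gene sets but not an exact small-sample test.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2  # Mann-Whitney U for x
    mean_u = n1 * n2 / 2
    tie_counts: Dict[float, int] = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # all values identical: no evidence either way
    z = (u - mean_u) / math.sqrt(var_u)
    return 0.5 * math.erfc(-z / math.sqrt(2))  # lower-tail normal probability


# Made-up numbers of tissues in which each gene is expressed.
cnv_gene_tissues = [1, 2, 2, 3, 1, 2]
neutral_gene_tissues = [10, 12, 15, 9, 11, 14]

p_value = rank_sum_less(cnv_gene_tissues, neutral_gene_tissues)
```

A small `p_value` is consistent with the stated hypothesis that CNV genes are expressed in fewer tissues. Where SciPy is available, `scipy.stats.mannwhitneyu(x, y, alternative="less")` performs the same kind of one-sided comparison.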