@ewha-bio:238

Genomics_Informatics

{"project":"Genomics_Informatics","denotations":[{"id":"LappsGridBioNER_protein1","span":{"begin":331,"end":335},"obj":"Protein"},{"id":"LappsGridBioNER_protein2","span":{"begin":346,"end":350},"obj":"Protein"},{"id":"LappsGridBioNER_protein3","span":{"begin":355,"end":370},"obj":"Protein"},{"id":"LappsGridBioNER_protein4","span":{"begin":862,"end":866},"obj":"Protein"},{"id":"LappsGridBioNER_protein5","span":{"begin":1569,"end":1573},"obj":"Protein"},{"id":"LappsGridBioNER_protein6","span":{"begin":1795,"end":1802},"obj":"Protein"},{"id":"LappsGridBioNER_protein7","span":{"begin":3584,"end":3588},"obj":"Protein"},{"id":"LappsGridBioNER_protein8","span":{"begin":4609,"end":4613},"obj":"Protein"},{"id":"LappsGridBioNER_protein9","span":{"begin":4676,"end":4679},"obj":"Protein"},{"id":"LappsGridBioNER_protein10","span":{"begin":5189,"end":5200},"obj":"Protein"},{"id":"LappsGridBioNER_protein11","span":{"begin":5522,"end":5525},"obj":"Protein"},{"id":"LappsGridBioNER_protein12","span":{"begin":6017,"end":6021},"obj":"Protein"},{"id":"LappsGridBioNER_protein13","span":{"begin":6119,"end":6125},"obj":"Protein"},{"id":"LappsGridBioNER_protein14","span":{"begin":10104,"end":10108},"obj":"Protein"},{"id":"LappsGridBioNER_protein15","span":{"begin":10144,"end":10153},"obj":"Protein"},{"id":"LappsGridBioNER_protein16","span":{"begin":13085,"end":13089},"obj":"Protein"},{"id":"LappsGridBioNER_protein17","span":{"begin":13578,"end":13582},"obj":"Protein"},{"id":"LappsGridBioNER_protein18","span":{"begin":14350,"end":14354},"obj":"Protein"},{"id":"LappsGridBioNER_protein19","span":{"begin":15761,"end":15785},"obj":"Protein"},{"id":"LappsGridBioNER_protein20","span":{"begin":15794,"end":15819},"obj":"Protein"},{"id":"LappsGridBioNER_protein21","span":{"begin":15821,"end":15824},"obj":"Protein"},{"id":"LappsGridBioNER_protein22","span":{"begin":15830,"end":15854},"obj":"Protein"},{"id":"LappsGridBioNER_protein23","span":{"begin":16660,"end":16664},"obj":"Protein"},{"id":"LappsGrid
BioNER_protein24","span":{"begin":16770,"end":16781},"obj":"Protein"},{"id":"LappsGridBioNER_protein25","span":{"begin":19236,"end":19246},"obj":"Protein"},{"id":"LappsGridBioNER_protein26","span":{"begin":20002,"end":20020},"obj":"Protein"}],"text":"\n=============Title==========\nCopy Number Variations in the Human Genome: Potential Source for Individual Diversity and Disease Association Studies.\n=============Cor Author==========\n*Corresponding author: E-mail yejun@catholic.ac.krTel +82-2-590-1214, Fax +82-2-596-8969 Accepted 11 March 2008\n===========Author==========\nTae-Min Kim1, Seon-Hee Yim2 and Yeun-Jun Chung1,2*1Department of Microbiology, 2Integrated Research Center for Genome Polymorphism, The Catholic University of Korea, Seoul 137-701, Korea\n===========Keywords==========\nKeywords: array-CGH, Copy number variation (CNV), Genome-wide association study (GWAS)Keywords: chromosome, genome-wide linkage search, heritability, HDL cholesterol\n===========Sub Heading==========\nAbstract\tIntroduction\tThe definition of CNV\tThe identification of CNVs using differ-ent platforms\tClinical implications of CNVs and dis-ease association study\tConclusion\tIntroduction\tMethods\tResults and Discussion\t\n==========Minor Heading===========\nASubjects, medical histories, genotyping, and measurement of HDL cholesterol\tStatistical analyses, heritability estimation, and variance component linkage analysis\t\n===========Main Text==========\n\n\nAbstract.\n\n\n\nThe widespread presence of large-scale genomic variations, termed copy number variation (CNVs), has been recently recognized in phenotypically normal individuals.\n\n\n\nJudging by the growing number of reports on CNVs, it is now evident that these variants contribute significantly to genetic diversity in the human genome.\n\n\n\nLike single nucleotide polymorphisms (SNPs), CNVs are expected to serve as potential biomarkers for disease susceptibility or drug responses.\n\n\n\nHowever, the technical and practical 
concerns still remain to be tackled.\n\n\n\nIn this review, we examine the current status of CNV DBs and research, including the ongoing efforts of CNV screening in the human genome.\n\n\n\nWe also discuss the characteristics of platforms that are available at the moment and suggest the potential of CNVs in clinical research and application.\n\n\n\nIIntroduction.\n\n\n\nTraditionally, large-scale genomic variants that are visible in conventional karyotyping have been thought to be associated with early-onset, highly penetrant genetic disorders, while they are incompatible in normal, disease-free individuals (Lupski, 1998; Stankiewicz and Lupski, 2002).\n\n\n\nThe construction of the 'reference genome' by the human genome sequencing project is based on the belief that human genome sequences are virtually identical, even in different individuals, except for well-known single nucleotide polymorphisms (SNP) or size-variants of tandem repeats such as mini- or microsatellites (variable number of tandem repeats or VNTR) (Przeworski et al., 2000).\n\n\n\nThis traditional concept has been recently challenged by the discovery that large structural variations are more prevalent than previously presumed (Check, 2005).\n\n\n\nUsing high-resolution whole- genome scanning technologies such as array-based comparative genomic hybridization (array-CGH), two groups of pioneering scientists have identified widespread copy number variations (CNVs) in apparently healthy, normal individuals (Iafrate et al., 2004; Sebat et al., 2004).\n\n\n\nIt proposes that our genome is more diverse than has ever been recognized, and subsequent studies have identified up to 11,000 CNVs across the whole genome (Tuzun et al., 2005; Hinds et al., 2006; Mills et al., 2006; McCarroll et al., 2006; Conrad et al., 2006; Sharp et al., 2005; Wong et al., 2007; de Smith et al., 2007).\n\n\n\nAlthough the current understanding of CNVs is still limited for practical use and technical challenges still remain to be 
tackled, recent studies already have demonstrated the potential association of CNVs with various diseases, suggesting plausible functional significances and highlighting the promising utility of CNVs.\n\n\n\nThe current coverage of CNVs in the human genome already has exceeded that of SNPs (approximately 600 Mb comprising 12% of human genome) and is still increasing (Cooper et al., 2007).\n\n\n\nThese large-scale structural variants, in addition to SNPs, will serve as powerful sources to help our understanding of human genetic variation and of differences in disease susceptibility for various diseases.\n\n\n\nThis paper reviews the current knowledge and future perspectives of CNVs.\n\n\n\nThe definition of CNV.\n\n\n\nStructural variations that involve large DNA segments can take various forms, such as duplication, deletion, insertion, inversion, and translocation.\n\n\n\nAmong them, DNA copy number variations larger than 1 kb are collectively termed CNVs.\n\n\n\nFig.\n\n\n\n1 illustrates the concept of CNV.\n\n\n\nAlthough the CNV can include large, microscopically visible genomic variations, it generally indicates a submicroscopic structural variation that is hardly detectable by conventional karyotyping (35 Mb) (Freeman et al., 2006).\n\n\n\nSmaller variations such as small insertional- deletion (indel) polymorphisms are not included in CNVs, while they comprise another large collection of over 400,000 variants in the human genome (Mills et al., 2006), and neither is the insertional polymorphism of mobile elements such as Alus or L1 elements considered a CNV.\n\n\n\nAt the beginning stages of CNV discovery, a number of terms were proposed to define them e.g., large-scale copy number variants (LCV) (Iafrate et al., 2004), copy number polymorphism (CNP) (Sebat et al., 2004), and intermediate-sized variants (ISV) (Tuzun et al., 2005).\n\n\n\nThe current definition of CNV is also operational and can be modified with the advance of scanning resolution and coverage, 
and availability of allele frequency in a determined population.The identification of CNVs using differ-ent platforms.\n\n\n\nVarious scanning platforms and quality control methods have been used to identify CNV calls.\n\n\n\nBecause the choice of platforms has a great effect on the results, it is worth reviewing the characteristics of platforms to improve the understanding of CNVs.\n\n\n\nThe presence of CNVs in normal individuals was reported for the first time in 2004 independently by two groups led by Lee C. and Wigler M. (Iafrate et al., 2004; Sebat et al., 2004).\n\n\n\nBoth studies used two-dye array-CGH techniques that used clones of bacterial artificial chromosomes (BAC) or oligonucleotides (representational oligonucleotide microarray analysis, or ROMA).\n\n\n\nTheyindependently reported about 250 and 80 loci as changes in copy number from 39 and 20 normal individuals, respectively.\n\n\n\nFig.\n\n\n\n2 illustrates the general concept of CNV detection based on two-dye array-CGH.\n\n\n\nAlthough the average numbers of CNVs per individual genome were similar in two studies (about 12 CNVs per genome), it should be noted that there was little overlap between the results.\n\n\n\nThis discrepancy between studies was possibly due to the use of different platforms and experimental conditions in different populations.\n\n\n\nHowever, it is also probable that there are still large numbers of structural variants that have yet to be discovered (Buckley et al., 2005; Eichler, 2006).\n\n\n\nOne following study that provided evidence on the widespread presence of large-scale structural variations in the human genome was based solely on in silico analysis (Tuzun et al., 2005).\n\n\n\nThe sequence-level comparison of two independent genome sequences, i.e., one derived from a human genome reference assembly and the other from fosmid clones of a genomic library, revealed about 300 structural variations, including inversions.\n\n\n\nThis method can detect various types of 
structural variants, including inversion, which is not detectable by conventional array-CGH platforms.\n\n\n\nIndeed, the results by Tuzun et al.\n\n\n\n(2005) can be used as validated control for primary verification or for parameter tuning for the development of CNV-detection platforms or algorithms.\n\n\n\nAlthough the use of this method is currently limited by the unavailability of sequence data, ongoing efforts to sequence the individual human genome and to develop cost-effective sequencing platforms (Bennett et al., 2005) will be able to facilitate sequence-level genome comparisons and the identification of highly qualified structural variants in the near future.\n\n\n\nTwo studies by McCarroll et al.\n\n\n\nand Conrad et al., which focused on the identification of deletion variants (McCarroll et al., 2006; Conrad et al., 2006), used 1.2 million SNP genotyping data from The International HapMap Consortium (International HapMap Consortium.\n\n\n\n2005).\n\n\n\nThey assumed that allelic deletion causes the discard of probes in SNP genotyping.\n\n\n\nFor example, the runs of consecutive probes with null genotype calls or runs of SNP genotypes whose allelic frequencies deviate from expected Hardy-Weinberg equilibrium ratios or expected Mendelian inheritance patterns might represent the presence of deleted loci.\n\n\n\nThey independently reported about 600 potential deletions as small as less than 100 bp.\n\n\n\nThe relatively small size of the identified variants, compared with the array-CGH method, is due to the high resolution of the platforms.\n\n\n\nThe use of an SNP-centric array platform can be used to identify linkage disequilibrium (LD) of structural variants with nearby SNPs in a given population.\n\n\n\nBut, the discrepancy in deletions that were identified in the two studies was also noted in spite of using similar HapMap populations and identification methods (Eichler 2006).\n\n\n\nRecently, a comprehensive CNV analysis was reported based on 
high-resolution array platforms, Whole Genome TilePath (WGTP), which used 26,000 large insert clones, and Affymetric GeneChip Human Mapping 500K early access, which used 500,000 SNP oligonucleotides.\n\n\n\nThey identified about 1500 genomic segments as copy number variations or CNVRs (copy number variable regions) consisting of overlapping CNVs from 269 HapMap individuals (Redon et al., 2006).\n\n\n\nThe results from the two platforms are worth comparing becasuse they provide the highest currently achievable resolution and are often selected as primary platforms in many other studies.\n\n\n\nFirstly, the CNVs that are identified from BAC-based array-CGH are generally larger than those from oligonucleotide-based arrays (230 kb and 80 kb of median size, respectively).\n\n\n\nThis overestimation of CNVs by BAC-based array-CGH is due to the large insert clones that are used, which has been frequently reported (Iafrate et al., 2006).\n\n\n\nSecondly, the actual boundaries of structural variants can not be determined through BAC-based array-CGH.\n\n\n\nOn the other hand, a more accurate determination of variant boundaries can be achieved through SNP-centric oligonucleotide-based arrays that have an extensive number of oligonucleotides.\n\n\n\nThe SNP-centric platform has additional advantages of accompanying SNP genotype information as a potential variant source, combined with large structural variants and its ability to detect the presence of loss of heterozygosity (LOH) or segmental uniparental disomy (Bruce et al., 2005; Mei et al., 2000).\n\n\n\nBut, the SNP-centric platform also has its disadvantages.\n\n\n\nIn spite of the advanced resolution, the relatively low signal-to-noise ratio of oligonucleotide-based hybridization intensity, compared with large insert clone array, might result in higher false-positive rates.\n\n\n\nBecause most CNVs are subtle changes, this makes the results prone to misclassification of signal intensities and, consequently, to statistical 
errors.\n\n\n\nSometimes, it is pointed out that the SNP-centric array was originally designed for allelic discrimination and is not appropriate for CNV detection because of biased genomic distribution and sequence composition of spotted probes (McCarroll and Altshuler 2007d).\n\n\n\nRecently proposed oligonucleotide-based array platforms have been designed for CNV detection specifically without sacrificing the advantage of high resolution, which can be a promising solution for CNV detection in the near future (Barrett et al., 2004).\n\n\n\nIn identifying CNVs in normal populations, one of the fundamental problems is the lack of a reference genome from which diploid states of sample DNA can be inferred.\n\n\n\nUnlike the array-CGH-based tumor study in which the normal DNA of the same individual can be used as a reference genome, no single DNA source can present the standardized and universal genome in variant analysis.\n\n\n\nOften, the pooled genome of several individuals has been used to represent the average genome, while the heterogeneity of the used population might affect the copy number inference step, as shown for examples of X chromosomes.\n\n\n\nRedon et al.\n\n\n\nand Komura et al.\n\n\n\nadopted the pairwise comparison for ac-curate inference of copy number states in individual loci, which is noteworthy (Redon et al., 2006; Komura et al., 2006).\n\n\n\nIn pairwise comparison, the hybridization intensities of one sample is compared with those of all other remaining samples as one large reference, and the diploid states of loci can be more accurately inferred from the multiple comparison results.Clinical implications of CNVs and dis-ease association study.\n\n\n\nIn spite of recent technological developments of genetic polymorphism-oriented disease association studies, still little is known about the effects of genetic polymorphisms on common complex diseases.\n\n\n\nOne of the ultimate goals in exploring CNVs is to systematically assess the association 
between such variants and the disease.\n\n\n\nAlthough it is unlikely that all CNVs in the human genome are associated with diseases, evidence of the association of CNVs and a wide spectrum of human diseases has rapidly accumulated.\n\n\n\nTable 1 summarizes the CNVs that have been reported to be associated with diseases.\n\n\n\nCNVs can affect disease susceptibility or individual differences in responses to drugs through alteration of gene expression.\n\n\n\nStranger et al.\n\n\n\n's and Heidenblad et al's reports coherently showed positive correlations between DNA copy number dosage and gene expression level (Stranger et al., 2007; Heidenblad et al., 2005).\n\n\n\nIf a CNV region contains transcriptional regulatory elements rather than protein coding genes, it still can affect gene expression levels by changing transcriptional regulation or heterochromatin spread (Reymond et al., 2007).Conclusion.\n\n\n\nThe genomic fraction that is occupied by CNVs is now estimated to be about 600 Mb, already exceeding that of single base-level variants.\n\n\n\nIt is likely that the number of CNVs and the genomic fraction that is affected by structural variants will continue to expand, and many of them will be used for more practical purposes, including disease association or population studies.\n\n\n\nHowever, it should be remembered that the current CNV entries are plagued by substantial amounts of false-positive and false-negative results.\n\n\n\nOnly a small portion of them have been validated by independent methods.\n\n\n\nTo overcome this, it is necessary to improve scanning platforms, including optimizing experimental conditions and developing more reliable CNV calling algorithms.\n\n\n\nIn the meantime, it is required for individual researchers to know the characteristics of the available platforms and analytical techniques to use them or to interpret the published results properly.e found peak evidence of linkage (LOD score=1.88) for HDL cholesterol level on chromosome 
6 (nearest marker D6S1660) and potential evidences for linkage on chromosomes 1, 12 and 19 with the LOD scores of 1.32, 1.44 and 1.14, respectively.\n\n\n\nThese results should pave the way for the discovery of the relevant genes by fine mapping and association analysis.IIntroduction.\n\n\n\nCholesterol is a major part of cell membranes.\n\n\n\nCholesterol is carried in the blood by chylomicrons, very low density lipoproteins (VLDL), high density lipoproteins (HDL) and low density lipoproteins (LDL) (Dastani et al.\n\n\n\n2006).\n\n\n\nHDL cholesterol is reversely associated with cardiovascular disease, and is more tightly controlled by genetic factors than the other lipoproteins such as LDL, VLDL and chylomicrons.\n\n\n\nEnvironmental factors including chronic alcoholism, estrogen replacement therapy, and exercise influence the levels of HDL cholesterol.\n\n\n\nSeveral families with strikingly elevated HDL cholesterol levels have been identified.\n\n\n\nHDL cholesterol levels are higher in blacks compared with whites and HDL cholesterol levels of females are higher than those of males (Barcat et al.\n\n\n\n2006; Brousseau et al.\n\n\n\n2004; Yamashita et al.\n\n\n\n2000; Imperatore et al.\n\n\n\n2000).\n\n\n\nCandidate gene analysis using population-based case-control studies has been used to test the association between SNPs and HDL cholesterol levels.\n\n\n\nAmong the candidate genes selected mainly from lipid metabolism pathways, ApoA-I gene is the one most intensively studied (Inazu et al.\n\n\n\n1994; Kuivenhoven et al.\n\n\n\n1997).\n\n\n\nBy genome-wide linkage analysis, susceptibility genes can be identified although the genes are not candidates based on lipid metabolism.\n\n\n\nGenome-wide linkage scans are conducted by use of microsatellite markers to identify genetic determinants affecting the traits (Wang and Paigen 2005).\n\n\n\nUsing HDL cholesterol levels as either discrete or quantitative trait, several linkage studies on genetic determinants of 
HDL cholesterol have been reported (Yancey et al.\n\n\n\n2003).\n\n\n\nGenetic effects on the variations in HDL cholesterol were studied mainly in Caucasians and Africans thus far, and little attention has been focused in this regard on Asian populations.\n\n\n\nWe found suggestive evidence for linkage for HDL cholesterol on chromosome 6, 1, 12 and 19, in studies conducted as part of GENDISCAN study, a large epidemiological study of Complex traits in geographically, culturally and genetically isolated large Mongolian families l in Dornod, Mongolia report.\n\n\n\nMethods.\n\n\n\nWe analyzed data from 1002 Mongolian individuals from 95 large extended families.\n\n\n\nInformed consent was obtained from all subjects prior to participation and the protocol was approved by the Institutional Review Board at Seoul National University.\n\n\n\nPotentially confounding variables were assessed for each participant along with overall medical history.\n\n\n\nInformation on age, gender and anthropometry (height, weight, waist circumference, hip circumference and body fat content) were obtained for each individual.\n\n\n\nHeight in centimeter (cm) and weight in kilograms (kg) were measured using an automatic measuring instrument (IMI 1000, Immanuel Elec., Korea).\n\n\n\nBody mass index (BMI) was calculated in kg/m.\n\n\n\nWaist circumference was measured to the nearest centimeter at the level of the umbilicus, and hip circumference was measured at the level of the maximal circumference of the gluteus.\n\n\n\nAll other variables were collected through interviews performed by trained interviewers.\n\n\n\nInformation about amount of alcohol and smoking was also obtained from all the participants.\n\n\n\nAll the subjects were asked to fast for 12 hours before their visit.\n\n\n\nBlood samples were collected from an antecubital vein into vacutainer tubes containing EDTA.\n\n\n\nBlood samples were centrifuged at 3000rpm for 10 minutes and then stored at 70C.\n\n\n\nDNA was isolated from 
lymphocytes for polymerase chain reaction (PCR) and automated genotyping.\n\n\n\nA 10 ml blood sample was collected from each participating individual for genomic DNA extraction.\n\n\n\nDNA was extracted from peripheral lymphocytes using the PUREGENE DNA Purification Kit for whole blood (Gentra Systems Inc, USA).\n\n\n\nFor genotyping, a set of 1000 microsatellite markers deCODE mapping sets (deCODE genetics, USA) was used covering the genome at an average density of 3 centimorgans (cM).\n\n\n\nHDL cholesterol was measured by the enzymatic method using Cholestest-N-HDL kit (DAICHI, JAPAN) and HITACHI 7600-210 \u0026 HITACHI 7180 instruments.\n\n\n\nExtensive quality control procedures ensured the validity and reproducibility of the measurements.\n\n\n\nMultiple linear regression analysis was used by PC SAS version 8.2 and PC SPSS version 12 to account for effect of confounding variables.\n\n\n\nPedigree data was managed by PedSys (Southwest Foundation for Biomedical Research, San Antonio, Texas, USA).\n\n\n\nNonpaternity was examined using PEDCHECK (Mcpeek and Sun 2000) and relationships other than paternity were checked using average IBD-based method by PREST.\n\n\n\nAfter correcting pedigree error and Mendelian errors, non-mendelian errors were examined and corrected using SimWalk.\n\n\n\nIdentity by descent (IBD) matrix between every relationship pairs in family was calculated and IBD matrix for single marker was calculated by SOLAR (Sequential Oligogenic Linkage Analysis Routines software version 2.1.4).\n\n\n\nMultipoint IBD matrices were computed on every 1 cM distance using Markov chain Monte Carlo method by LOKI (Heath 1997).\n\n\n\nGenetic components of selected phenotypes were estimated in terms of heritability.\n\n\n\nNarrow sense heritability, defined as the proportion of total phenotypic variation due to additive genetic effects, was calculated.\n\n\n\nHeritability of HDL cholesterol adjusted for age, gender, age- square, product of age and gender, 
product of age- square and gender, systolic BP, smoking and alcohol was estimated and a variance component linkage analysis was carried out by SOLAR which uses maximum likelihood methods to estimate variance components for the polygenic genetic effect and random individual environmental effects.\n\n\n\nResults and Discussion.\n\n\n\nThe mean age of the 1002 individuals was 31 years and 54.5% of them were female.\n\n\n\nDemographic and pedigree characteristics of the study sample are shown in Table 1.\n\n\n\nThe family size had a mean of 16.\n\n\n\nTable 2 included information on 2546 pairs of first degree relatives (1812 parent-offspring pairs and 734 full-sib pairs), 2485 pairs of their second degree relatives (395 half-sibling pairs, 1202 grandparent-grandchild pairs, and 888 avuncular pairs), and 598 first-cousin pairs.\n\n\n\nMeans of their total cholesterol, HDL cholesterol, LDL cholesterol, and triglyceride were 159.82 mg/dl, 55.19 mg/dl, 90.51 mg/dl, and 63.30 mg/dl, respectively.\n\n\n\nTable 3 shows correlation between HDL cholesterol and covariates such as age, gender, systolic blood pressure, alcohol consumption status, and smoking status.\n\n\n\nThese parameters were used as covariates in the variance component analysis which provided multivariable adjusted heritability estimates for HDL cholesterol of 0.45 (Table 4).\n\n\n\nThe peak multipoint LOD score was 1.88 on 6p21 (nearest marker D6S1660) and a secondary peak (LOD score of 1.44) was found on 12q23 (nearest marker D12S354).\n\n\n\nWe identified other potential evidence for linkage in the LOD score of 1.32 on 1q24 (nearest marker D1S412) and a LOD score of 1.14 at 19p13 (nearest marker D19S884) (Fig.\n\n\n\n1, 2).\n\n\n\nTable 5 presents all LOD scores 1.0 for HDL cholesterol.\n\n\n\nWe identified potential evidence of linkage on several chromosomes.\n\n\n\nIn other genome scan, a weak linkage signal for HDL cholesterol was observed for regions that overlapped slightly with the regions identified 
herein.\n\n\n\nKlos et al.\n\n\n\nreported the appearance of peak position in the chromosome 12q in European American population (Klos et al.\n\n\n\n2001) (Table 6).\n\n\n\nWe found evidence of link-\n\n"}