BMC Bioinformatics, 2014
Yunlong Liu, A Keith Dunker, William Yang, Kenji Yoshigoe, Hamid R Arabnia, Dong Xu, Zhongxue Chen, Liangjiang Wang, Jun S Liu, Andrzej Niemierko, Jack Y Yang, Weida Tong, Xiang Qin, Mary Qu Yang, Youping Deng
LicenseType: CC BY
Background: Kidney renal clear cell carcinoma (KIRC) is one of the fatal genitourinary diseases and accounts for most malignant kidney tumours. KIRC has been shown to be resistant to radiotherapy and chemotherapy. As with many types of cancer, there is no curative treatment for metastatic KIRC. Using advanced sequencing technologies, The Cancer Genome Atlas (TCGA) project of NIH/NCI-NHGRI has produced large-scale sequencing data, which provide unprecedented opportunities to reveal new molecular mechanisms of cancer. We combined differential gene expression, pathway and network analyses to gain new insights into the molecular mechanisms underlying the development of the disease.
Results: Following an experimental design for obtaining significant genes and pathways, we performed a comprehensive analysis of sequencing data from 537 KIRC patients provided by TCGA. Differentially expressed genes were obtained from the RNA-Seq data, and pathway and network analyses were performed. We identified 186 differentially expressed genes with significant p-values and large fold changes (P < 0.01, |log(FC)| > 5). The study not only confirmed a number of differentially expressed genes previously reported in the literature, but also provided new findings. We performed hierarchical clustering analysis using both genome-wide gene expression and the differentially expressed genes identified in this study. We revealed distinct groups of differentially expressed genes that can aid the identification of subtypes of the cancer. The hierarchical clustering analysis based on the gene expression profile and the differentially expressed genes suggested four subtypes of the cancer. We found distinct enriched Gene Ontology (GO) terms associated with these groups of genes. Based on these findings, we built a support vector machine based supervised-learning classifier to predict unknown samples, and the classifier achieved high accuracy and robust classification results. In addition, we identified a number of pathways (P < 0.04) that were significantly influenced by the disease. Some of the identified pathways have been implicated in cancers in the literature, while others have not previously been reported in this cancer. The network analysis led to the identification of significantly disrupted pathways and associated genes involved in disease development. Furthermore, this study can provide a viable alternative for identifying effective drug targets.
Conclusions: Our study identified a set of differentially expressed genes and pathways in kidney renal clear cell carcinoma, and represents a comprehensive computational approach to analysing large-scale next-generation sequencing data. The pathway and network analyses suggested that information from distinctly expressed genes can be used to identify aberrant upstream regulators. Identification of distinctly expressed genes and altered pathways is important for effective biomarker identification in early cancer diagnosis and treatment planning. Combining differentially expressed genes with pathway and network analyses using intelligent computational approaches provides an unprecedented opportunity to identify upstream disease-causal genes and effective drug targets.
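The gene-selection thresholds quoted in the Results (P < 0.01, |log(FC)| > 5) can be illustrated with a minimal sketch. The abstract does not state the base of the logarithm or the statistical test used; base 2 is assumed here, and the gene names and values are invented purely for illustration.

```python
import math

# Hypothetical per-gene statistics: (gene, p_value, fold_change).
# Names and numbers are invented; they are not taken from the study.
genes = [
    ("GENE_A", 0.0004, 64.0),      # strongly up, significant
    ("GENE_B", 0.0300, 128.0),     # strongly up, but p-value too large
    ("GENE_C", 0.0020, 1 / 50.0),  # strongly down, significant
    ("GENE_D", 0.0001, 4.0),       # significant, but fold change too small
]

P_CUTOFF = 0.01        # threshold from the abstract
LOG_FC_CUTOFF = 5.0    # threshold from the abstract; log base 2 is an assumption


def is_differentially_expressed(p_value, fold_change):
    """Apply both cut-offs: P < 0.01 and |log2(FC)| > 5."""
    log_fc = math.log2(fold_change)
    return p_value < P_CUTOFF and abs(log_fc) > LOG_FC_CUTOFF


selected = [name for name, p, fc in genes if is_differentially_expressed(p, fc)]
print(selected)
```

Taking the absolute value of the log fold change keeps both strongly up-regulated and strongly down-regulated genes, which is why a gene with a fold change of 1/50 passes the filter alongside one with a fold change of 64.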
BMC Bioinformatics, 2014
A Keith Dunker, William Yang, Hamid R Arabnia, Youping Deng, Yunlong Liu, Zuojie Luo, Zhongxue Chen, Liangjiang Wang, Jun S Liu, Andrzej Niemierko, Weida Tong, Xiang Qin, Jack Y Yang, Mary Qu Yang, Dong Xu
LicenseType: Unknown
Advances in high-throughput technologies have rapidly produced ever more data, from DNA and RNA to proteins, especially large volumes of genome-scale data. However, connecting genomic information to cellular functions and biological behaviours relies on the development of effective approaches at a higher systems level. In particular, advances in RNA-Seq technology have helped the study of the transcriptome (the RNA expressed from the genome), while systems biology provides a more comprehensive picture of how genes and proteins actively interact to produce cellular behaviours and physiological phenotypes. As biological interactions mediate many processes that are essential for cellular function or disease development, it is important to systematically identify genomic information, including genetic mutations from genome-wide association studies (GWAS), differentially expressed genes, bidirectional promoters, intrinsically disordered proteins (IDPs) and protein interactions, to gain deep insights into the underlying mechanisms of gene regulation and networks. Furthermore, bidirectional promoters can co-regulate many biological pathways, and their roles can be studied systematically to identify co-regulating genes at the interactive network level. Combining information from different but related studies can ultimately help reveal the landscape of molecular mechanisms underlying complex diseases such as cancer.
BMC Bioinformatics, 2014
William Yang, Jack Y Yang, Haihua Wu, Mary Qu Yang, Xinyu Yang, Yan Li, Shandan Xu, Lili Lu, Chang Liu, Xiaolei Song, Quan Kong, Youping Deng
LicenseType: Unknown
Background: Type 2 diabetes mellitus (T2D), also known as noninsulin-dependent diabetes mellitus (NIDDM) or adult-onset diabetes, is a common disease. It is estimated that more than 300 million people worldwide suffer from T2D. In this study, we investigated blood samples from T2D patients, pre-diabetic patients and healthy (non-diabetic) humans using genomic, genealogical, and phenomic information. We identified differentially expressed genes and pathways. The study has provided deeper insights into the development of T2D, as well as useful information for more effective prevention and treatment of the disease.
Results: A total of 142 blood samples were collected, including 47 from healthy humans, 22 from pre-diabetic patients and 73 from T2D patients. Genome-wide gene expression profiles were obtained using Agilent oligo chips that contain over 20,000 human genes. We identified 79 significantly differentially expressed genes with fold change ≥ 2. We mapped those genes and pinpointed their locations on the human chromosomes. Of these, 3 genes were not mapped well on the human genome, but the remaining 76 were. The largest number of differentially expressed genes was found on chromosome 1, which contains 9 of them, followed by chromosome 2, which contains 7 of the 76. We performed Gene Ontology (GO) functional analysis of the 79 differentially expressed genes and found that genes involved in the regulation of cell proliferation were among the most common pathways related to T2D. The expression of the 79 genes was combined with clinical information, including age, sex, and race, to construct an optimal discriminant model. The overall performance of the model reached 95.1% accuracy, with 91.5% accuracy on healthy humans, 100% accuracy on pre-diabetic patients and 95.9% accuracy on T2D patients. The higher performance on pre-diabetic patients resulted from more significant changes in gene expression within this group, which implies that these patients were undergoing profound genetic changes towards disease development.
Conclusion: Differentially expressed genes were distributed across chromosomes, and were more abundant on chromosomes 1 and 2 than in the rest of the human genome. We found that regulation of cell proliferation plays an important role in the development of T2D. The predictive model developed in this study used the 79 significant genes in combination with age, sex, and race to distinguish pre-diabetic, T2D, and healthy humans. The study has provided not only a deeper understanding of the molecular mechanisms of the disease but also useful information for pathway analysis and effective drug target identification.
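The reported accuracies can be cross-checked against the group sizes with a short calculation. The abstract gives the group sizes (47, 22, 73) and the percentages, but not the number of correctly classified samples; the counts below are inferred as the integers that reproduce the reported percentages, and should be read as an assumption rather than reported data.

```python
# Group sizes as stated in the abstract.
group_sizes = {"healthy": 47, "pre-diabetic": 22, "T2D": 73}

# Correctly classified counts are NOT reported in the abstract.
# These are inferred values that reproduce the reported percentages.
inferred_correct = {"healthy": 43, "pre-diabetic": 22, "T2D": 70}

# Per-group accuracy: correct / group size, as a percentage.
per_group = {
    group: 100.0 * inferred_correct[group] / size
    for group, size in group_sizes.items()
}

# Overall accuracy: all correct samples over all samples.
total_samples = sum(group_sizes.values())
overall = 100.0 * sum(inferred_correct.values()) / total_samples

for group, accuracy in per_group.items():
    print(f"{group}: {accuracy:.1f}%")
print(f"overall ({total_samples} samples): {overall:.1f}%")
```

Under these inferred counts the per-group figures round to 91.5%, 100.0% and 95.9%, and the overall figure (135 of 142) rounds to 95.1%, so the reported numbers are mutually consistent.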
Gene regulation mediated by microRNAs in response to green tea polyphenol EGCG in mouse lung cancer [Journal article]
BMC Genomics, 2014
Hong Zhou, Mary Qu Yang, Youping Deng, Jayson X Chen, Chung S Yang, Hong Wang
LicenseType: Unknown
Background: Epigallocatechin-3-gallate (EGCG) has been demonstrated to inhibit cancer in experimental studies through its antioxidant activity and its modulation of cellular functions by binding specific proteins. We demonstrated previously that EGCG upregulates the expression of a microRNA (miR-210) by binding HIF-1α, resulting in reduced cell proliferation and anchorage-independent growth. However, the binding affinities of EGCG to HIF-1α and many other targets are higher than the peak plasma level of EGCG in experimental animals given a high dose of EGCG, raising the question of whether microRNA regulation by HIF-1α is involved in the anti-cancer activity of EGCG in vivo.
Results: We employed functional genomic approaches to elucidate the role of microRNAs in the EGCG inhibition of tobacco carcinogen-induced lung tumors in A/J mice. By analysing the microRNA profiles, we found modest changes in the expression levels of 21 microRNAs. By correlating these 21 microRNAs with the mRNA expression profiles using computational methods, we identified 26 potential target genes of the 21 microRNAs. Further exploration using pathway analysis revealed that the pathways most impacted by EGCG treatment are the regulatory networks associated with AKT, NF-κB, MAP kinases, and the cell cycle, and that the identified miRNA targets are involved in the networks of AKT, MAP kinases and cell cycle regulation.
Conclusions: These results demonstrate that miRNA-mediated regulation is actively involved in the major aspects of the anti-cancer activity of EGCG in vivo.
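The abstract does not say which computational method was used to correlate microRNA and mRNA profiles. One common heuristic, shown here only as a hedged sketch, is to flag miRNA-gene pairs whose expression is strongly anti-correlated across samples, since microRNAs typically repress their targets. The identifiers, expression values and the -0.8 cut-off are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression values across four samples (not from the study).
mirna_profiles = {"miR_X": [1.0, 2.0, 3.0, 4.0]}
mrna_profiles = {
    "GENE_P": [8.0, 6.0, 4.0, 2.0],  # falls as miR_X rises
    "GENE_Q": [1.0, 3.0, 2.0, 4.0],  # rises with miR_X
}

ANTI_CORRELATION_CUTOFF = -0.8  # assumed threshold, for illustration only

# Keep pairs whose correlation is at or below the (negative) cut-off.
candidate_targets = [
    (mirna, gene)
    for mirna, mirna_values in mirna_profiles.items()
    for gene, gene_values in mrna_profiles.items()
    if pearson(mirna_values, gene_values) <= ANTI_CORRELATION_CUTOFF
]
print(candidate_targets)
```

In practice such expression-based filtering is usually combined with sequence-based target prediction; this sketch covers only the correlation step.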