1 Investigating MicroRNA and transcription factor co-regulatory networks in colorectal cancer [Journal article]

BMC Bioinformatics, 2017

Jing Wang, Qi Liu, Qingling Zhang, Yanqing Ding, Jiamao Luo, Huilin Niu, Chun Liu, Hao Wang, Hua Xu, Jingchun Sun, Zhongming Zhao

LicenseType: CC BY

Background: Colorectal cancer (CRC) is one of the most common malignancies worldwide and has a poor prognosis. Studies have shown that abnormal microRNA (miRNA) expression can affect CRC pathogenesis and development by targeting critical genes in cellular systems. However, it remains unclear which miRNAs play central roles in CRC pathogenesis and how they interact with transcription factors (TFs) to regulate cancer-related genes.

Results: To address this issue, we systematically explored the major regulation motifs, namely feed-forward loops (FFLs), that consist of miRNAs, TFs and CRC-related genes through the construction of a miRNA-TF regulatory network in CRC. First, we compiled CRC-related miRNAs, CRC-related genes, and human TFs from multiple data sources. Second, we identified 13,123 3-node FFLs, including 25 miRNA-FFLs, 13,005 TF-FFLs and 93 composite-FFLs, and merged the 3-node FFLs to construct a CRC-related regulatory network. The network consists of three types of regulatory subnetworks (SNWs): miRNA-SNW, TF-SNW, and composite-SNW. To enhance the accuracy of the network, the results were filtered using The Cancer Genome Atlas (TCGA) expression data in CRC, which yielded a core regulatory network consisting of 58 significant FFLs. We then applied a hub identification strategy to the significant FFLs and found 5 significant components: two miRNAs (hsa-miR-25 and hsa-miR-31), two genes (ADAMTSL3 and AXIN1) and one TF (BRCA1). The follow-up prognosis analysis indicated that all 5 significant components were good predictors of overall survival in CRC patients.

Conclusions: In summary, we generated a CRC-specific miRNA-TF regulatory network, which is helpful for understanding the complex CRC regulatory mechanisms and for guiding clinical treatment. The 5 regulators discovered might have critical roles in CRC pathogenesis and warrant future investigation.
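The FFL enumeration step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes the classification commonly used for miRNA-TF networks: a miRNA-FFL has the miRNA regulating the TF, a TF-FFL has the TF regulating the miRNA, a composite-FFL has both directions, and in every case the miRNA and the TF share a target gene. The function name `find_ffls` and all node names in the toy network are hypothetical.

```python
from collections import defaultdict


def find_ffls(edges, mirnas, tfs):
    """Classify 3-node feed-forward loops in a directed regulatory network.

    edges  : iterable of (regulator, target) pairs
    mirnas : set of miRNA node names
    tfs    : set of TF node names
    Returns a dict mapping FFL type to a set of (miRNA, TF, gene) triples.
    """
    edge_set = set(edges)
    targets = defaultdict(set)
    for regulator, target in edge_set:
        targets[regulator].add(target)

    ffls = {"miRNA-FFL": set(), "TF-FFL": set(), "composite-FFL": set()}
    for mir in mirnas:
        for tf in tfs:
            mir_to_tf = (mir, tf) in edge_set
            tf_to_mir = (tf, mir) in edge_set
            if not (mir_to_tf or tf_to_mir):
                continue  # the two regulators do not regulate each other
            # Genes regulated by both the miRNA and the TF.
            shared = (targets[mir] & targets[tf]) - {mir, tf}
            if mir_to_tf and tf_to_mir:
                kind = "composite-FFL"
            elif mir_to_tf:
                kind = "miRNA-FFL"
            else:
                kind = "TF-FFL"
            for gene in shared:
                ffls[kind].add((mir, tf, gene))
    return ffls


# Toy network with hypothetical node names (not data from the paper).
toy_edges = [
    ("miR-a", "TF1"), ("miR-a", "GeneX"), ("TF1", "GeneX"),   # miRNA-FFL
    ("TF2", "miR-b"), ("TF2", "GeneY"), ("miR-b", "GeneY"),   # TF-FFL
    ("miR-c", "TF3"), ("TF3", "miR-c"),                       # mutual regulation
    ("miR-c", "GeneZ"), ("TF3", "GeneZ"),                     # composite-FFL
    ("miR-a", "GeneY"),                                       # not part of any FFL
]
ffls = find_ffls(toy_edges, {"miR-a", "miR-b", "miR-c"}, {"TF1", "TF2", "TF3"})
for kind, triples in ffls.items():
    print(kind, sorted(triples))
```

Because each (miRNA, TF, gene) triple is assigned to exactly one class, the three counts add up to the total number of FFLs, in line with the way the abstract reports its totals.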

2 PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus genome [Journal article]

BMC Bioinformatics, 2011

Thomas Martinetz, Amir Madany Mamlouk, Jiajie Zhang, Rolf Hilgenfeld, Suhua Chang, Jing Wang

LicenseType: Unknown

Background: Results of phylogenetic analysis are often visualized as phylogenetic trees. Such a tree can typically only include up to a few hundred sequences. When more than a few thousand sequences are to be included, analyzing the phylogenetic relationships among them becomes a challenging task. The recent frequent outbreaks of influenza A viruses have resulted in the rapid accumulation of corresponding genome sequences. Currently, there are more than 7500 influenza A virus genomes in the database. There are no efficient ways of representing this huge data set as a whole, which prevents a further understanding of the diversity of the influenza A virus genome.

Results: Here we present a new algorithm, "PhyloMap", which combines ordination, vector quantization, and phylogenetic tree construction to give an elegant representation of a large sequence data set. The use of PhyloMap on influenza A virus genome sequences reveals phylogenetic relationships of the internal genes that cannot be seen when only a subset of sequences is analyzed.

Conclusions: The application of PhyloMap to influenza A virus genome data shows that it is a robust algorithm for analyzing large sequence data sets. It utilizes the entire data set, minimizes bias, and provides intuitive visualization. PhyloMap is implemented in Java, and the source code is freely available at http://www.biochem.uni-luebeck.de/public/software/phylomap.html

3 Extracting consistent knowledge from highly inconsistent cancer gene data sources [Journal article]

BMC Bioinformatics, 2010

Jing Wang, Lin Zhang, Jing Zhu, Yuannv Zhang, Wenyuan Zhao, Xue Gong, Lixin Cheng, Yunyan Gu, Ruihong Wu, Zheng Guo

LicenseType: Unknown

Background: Hundreds of genes that are causally implicated in oncogenesis have been found and collected in various databases. For efficient application of these abundant but diverse data sources, it is of fundamental importance to evaluate their consistency.

Results: First, we showed that the lists of cancer genes from some major data sources were highly inconsistent in terms of overlapping genes. In particular, most cancer genes accumulated in previous small-scale studies could not be rediscovered in current high-throughput genome screening studies. Then, based on a metric proposed in this study, we showed that most cancer gene lists from different data sources were highly functionally consistent. Finally, we extracted functionally consistent cancer genes from various data sources and collected them in our database F-Census.

Conclusions: Although their gene overlap is very low, most cancer gene data sources are highly consistent at the functional level, which indicates that each of them captures part of the genes in a few key pathways associated with cancer. Our results suggest that the sample sizes currently used for cancer studies might be inadequate for consistently capturing individual cancer genes, but could be sufficient for finding a number of cancer genes that functionally represent most cancer genes. The F-Census database provides biologists with a useful tool for browsing and extracting functionally consistent cancer genes from various data sources.
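The gene-level inconsistency reported in the abstract above can be made concrete with a small sketch of pairwise list overlap. This is only an illustration: it uses the Jaccard index as a stand-in overlap measure and does not reproduce the functional-consistency metric proposed in the paper, which the abstract does not define. The source names and gene identifiers are hypothetical placeholders.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(gene_lists):
    """Return {(source_1, source_2): Jaccard index} for every pair of sources."""
    return {
        (x, y): jaccard(gene_lists[x], gene_lists[y])
        for x, y in combinations(sorted(gene_lists), 2)
    }


# Hypothetical gene lists (placeholders, not data from the paper).
toy_lists = {
    "source_A": {"G1", "G2", "G3", "G4"},
    "source_B": {"G3", "G4", "G5"},
    "source_C": {"G6", "G7"},
}
overlaps = pairwise_overlap(toy_lists)
for pair, score in overlaps.items():
    print(pair, round(score, 2))
```

A low score for most pairs would correspond to the situation the abstract describes, where lists share few individual genes even though they may still agree at the pathway level.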

4 Evaluation and integration of existing methods for computational prediction of allergens [Journal article]

BMC Bioinformatics, 2013

Dabing Zhang, Jing Wang, Jing Li, Yunan Zhao, Yabin Yu

LicenseType: Unknown

Background: Allergy involves a series of complex reactions and factors that contribute to the development of the disease and the triggering of symptoms, including rhinitis, asthma, atopic eczema, skin sensitivity, and even acute and fatal anaphylactic shock. Predicting and evaluating potential allergenicity is important for the safety evaluation of foods and other environmental factors. Although several computational approaches for assessing the potential allergenicity of proteins have been developed, their performance and relative merits and shortcomings have not been compared systematically.

Results: To evaluate and improve the existing methods for allergen prediction, we collected an up-to-date definitive dataset consisting of 989 known allergens and a large set of putative non-allergens. The three most widely used computational approaches to allergen prediction, namely sequence-, motif- and SVM-based (Support Vector Machine) methods, were systematically compared using the defined parameters, and we found that the SVM-based method outperformed the other two methods, with higher accuracy and specificity. The sequence-based method with the criteria defined by FAO/WHO (FAO: Food and Agriculture Organization of the United Nations; WHO: World Health Organization) has a higher sensitivity of over 98% but a low specificity. The advantage of the motif-based method is its ability to visualize the key motif within the allergen. Notably, the performance of the sequence-based method defined by FAO/WHO and of the motif-eliciting strategy could be improved by optimizing parameters. To facilitate allergen prediction, we integrated these three methods in a web-based application, proAP, which provides a global search of the known allergens and a powerful tool for allergen prediction. Flexible parameter setting and batch prediction were also implemented. proAP can be accessed at http://gmobl.sjtu.edu.cn/proAP/main.html.

Conclusions: This study comprehensively evaluated sequence-, motif- and SVM-based computational prediction approaches for allergens and optimized their parameters to obtain better performance. These findings may provide helpful guidance for researchers in allergen prediction. Furthermore, we integrated these methods into a web application, proAP, which greatly facilitates customizable allergen search and prediction.
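The comparison in the abstract above rests on sensitivity, specificity and accuracy. As a reminder of how these are derived from predictions, here is a minimal sketch; the labels and predictions are made-up toy values, not results from the study, and the function name `evaluate` is hypothetical.

```python
def evaluate(y_true, y_pred):
    """Compute sensitivity, specificity and accuracy for binary predictions.

    y_true, y_pred : sequences of booleans (True = allergen).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # allergens correctly flagged
    fn = sum(1 for t, p in pairs if t and not p)      # allergens missed
    tn = sum(1 for t, p in pairs if not t and not p)  # non-allergens correctly cleared
    fp = sum(1 for t, p in pairs if not t and p)      # non-allergens wrongly flagged
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Toy example (hypothetical): 4 allergens followed by 6 non-allergens.
y_true = [True] * 4 + [False] * 6
y_pred = [True, True, True, False, True, True, False, False, False, False]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

A method that flags almost everything as an allergen would score close to 1.0 on sensitivity while its specificity drops, which is the kind of trade-off the abstract reports for the sequence-based FAO/WHO criteria.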
