1 A hybrid imputation approach for microarray missing value estimation [Journal article]

BMC Genomics, 2015

Fengfeng Shao, Huihui Li, Changbo Zhao, Guo-Zheng Li, Xiao Wang

LicenseType: CC BY

Abstract

Background: Missing data are inevitable in gene expression microarray experiments because of instrument failure or human error, and they degrade the performance of downstream analysis. Most existing analysis approaches are affected by this problem. Imputation is one of the most frequently used ways of handling missing data, and research on estimating missing values has advanced considerably. The remaining challenge is to improve imputation accuracy for data with a large missing rate.

Methods: Inspired by the idea of collaborative training, we propose a novel hybrid imputation method called Recursive Mutual Imputation (RMI). RMI exploits global correlation information and local structure in the data, captured by two popular methods, Bayesian Principal Component Analysis (BPCA) and Local Least Squares (LLS), respectively. The mutual strategy is implemented by sharing the estimated data sequences at each recursion. We also order the imputation by the number of missing entries in the target gene. Finally, a weight-based integration method is used in the final assembling step.

Results: We compare RMI with three state-of-the-art algorithms (BPCA, LLS, and Iterated Local Least Squares imputation (ItrLLS)) on four publicly available microarray datasets. Experimental results show that RMI significantly outperforms the comparison methods in terms of Normalized Root Mean Square Error (NRMSE), especially for datasets with large missing rates and fewer complete genes.

Conclusions: The proposed hybrid imputation approach incorporates both global and local information of microarray genes and achieves lower NRMSE values than any single approach alone. This study also highlights the need for imputation methods to consider the order in which missing entries are imputed.
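The NRMSE metric named in the Results can be sketched as follows. The abstract does not give the formula, so this version assumes the definition common in the imputation literature: the root mean square error over the artificially removed entries, divided by the standard deviation of their true values. The function name `nrmse` and its argument names are illustrative, not taken from the paper.

```python
import math
from statistics import pstdev


def nrmse(true_values, imputed_values):
    """Normalized RMSE over the entries that were masked and then imputed.

    Assumed definition: RMSE(imputed, true) / population std of the true values.
    """
    if len(true_values) != len(imputed_values):
        raise ValueError("true_values and imputed_values must have the same length")
    if not true_values:
        raise ValueError("at least one masked entry is required")

    spread = pstdev(true_values)
    if spread == 0:
        raise ValueError("true values have zero variance; NRMSE is undefined")

    # Mean of squared differences between each imputed value and its true value.
    mean_squared_error = sum(
        (guess - truth) ** 2 for truth, guess in zip(true_values, imputed_values)
    ) / len(true_values)
    return math.sqrt(mean_squared_error) / spread
```

Under this definition, replacing every masked entry with the mean of the true values scores 1.0, so values well below 1 indicate an imputation that does better than that baseline.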

2 Diversity of the cell-wall associated genomic island of the archaeon Haloquadratum walsbyi [Journal article]

BMC Genomics, 2015

Ana-Belen Martin-Cuadrado, Francisco Rodriguez-Valera, Lejla Pašić

LicenseType: Unknown

Abstract

Background: Haloquadratum walsbyi represents up to 80 % of cells in NaCl-saturated brines worldwide, but is notoriously difficult to maintain under laboratory conditions. To establish the extent of genetic diversity in a natural population of this microbe, we screened an H. walsbyi-enriched metagenomic fosmid library and recovered seven novel versions of its cell-wall associated genomic island. The fosmid inserts were sequenced and analysed.

Results: The novel cell-wall associated islands delineated two major clades within H. walsbyi. The islands predominantly contained genes putatively involved in biosynthesis of the surface layer, genes encoding cell surface glycoproteins, and genes involved in envelope formation. We further found that these genes are maintained in the population and that the diversity of this region arises through homologous recombination and also through the action of mobile genetic elements, including viruses.

Conclusions: The population of H. walsbyi in the studied saltern brine is composed of numerous clonal lineages that differ in surface structures, including the cell wall. This type of variation probably reflects a number of mechanisms that minimize the infection rate of predating viruses.

3 Comparative genomics of Pseudomonas fluorescens subclade III strains from human lungs [Journal article]

BMC Genomics, 2015

John J. LiPuma, Ian M. Huffnagle, John R. Erb-Downward, Gary B. Huffnagle, Brittan S. Scales

LicenseType: CC BY

Abstract

Background: While the taxonomy and genomics of environmental strains from the P. fluorescens species-complex have been reported, little is known about P. fluorescens strains from clinical samples. In this report, we provide the first genomic analysis of P. fluorescens strains in which human and environmental isolates are compared.

Results: Seven P. fluorescens strains were isolated from respiratory samples from cystic fibrosis (CF) patients. The clinical strains could grow at a higher temperature (>34 °C) than has been reported for environmental strains. Draft genomes were generated for all of the clinical strains, and multi-locus sequence analysis placed them within subclade III of the P. fluorescens species-complex. All strains encoded type II, III, IV, and VI secretion systems, as well as the widespread colonization island (WCI). This is the first description of a WCI in P. fluorescens strains. All strains also encoded a complete I2/PfiT locus and showed evidence of horizontal gene transfer. The clinical strains differed from the environmental strains in the number of genes involved in metal resistance, which may be an adaptation to chronic antibiotic exposure in the CF lung.

Conclusions: This is the largest comparative genomics analysis of P. fluorescens subclade III strains to date and includes the first clinical isolates. At a global level, the clinical P. fluorescens subclade III strains were largely indistinguishable from environmental P. fluorescens subclade III strains, supporting the idea that identifying strains as ‘environmental’ vs ‘clinical’ is not a phenotypic trait. Rather, strains within P. fluorescens subclade III will colonize and persist in any niche that provides the requirements necessary for growth.

4 Co-modulation analysis of gene regulation in breast cancer reveals complex interplay between ESR1 and ERBB2 genes [Journal article]

BMC Genomics, 2015

Yi-Pin Lai, Chuhsing Kate Hsiao, Tzu-Hung Hsiao, Chin-Ting Wu, Eric Y Chuang, Yu-Chiao Chiu, Yidong Chen

LicenseType: Unknown

Abstract

Background: Gene regulation is dynamic across cellular conditions and disease subtypes. Under modulation, the regulation strength between a pair of genes can depend on the expression abundance of another gene (the modulator gene). Previous studies have demonstrated the involvement of genes modulated by single modulator genes in cancers, including breast cancer. However, analysis of multi-modulator co-modulation, which can further delineate the landscape of complex gene regulation, is to our knowledge previously unexplored. In the present study we aim to explore the joint effects of multiple modulator genes in modulating global gene regulation and to dissect the biological functions in breast cancer.

Results: To carry out the analysis, we proposed the Covariability-based Multiple Regression (CoMRe) method. The method is mainly built on a multiple regression model that takes expression levels of multiple modulators as inputs and regulation strength between genes as output. Pairs of genes were divided into groups based on their co-modulation patterns. Analyzing gene expression profiles from 286 breast cancer patients, CoMRe investigated ten candidate modulator genes that interacted and jointly determined global gene regulation. Among the candidate modulators, ESR1, ERBB2, and ADAM12 were found to modulate the largest numbers of gene pairs. The largest group of gene pairs was composed of pairs modulated by ESR1 alone. Functional annotation revealed that the group was significantly related to tumorigenesis and estrogen signaling in breast cancer. ESR1−ERBB2 co-modulation was the largest group modulated by more than one modulator. Similarly, the group was functionally associated with hormone stimulus, suggesting that functions of the two modulators are performed, at least partially, through modulation. The findings were validated in the majority of patients (> 99%) of two independent breast cancer datasets.

Conclusions: We have shown that CoMRe is a robust method for discovering critical modulators in gene regulatory networks and that it is capable of achieving reproducible and biologically meaningful results. Our data reveal that gene regulatory networks modulated by a single modulator or co-modulated by multiple modulators play important roles in breast cancer. The findings of this report illuminate complex and dynamic gene regulation under modulation and its involvement in breast cancer.

5 Biofilm-associated proteins: news from Acinetobacter [Journal article]

BMC Genomics, 2015

Emanuela Roscetto, Marianna Martinucci, Pier Paolo Di Nocera, Eliana De Gregorio, Mariateresa Del Franco, Raffaele Zarrilli

LicenseType: CC BY

Abstract

Background: A giant protein called BAP (biofilm-associated protein) plays a role in biofilm formation and adhesion to host cells in A. baumannii. Most of the protein is made up of arrays of 80–110 aa modules featuring immunoglobulin-like (Ig-like) motifs.

Results: A survey of 541 sequenced A. baumannii strains belonging to 108 STs (sequence types) revealed that BAP is highly polymorphic and can be divided into three main types according to changes in both the repetitive and the COOH region. Analyzing the different STs, we found that 29 % feature type-1 BAP, 40 % type-2 BAP, and 11 % type-3 BAP, while 20 % lack BAP. The type-3 variant is restricted to A. baumannii, whereas type-1 and type-2 BAP have also been identified in other species of the Acinetobacter calcoaceticus-baumannii (ACB) complex. A. calcoaceticus and A. pittii also encode BAP-like proteins in which Ig-like repeats are replaced by long tracts of alternating serine and aspartic acid residues. We have identified in species of the ACB complex two additional proteins, BLP1 and BLP2 (BAP-like proteins 1 and 2), which feature Ig-like repeats, share with BAP a sequence motif at the NH2 terminus, and are similarly expressed in stationary growth phase. Knocking out either the BLP1 or the BLP2 gene of the A. baumannii ST1 AYE strain severely affected biofilm formation, as measured by comparing biofilm biomass and thickness, and adherence to epithelial cells. BLP1 is missing in the majority of type-3 BAP strains. BLP2 is largely conserved, but is frequently missing in BAP-negative cells.

Conclusions: Multiple proteins sharing Ig-like repeats seem to be involved in biofilm formation. The uneven distribution of the different BAP types, BLP1, and BLP2 strongly suggests that alternative protein complexes involved in biofilm formation are assembled in different A. baumannii strains.

6 A maximum pseudo-likelihood approach for phylogenetic networks [Journal article]

BMC Genomics, 2015

Yun Yu, Luay Nakhleh

LicenseType: CC BY

Abstract

Background: Several phylogenomic analyses have recently demonstrated the need to account simultaneously for incomplete lineage sorting (ILS) and hybridization when inferring a species phylogeny. A maximum likelihood approach was introduced recently for inferring species phylogenies in the presence of both processes, and showed very good results. However, computing the likelihood of a model in this case is computationally infeasible except for very small data sets.

Results: Inspired by recent work on the pseudo-likelihood of species trees based on rooted triples, we introduce the pseudo-likelihood of a phylogenetic network, which, when combined with a search heuristic, provides a statistical method for phylogenetic network inference in the presence of ILS. Unlike trees, networks are not always uniquely encoded by a set of rooted triples. Therefore, even when given sufficient data, the method might converge to a network that is equivalent under rooted triples to the true one, but not the true one itself. The method is computationally efficient and has produced very good results on the data sets we analyzed. The method is implemented in PhyloNet, which is publicly available as open source.

Conclusions: Maximum pseudo-likelihood allows for inferring species phylogenies in the presence of hybridization and ILS, while scaling to much larger data sets than is currently feasible under full maximum likelihood. The nonuniqueness of phylogenetic networks encoded by a system of rooted triples notwithstanding, the proposed method infers the correct network under certain scenarios, and provides candidates for further exploration under other criteria and/or data in other scenarios.
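To make the rooted-triple idea concrete, here is a minimal sketch that lists the rooted triples displayed by a rooted tree. It covers only the tree case. It is not the network pseudo-likelihood described in the abstract and not PhyloNet's implementation. The nested-tuple tree encoding and the function names `leaves` and `rooted_triples` are assumptions made for illustration.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of leaf labels under a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def rooted_triples(tree):
    """Return the rooted triples ab|c displayed by a rooted tree.

    Each triple is encoded as (frozenset({a, b}), c), meaning that a and b
    share a common ancestor that is not an ancestor of c.
    """
    all_leaves = leaves(tree)
    triples = set()

    def visit(node):
        if isinstance(node, str):
            return
        child_sets = [leaves(child) for child in node]
        # Leaves that do not descend from this node.
        outside = all_leaves - frozenset().union(*child_sets)
        # a and b taken from different children have this node as their
        # lowest common ancestor; any leaf outside the node completes ab|c.
        for left, right in combinations(child_sets, 2):
            for a in left:
                for b in right:
                    for c in outside:
                        triples.add((frozenset((a, b)), c))
        for child in node:
            visit(child)

    visit(tree)
    return triples
```

For the four-leaf tree `(("A", "B"), ("C", "D"))` this yields four triples: AB|C, AB|D, CD|A, and CD|B. As the abstract notes, a tree is recoverable from such a set, whereas a network is not always uniquely determined by its triples.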
