BMC Genomics, 2017
Kalyani V. Guntur, Vivek K. Vishnudas, Stephane Gesta, Viatcheslav R. Akmaev, Jyoti Ranjan, Niven R. Narain, Rangaprasad Sarangarajan, Robert Sebra, Jun Zhu, Jialiang Yang, Kimaada Allette, Francesca Petralia, Bin Zhang, Jacob Hagen, Zhidong Tu, Milind Mahajan, Sander Houten, Eric E. Schadt, Andrew Kasarskis, Sarah Schuyler, Carmen A. Argmann
LicenseType: CC BY
Background: Exosomes and other extracellular vesicles (EVs) have emerged as an important mechanism of cell-to-cell communication. However, previous studies either did not fully resolve what genetic materials were shuttled by exosomes or focused only on a specific set of miRNAs and mRNAs. A more systematic method is required to identify, in an unbiased manner, the genetic materials that are potentially transferred through EVs during cell-to-cell communication.
Results: In this work, we present a novel next-generation sequencing (NGS)-based approach to identify EV-mediated mRNA exchanges between co-cultured adipocyte and macrophage cells. We performed molecular and genomic profiling and jointly considered data from RNA sequencing (RNA-seq) and genotyping to track the “sequence varying mRNAs” transferred between cells. We identified 8 mRNAs transferred from macrophages to adipocytes and 21 mRNAs transferred in the opposite direction. These mRNAs represented biological functions including extracellular matrix, cell adhesion, glycoprotein, and signal peptides.
Conclusions: Our study sheds new light on EV-mediated RNA communication between adipocyte and macrophage cells, which may play a significant role in the development of insulin resistance in diabetic patients. This work establishes a new method that is applicable to examining genetic material exchanges in many cellular systems and has the potential to be extended to in vivo studies as well.
BMC Genomics, 2017
Marc Sultan, Anita Fernandez, Walter Carbone, Guglielmo Roma, Sven Schuierer, Judith Knehr, Virginie Petitjean
LicenseType: CC BY
Background: RNA sequencing (RNA-seq) has emerged as one of the most sensitive tools for gene expression analysis. Among the library preparation methods available, the standard poly(A)+ enrichment provides a comprehensive, detailed, and accurate view of polyadenylated RNAs. However, for samples of suboptimal quality, ribosomal RNA depletion and exon capture methods have recently been reported as better alternatives.
Methods: We compared for the first time three commercial Illumina library preparation kits (TruSeq Stranded mRNA, TruSeq Ribo-Zero rRNA Removal, and TruSeq RNA Access) as representatives of these three approaches, using well-established human reference RNA samples from the MAQC/SEQC consortium over a wide range of input amounts (from 100 ng down to 1 ng) and degradation levels (intact, degraded, and highly degraded).
Results: We assessed the accuracy of the generated expression values by comparison to gold-standard TaqMan qPCR measurements and gained unprecedented insight into the limits of applicability of each protocol in terms of input quantity and sample quality. We found that each protocol generates highly reproducible results (R² > 0.92) on intact RNA samples down to input amounts of 10 ng. For degraded RNA samples, Ribo-Zero showed clear performance advantages over the other two protocols, as it generated more accurate and more reproducible gene expression results even at very low input amounts such as 1 ng and 2 ng. For highly degraded RNA samples, RNA Access performed best, generating reliable data down to 5 ng input.
Conclusions: We found that the ribosomal RNA depletion protocol from Illumina works very well at amounts far below the recommendation and over a good range of intact and degraded material. We also infer that the exome-capture protocol (RNA Access, Illumina) performs better than the other methods on highly degraded and low-amount samples.
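Illustrative note (not part of the abstract): the reproducibility figure quoted above is an R² value. The abstract does not say how it was computed, so the sketch below assumes the common choice of a squared Pearson correlation between two vectors of expression values (for example, two replicates, or RNA-seq versus qPCR). The function name `r_squared` and the sample values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: this is one common way to report reproducibility between
    replicate expression profiles; the abstract does not specify the formula.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of the centred values.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one input is constant")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical log-scale expression values for four genes in two replicates.
    replicate_1 = [1.0, 2.0, 3.0, 4.0]
    replicate_2 = [1.0, 3.0, 2.0, 4.0]
    print(r_squared(replicate_1, replicate_2))
```

In practice such a comparison is usually made on log-transformed expression values so that a few highly expressed genes do not dominate the correlation.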
BMC Genomics, 2017
Masaru Tomita, Kazuharu Arakawa, Nobuaki Kono
LicenseType: CC BY
Background: The reduced cost of sequencing has made de novo sequencing and the assembly of draft microbial genomes feasible in any ordinary biology lab. However, finishing and completing the genome remains labor-intensive and computationally challenging in some cases, such as studies of complete genome sequences, genomic rearrangements, long-range syntenic relationships, and structural variations.
Methods: Here, we show a contig reordering strategy based on experimental replication profiling (eRP) to recapitulate the bacterial genome structure within draft genomes. During the exponential growth phase, the majority of bacteria show a global genomic copy number gradient that is enriched near the replication origin and gradually declines toward the terminus. Therefore, if genome sequencing is performed with appropriate timing, the short-read coverage reflects this copy number gradient, providing information about the contig positions relative to the replication origin and terminus.
Results: We therefore investigated the appropriate timing for genomic DNA sampling and developed an algorithm for reordering the contigs based on eRP. As a result, this strategy successfully recapitulates the genomic structure of various structural mutants with draft genome sequencing.
Conclusions: Our strategy successfully reordered contigs by exploiting the behavior of intracellular DNA replication, and it can be applied to almost all bacteria because the DNA replication system is highly conserved. Therefore, eRP makes it possible to understand genomic structural information and long-range syntenic relationships using a draft genome that is based on short reads.
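Illustrative note (not part of the abstract): the coverage-gradient idea described above can be sketched as follows. This is not the authors' algorithm, which the abstract does not describe. It only assumes that mean read depth per contig is available and that, under a standard steady-state growth model, log2 copy number falls roughly linearly with distance from the replication origin. The function names and the sample depths are hypothetical, and coverage alone cannot tell which of the two replication arms a contig lies on.

```python
import math


def rank_contigs_by_origin_proximity(coverage):
    """Return contig names sorted from highest to lowest mean read depth.

    Higher depth is taken to mean "closer to the replication origin".
    """
    return sorted(coverage, key=lambda name: coverage[name], reverse=True)


def relative_distance_from_origin(coverage):
    """Map each contig to a value in [0, 1]: 0 = origin-like, 1 = terminus-like.

    Assumption: log2(depth) is roughly linear in distance from the origin,
    so distances are obtained by rescaling log2 depth.
    """
    logs = {name: math.log2(depth) for name, depth in coverage.items()}
    highest = max(logs.values())
    lowest = min(logs.values())
    span = highest - lowest
    if span == 0:
        # No gradient (e.g. stationary-phase sample): positions are unresolved.
        return {name: 0.0 for name in logs}
    return {name: (highest - value) / span for name, value in logs.items()}


if __name__ == "__main__":
    # Hypothetical mean read depths for three contigs.
    depths = {"contig_a": 80.0, "contig_b": 40.0, "contig_c": 20.0}
    print(rank_contigs_by_origin_proximity(depths))
    print(relative_distance_from_origin(depths))
```

A fuller treatment would need additional evidence, such as read pairs spanning contig ends, to assign contigs to the left or right arm and to orient them.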
4 Fast and robust adjustment of cell mixtures in epigenome-wide association studies with SmartSVA [Journal article]
BMC Genomics, 2017
Xihong Lin, Liming Liang, Ehsan Behnam, Jun Chen, Daniel J. Schaid, Miriam F. Moffatt, Jinyan Huang
LicenseType: CC BY
Background: One problem that plagues epigenome-wide association studies is the potential confounding due to cell mixtures when purified target cells are not available. Reference-free adjustment of cell mixtures has become increasingly popular due to its flexibility and simplicity. However, existing methods are still not optimal: increased false positive rates and reduced statistical power have been observed in many scenarios.
Methods: We develop SmartSVA, an optimized surrogate variable analysis (SVA) method, for fast and robust reference-free adjustment of cell mixtures. SmartSVA corrects the limitation of traditional SVA under highly confounded scenarios by imposing an explicit convergence criterion, and it improves the computational efficiency for large datasets.
Results: Compared to traditional SVA, SmartSVA achieves an order-of-magnitude speedup and better false positive control. It protects the signals when capturing the cell mixtures, resulting in a significant power increase while controlling for false positives. Through extensive simulations and real data applications, we demonstrate that SmartSVA performs better than the existing methods.
Conclusions: SmartSVA is a fast and robust method for reference-free adjustment of cell mixtures for epigenome-wide association studies. As a general method, SmartSVA can be applied to other genomic studies to capture unknown sources of variability.
BMC Genomics, 2017
Pablo Moreno, C. Geoffrey Woods, Michael Nahorski, Kaitlin Stouffer, Nivedita Sarveswaran, Michael Lee, David Menon
LicenseType: CC BY
Background: Significant human diseases/phenotypes exist which require both an environmental trigger event and a genetic predisposition before the disease/phenotype emerges, e.g. carbamazepine with the rare SNP allele of rs3909184 causing Stevens-Johnson syndrome, and aminoglycosides with rs267606617 causing sensorineural deafness. The underlying genotypes are fully penetrant only when the correct environmental trigger(s) occur; otherwise they are silent and harmless. Such diseases/phenotypes will not appear to have a Mendelian inheritance pattern unless the environmental trigger is very common (>50% per lifetime). The known causative genotypes are likely to be protein-altering SNPs with dominant/semi-dominant effect. We questioned whether other diseases and phenotypes could have a similar aetiology.
Methods: We wrote the fSNPd program to analyse multiple exomes from a test cohort simultaneously, with the purpose of identifying SNP alleles at a significantly different frequency to that of the general population. fSNPd was tested on trial cohorts, iteratively improved, and modelled for performance against an idealised association study under multiple parameters. We also assessed the sequencing depth of all human exons, by assessing forty exomes base by base, to determine which were sufficiently well sequenced in an exome to be used by fSNPd.
Results: We describe a simple methodology for the detection of SNPs capable of causing a phenotype triggered by an environmental event. This uses cohorts of relatively small size (30–100 individuals) with the phenotype being investigated and their exomes, and seeks SNP allele frequencies significantly different from expected in order to identify potentially clinically important, protein-altering SNP alleles. The strengths and weaknesses of this approach for discovering significant genetic causes of human disease are comparable to Mendelian disease mutation detection and association studies.
Conclusions: The fSNPd methodology is another approach for detecting genes involved in the pathogenesis of a disease/phenotype, and it has a potentially significant advantage over association studies in needing far fewer individuals. Furthermore, the SNP alleles identified alter amino acids, potentially making it easier to devise functional assays of protein function to determine pathogenicity.
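Illustrative note (not part of the abstract): the core comparison described above, cohort allele frequency versus general-population frequency, can be sketched with an exact one-sided binomial test. This is an assumed stand-in, not the fSNPd implementation; the abstract does not state which statistical test the program uses. The function names, SNP labels, counts, and the threshold `alpha` are all hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def enriched_alleles(observed_counts, population_freq, n_individuals, alpha):
    """Return {snp: p_value} for alleles seen more often than expected.

    observed_counts: allele copies seen in the cohort, per SNP.
    population_freq: general-population allele frequency, per SNP.
    Assumption: diploid autosomal sites, so each individual contributes
    two alleles, and every site is covered in every exome.
    """
    n_alleles = 2 * n_individuals
    hits = {}
    for snp, count in observed_counts.items():
        p_value = binomial_upper_tail(count, n_alleles, population_freq[snp])
        if p_value < alpha:
            hits[snp] = p_value
    return hits


if __name__ == "__main__":
    # Hypothetical cohort of 30 individuals (60 alleles per site).
    observed = {"snp_rare": 12, "snp_common": 30}
    background = {"snp_rare": 0.01, "snp_common": 0.5}
    # A strict threshold stands in for exome-wide multiple-testing correction.
    print(enriched_alleles(observed, background, n_individuals=30, alpha=1e-6))
```

With small cohorts, a rare allele only needs to appear a handful of times to stand out against a low background frequency, which is the intuition behind using 30–100 individuals.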
6 Prediction of bacterial small RNAs in the RsmA (CsrA) and ToxT pathways: a machine learning approach [Journal article]
BMC Genomics, 2017
Carl Tony Fakhry, Ping Chen, Kourosh Zarringhalam, Prajna Kulkarni, Rahul Kulkarni
LicenseType: CC BY
Background: Small RNAs (sRNAs) constitute an important class of post-transcriptional regulators that control critical cellular processes in bacteria. Recent research using high-throughput transcriptomic approaches has led to a dramatic increase in the discovery of bacterial sRNAs. However, it is generally believed that the currently identified sRNAs constitute a limited subset of the bacterial sRNA repertoire. In several cases, sRNAs belonging to a specific class are already known and the challenge is to identify additional sRNAs belonging to the same class. In such cases, machine-learning approaches can be used to predict novel sRNAs in a given class.
Methods: In this work, we develop novel bioinformatics approaches that integrate sequence and structure-based features to train machine-learning models for the discovery of bacterial sRNAs. We show that features derived from recurrent structural motifs in the ensemble of low energy secondary structures can distinguish the RNA classes with high accuracy.
Results: We apply this approach to predict new members in two broad classes of bacterial small RNAs: 1) sRNAs that bind to the RNA-binding protein RsmA/CsrA in diverse bacterial species and 2) sRNAs regulated by the master regulator of virulence, ToxT, in Vibrio cholerae.
Conclusion: The involvement of sRNAs in bacterial adaptation to changing environments is an increasingly recurring theme in current research in microbiology. It is likely that future research, combining experimental and computational approaches, will discover many more examples of sRNAs as components of critical regulatory pathways in bacteria. We have developed a novel approach for prediction of small RNA regulators in important bacterial pathways. This approach can be applied to specific classes of sRNAs for which several members have been identified and the challenge is to identify additional sRNAs.