| BMC Bioinformatics | |
| htsint: a Python library for sequencing pipelines that combines data through gene set generation | |
| Software | |
| Adam J. Richards1, Camille Bonneaud2, Anthony Herrel3 | |
| [1] Station d’Ecologie Expérimentale du CNRS, USR 2936, Route du CNRS, 09200, Moulis, France; Centre for Ecology & Conservation, College of Life and Environmental Sciences, University of Exeter, Penryn TR10 9FE, Cornwall, UK; UMR 7179 CNRS/MNHN, Département d’Ecologie et de Gestion de la Biodiversité, 57 rue Cuvier, Case postale 55, 75231, Paris, France; Ghent University, Evolutionary Morphology of Vertebrates, K.L. Ledeganckstraat 35, B-9000, Ghent, Belgium | |
| Keywords: Gene set analysis; Gene ontology; RNA-Seq | |
| DOI : 10.1186/s12859-015-0729-3 | |
| Received 2015-03-25, accepted 2015-09-08, published 2015 | |
| Source: Springer | |
【 Abstract 】
Background: Sequencing technologies provide a wealth of detail about genes, expression, splice variants, polymorphisms, and other features. Sequencing analysis pipelines conventionally place genomic or transcriptomic features in the context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level offers a convenient way both to integrate annotation data and to detect small coordinated changes between experimental conditions, whose detection is a known limitation of gene-level analyses.

Results: We introduce the high-throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, such as a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for ‘enrichment’ or conditional differences using one of a number of commonly available packages.

Conclusion: The database and bundled tools for generating functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint also allows it to be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint.
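The step from functional distances to functional modules can be illustrated with a small, self-contained sketch. This is not htsint's API: the gene names, the distance values, the `sigma` bandwidth, and the function `spectral_bipartition` are all hypothetical, and the sketch performs only a two-way spectral split (by the sign of the second eigenvector of the normalized affinity matrix), whereas a general spectral clustering would use several eigenvectors followed by a clustering step such as k-means.

```python
import math

# Hypothetical gene labels and pairwise functional distances (assumed values,
# not taken from the paper). Small distance = functionally similar.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE", "geneF"]
dist = [
    [0.0, 0.2, 0.3, 0.9, 1.0, 0.8],
    [0.2, 0.0, 0.1, 0.8, 0.9, 1.0],
    [0.3, 0.1, 0.0, 1.0, 0.8, 0.9],
    [0.9, 0.8, 1.0, 0.0, 0.2, 0.1],
    [1.0, 0.9, 0.8, 0.2, 0.0, 0.3],
    [0.8, 1.0, 0.9, 0.1, 0.3, 0.0],
]


def spectral_bipartition(dist, sigma=0.3, n_iter=500):
    """Split items into two groups from a symmetric distance matrix.

    Illustrative two-way spectral partition, not the htsint implementation.
    """
    n = len(dist)
    # Gaussian kernel converts distances into affinities in (0, 1].
    W = [[math.exp(-dist[i][j] ** 2 / (2 * sigma ** 2)) for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in W]
    # Symmetrically normalized affinity matrix M = D^-1/2 W D^-1/2.
    M = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # The leading eigenvector of M is proportional to sqrt(deg); it carries
    # no cluster information, so it is projected out below.
    u1 = [math.sqrt(d) for d in deg]
    norm = math.sqrt(sum(x * x for x in u1))
    u1 = [x / norm for x in u1]
    # Deterministic, non-constant starting vector.
    v = [i - (n - 1) / 2 for i in range(n)]
    for _ in range(n_iter):
        dot = sum(a * b for a, b in zip(v, u1))
        v = [a - dot * b for a, b in zip(v, u1)]
        # Power iteration on (M + I) / 2, whose eigenvalues lie in [0, 1],
        # so the iteration converges to the second eigenvector of M.
        v = [0.5 * (sum(M[i][j] * v[j] for j in range(n)) + v[i])
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    # The sign pattern of the second eigenvector defines the two modules.
    return [0 if x < 0 else 1 for x in v]


labels = spectral_bipartition(dist)
modules = {}
for gene, label in zip(genes, labels):
    modules.setdefault(label, []).append(gene)
print(modules)
```

The resulting groups play the role of gene sets that could then be passed to a gene set testing package together with a count matrix. Which label (0 or 1) each group receives is arbitrary, since an eigenvector is defined only up to sign.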
【 License 】
CC BY
© Richards et al. 2015