BMC Bioinformatics
CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes
Software
Paola Blanchette [1], Philip E Branton [2], Fabrice Sircoulomb [3], Robert Rottapel [3], Paul M Krzyzanowski [3], Vincent Ferretti [4], Ivan Borozan [4], Philippe Laflamme [4], Stuart N Watt [4], Shane Wilson [4]
[1] Department of Biochemistry, McGill University, McIntyre Medical Building, 3655 Promenade Sir William Osler, H3G 1Y6, Montreal, Quebec, Canada; Department of Biochemistry, McGill University, McIntyre Medical Building, 3655 Promenade Sir William Osler, H3G 1Y6, Montreal, Quebec, Canada; Department of Oncology, McGill University, McIntyre Medical Building, 3655 Promenade Sir William Osler, H3G 1Y6, Montreal, Quebec, Canada; The Goodman Cancer Research Centre, McGill University, McIntyre Medical Building, 3655 Promenade Sir William Osler, H3G 1Y6, Montreal, Quebec, Canada; Ontario Cancer Institute and the Campbell Family Cancer Research Institute, Toronto Medical Discovery Tower, University of Toronto, 101 College Street, Rm 8-703, M5G 1L7, Toronto, Ontario, Canada; Ontario Institute for Cancer Research, MaRS Centre, South Tower, 101 College Street, Suite 800, M5G 0A3, Toronto, Ontario, Canada
Keywords: Reference Sequence; Simulated Dataset; Benchmark Dataset; OVCA0016 Cell; Short Read Sequence
DOI: 10.1186/1471-2105-13-206
Received 2012-02-01; accepted 2012-07-18; published 2012
Source: Springer
【 Abstract 】

Background: It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools.

Results: Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage.

Conclusions: To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro.
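
The abstract mentions genome coverage as one of the metrics computed from pre-aligned reads. As a rough illustration only, the sketch below computes one common definition of coverage: the fraction of reference positions overlapped by at least one aligned read. This definition, the half-open interval convention, and the names `merge_intervals` and `coverage_fraction` are assumptions made for this example; they are not taken from CaPSID, which may define or compute the metric differently.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: an aligned read as a half-open
# interval [start, end) on a single reference sequence.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into disjoint ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if end <= start:
            continue  # skip empty or inverted intervals
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def coverage_fraction(intervals: Iterable[Interval], reference_length: int) -> float:
    """Fraction of reference positions covered by at least one interval."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    covered = 0
    for start, end in merge_intervals(intervals):
        # Clip to the reference bounds before counting bases.
        start = max(start, 0)
        end = min(end, reference_length)
        if end > start:
            covered += end - start
    return covered / reference_length


if __name__ == "__main__":
    reads = [(0, 50), (40, 100), (150, 200)]
    print(coverage_fraction(reads, 1000))
```

In practice the intervals would be read from a BAM file with a dedicated library; plain tuples are used here so the sketch stays self-contained.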

【 License 】

© Borozan et al.; licensee BioMed Central Ltd. 2012. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
