Journal article details
BMC Bioinformatics
ClinQC: a tool for quality control and cleaning of Sanger and NGS data in clinical research
Software
Stephan Pabinger1  Andreas Weinhäusel1  Albert Kriegner1  Ram Vinay Pandey2 
[1] Health & Environment Department, Molecular Diagnostics, AIT Austrian Institute of Technology GmbH, Vienna, Austria; Institut für Populationsgenetik, Vetmeduni Vienna, Veterinärplatz 1, A-1210, Vienna, Austria
Keywords: Sanger sequencing; Next generation sequencing; Quality control; Molecular diagnostic testing
DOI: 10.1186/s12859-016-0915-y
Received 2015-10-06, accepted 2016-01-28, published 2016
Source: Springer
【 Abstract 】

Background: Traditional Sanger sequencing has long been the gold-standard method for single-gene genetic testing in the clinic, but it is cumbersome and expensive when several genes must be tested in a heterogeneous disease such as cancer. Next Generation Sequencing (NGS) technologies, which produce data at unprecedented speed and in a cost-effective manner, have overcome this limitation of Sanger sequencing. For efficient and affordable genetic testing, NGS is therefore used as a complementary method alongside Sanger sequencing to identify and confirm disease-causing mutations in clinical research. However, identifying potential disease-causing mutations with high sensitivity and specificity requires high-quality sequencing data. Integrated software tools that can analyze Sanger and NGS data together, eliminate platform-specific sequencing errors and low-quality reads, and support the analysis of data sets from several samples or patients in a single run are lacking.

Results: We have developed ClinQC, a flexible and user-friendly pipeline for format conversion, quality control, trimming and filtering of raw sequencing data generated by Sanger sequencing and three NGS platforms: Illumina, 454 and Ion Torrent. First, ClinQC converts input read files from their native formats to a common FASTQ format and removes adapters and PCR primers. Next, it splits bar-coded samples, filters duplicates, contamination and low-quality sequences, and generates a QC report. ClinQC outputs high-quality reads in FASTQ format with Sanger quality encoding, which can be used directly in downstream analysis. It can analyze data from hundreds of samples or patients in a single run and generates unified output files for both Sanger and NGS sequencing data. Our tool is expected to be very useful for quality control and format conversion of Sanger and NGS data, facilitating improved downstream analysis and mutation screening.

Conclusions: ClinQC is a powerful and easy-to-use pipeline for quality control and trimming in clinical research. ClinQC is written in Python with multiprocessing capability, runs on all major operating systems and is available at https://sourceforge.net/projects/clinqc.
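
The abstract mentions re-encoding reads to Sanger quality encoding, trimming, and filtering of low-quality sequences. As a rough illustration of what such steps involve, the following minimal Python sketch re-encodes a Phred+64 quality string as Phred+33, trims low-quality bases from the 3' end, and applies a mean-quality filter. It is not taken from ClinQC: the function names, the threshold of 20, and the trimming rule are all assumptions chosen for illustration only.

```python
from typing import Tuple


def phred64_to_phred33(qual: str) -> str:
    """Re-encode a Phred+64 quality string as Sanger (Phred+33)."""
    return "".join(chr(ord(c) - 64 + 33) for c in qual)


def trim_3prime(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Drop trailing bases whose Phred+33 score is below min_q (hypothetical rule)."""
    end = len(qual)
    while end > 0 and ord(qual[end - 1]) - 33 < min_q:
        end -= 1
    return seq[:end], qual[:end]


def passes_filter(qual: str, min_mean_q: float = 20.0, min_len: int = 1) -> bool:
    """Keep a read only if it is long enough and its mean Phred score is high enough."""
    if len(qual) < min_len:
        return False
    mean_q = sum(ord(c) - 33 for c in qual) / len(qual)
    return mean_q >= min_mean_q


if __name__ == "__main__":
    sanger_qual = phred64_to_phred33("hhhB")      # toy Phred+64 quality string
    seq, qual = trim_3prime("ACGT", sanger_qual)  # trims the low-quality last base
    print(seq, qual, passes_filter(qual))
```

A real pipeline would read records from FASTQ files and would usually detect the input encoding rather than assume it; the sketch only shows the per-read arithmetic.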

【 License 】

CC BY   
© Pandey et al. 2016
