Dissertation Details
Methods for Identifying Variation in Large-Scale Genomic Data
Pritt, Mark Jacob; Langmead, Benjamin
Johns Hopkins University
Keywords: computational genomics; compression; graph genome; alignment; copy number analysis; Computer Science
Others  :  https://jscholarship.library.jhu.edu/bitstream/handle/1774.2/60131/PRITT-DISSERTATION-2018.pdf?sequence=1&isAllowed=y
Switzerland|English
Source: JOHNS HOPKINS DSpace Repository
【 Abstract 】
The rise of next-generation sequencing has produced an abundance of data with almost limitless analysis applications. As sequencing technology decreases in cost and increases in throughput, the amount of available data is quickly outpacing improvements in processor speed. Analysis methods must also increase in scale to remain computationally tractable. At the same time, larger datasets and the availability of population-wide data offer a broader context with which to improve accuracy. This thesis presents three tools that improve the scalability of sequencing data storage and analysis. First, a lossy compression method for RNA-seq alignments offers extreme size reduction without compromising downstream accuracy of isoform assembly and quantitation. Second, I describe a graph genome analysis tool that filters population variants for optimal aligner performance. Finally, I offer several methods for improving CNV segmentation accuracy, including borrowing strength across samples to overcome the limitations of low coverage. These methods compose a practical toolkit for improving the computational power of genomic analysis.
Attachments
Files Size Format View
Methods for Identifying Variation in Large-Scale Genomic Data 3230KB PDF download