Journal Article Details
PeerJ
minSNPs: an R package for the derivation of resolution-optimised SNP sets from microbial genomic data
article
Kian Soon Hoon [1], Deborah C. Holt [1], Sarah Auburn [1], Peter Shaw [5], Philip M. Giffard [1]
[1] Menzies School of Health Research, Charles Darwin University;CDU Menzies School of Medicine, Faculty of Health, Charles Darwin University;Mahidol-Oxford Tropical Medicine Research Unit, Mahidol University;Centre for Tropical Medicine and Global Health, University of Oxford;Oujian Laboratory
Keywords: SNPs; Genome; Microbial; SNP mining; SNP genotyping; Staphylococcus; Plasmodium; SNP matrices; Resolution optimised; Genome alignments
DOI: 10.7717/peerj.15339
Subject classification: Social Sciences, Humanities and Arts (General)
Source: Inra
【 Abstract 】

Here, we present the R package minSNPs. This is a re-development of a previously described Java application named Minimum SNPs. MinSNPs assembles resolution-optimised sets of single nucleotide polymorphisms (SNPs) from sequence alignments such as genome-wide orthologous SNP matrices. MinSNPs can derive sets of SNPs optimised for discriminating any user-defined combination of sequences from all others. Alternatively, SNP sets may be optimised to discriminate all sequences from all other sequences, i.e., to maximise diversity. MinSNPs encompasses functions that facilitate rapid and flexible SNP mining, and clear and comprehensive presentation of the results. The running time of minSNPs scales linearly with input data volume and with the numbers of SNPs and SNP sets specified in the output. MinSNPs was tested using a previously reported orthologous SNP matrix of Staphylococcus aureus and an orthologous SNP matrix of 3,279 genomes with 164,335 SNPs assembled from four S. aureus short read genomic data sets. MinSNPs was shown to be effective for deriving discriminatory SNP sets for potential surveillance targets and for identifying SNP sets optimised to discriminate isolates from different clonal complexes. MinSNPs was also tested with a large Plasmodium vivax orthologous SNP matrix. A set of five SNPs was derived that reliably indicated the country of origin within three south-east Asian countries. In summary, we report the capacity to assemble comprehensive SNP matrices that effectively capture microbial genomic diversity, and to rapidly and flexibly mine these entities for optimised marker sets.
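
To make the idea of a diversity-maximising SNP set concrete, the following is a minimal Python sketch of a greedy search over alignment columns. It is not the minSNPs API and does not reproduce the package's implementation. The function names `simpson_index` and `greedy_snp_set`, the toy alignment, and the choice of Simpson's index of diversity as the resolution measure are all assumptions made for illustration; the abstract does not name the metric.

```python
from collections import Counter


def simpson_index(profiles):
    """Simpson's index of diversity for a list of allele profiles.

    Assumed here as the resolution measure (illustrative choice only).
    """
    n = len(profiles)
    if n < 2:
        return 0.0
    counts = Counter(profiles).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


def greedy_snp_set(alignment, max_snps):
    """Greedily pick alignment columns that most increase the index.

    `alignment` maps a sequence name to an equal-length string of alleles.
    Returns the chosen 0-based column indices and the final index value.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    chosen = []
    best_score = 0.0
    for _ in range(max_snps):
        best_pos, best_new = None, best_score
        for pos in range(length):
            if pos in chosen:
                continue
            cols = chosen + [pos]
            profiles = [tuple(alignment[n][i] for i in cols) for n in names]
            score = simpson_index(profiles)
            if score > best_new:  # strict improvement; earliest column wins ties
                best_pos, best_new = pos, score
        if best_pos is None:  # no remaining column adds resolution
            break
        chosen.append(best_pos)
        best_score = best_new
    return chosen, best_score


# Hypothetical toy SNP matrix: four sequences, four positions.
toy = {
    "seqA": "AACG",
    "seqB": "AATG",
    "seqC": "GACG",
    "seqD": "GATG",
}
snps, score = greedy_snp_set(toy, max_snps=3)
print(snps, score)
```

In this toy matrix, columns 0 and 2 together separate all four sequences, so the search stops after two SNPs even though three were allowed. Optimising for a user-defined group rather than for overall diversity would need a different scoring function, for example the fraction of non-target sequences that differ from the target group at the chosen positions.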

【 License 】

CC BY   
