Journal Article Details
PeerJ
RelocaTE2: a high resolution transposable element insertion site mapping tool for population resequencing
article
Jinfeng Chen [1], Travis R. Wrightsman [3], Susan R. Wessler [2], Jason E. Stajich [1]
[1] Department of Plant Pathology & Microbiology, University of California;Institute for Integrative Genome Biology, University of California;Department of Botany and Plant Sciences, University of California
Keywords: Annotation; Diversity; Parallel processing; Transposons; Population genomics; Short read; Bioinformatics; Rice; Resequencing
DOI: 10.7717/peerj.2942
Subject classification: Social Sciences, Humanities and Arts (General)
Source: Inra
【 Abstract 】

Background: Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs on gene regulation and on generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated. Inexpensive, deep sequencing technology has made it affordable to apply population genetic methods to whole genomes, using methods that identify single nucleotide and insertion/deletion polymorphisms. However, identifying TE polymorphisms, particularly transposition events or non-reference insertion sites, can be challenging due to the repetitive nature of these sequences, which hampers both the sensitivity and specificity of analysis tools.
Methods: We have developed the tool RelocaTE2 for identification of TE insertion sites at high sensitivity and specificity. RelocaTE2 searches for known TE sequences in whole genome sequencing reads from second generation sequencing platforms such as Illumina. These sequence reads are used as seeds to pinpoint chromosome locations where TEs have transposed. RelocaTE2 detects the target site duplication (TSD) of TE insertions, allowing it to report TE polymorphism loci with single base pair precision.
Results and Discussion: The performance of RelocaTE2 is evaluated using both simulated and real sequence data. RelocaTE2 demonstrates a high level of sensitivity and specificity, particularly when the sequence coverage is not shallow. In comparison to other tools tested, RelocaTE2 achieves the best balance between sensitivity and specificity. In particular, RelocaTE2 performs best in prediction of TSDs for TE insertions. Even in highly repetitive regions, such as those tested on rice chromosome 4, RelocaTE2 is able to report up to 95% of simulated TE insertions with a false positive rate below 0.1% using 10-fold genome coverage resequencing data.
RelocaTE2 provides a robust solution to identify TE insertion sites and can be incorporated into analysis workflows in support of describing the complete genotype from light coverage genome sequencing.
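
The seed-and-TSD idea described in the Methods can be illustrated with a minimal Python sketch. This is not RelocaTE2's code: the function names, the 0-based inclusive coordinate convention, and the toy sequences are all assumptions made for illustration. It assumes that reads spanning the left junction have already been trimmed of their TE portion and aligned so that the flank ends at `left_flank_end`, and that reads spanning the right junction start at `right_flank_start`. When the two flanks overlap on the reference, the overlapping bases are taken as the candidate TSD.

```python
def flank_before_te(read, te_five_prime):
    """Return the genomic flank that precedes the TE's 5' end in a read.

    Hypothetical helper: uses exact string matching, whereas a real tool
    would tolerate mismatches. Returns None when the TE end is absent or
    when no flank precedes it.
    """
    idx = read.find(te_five_prime)
    return read[:idx] if idx > 0 else None


def infer_tsd(reference, left_flank_end, right_flank_start):
    """Infer a candidate TSD from the two junction positions.

    left_flank_end:    last reference base (0-based) covered by the flank
                       of reads that run into the TE's 5' end.
    right_flank_start: first reference base (0-based) covered by the flank
                       of reads that run out of the TE's 3' end.
    Returns (start, end, sequence) with inclusive coordinates, or None
    when the flanks do not overlap.
    """
    if right_flank_start > left_flank_end:
        return None
    tsd = reference[right_flank_start:left_flank_end + 1]
    return right_flank_start, left_flank_end, tsd


# Toy data (assumed, not from the paper).
reference = "ACGTTTAAGGCCGTA"
flank = flank_before_te("GTTTAAGGCCAG", "GGCCAG")
candidate = infer_tsd(reference, left_flank_end=7, right_flank_start=5)
```

In this toy case the two flanks share reference positions 5 through 7, so the candidate TSD is three bases long and the insertion site is resolved to a single reference interval.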

【 License 】

CC BY   
