Journal article details
PeerJ
K-mer-based machine learning method to classify LTR-retrotransposons in plant genomes
article
Simon Orozco-Arias [1]; Mariana S. Candamil-Cortés [1]; Paula A. Jaimes [1]; Johan S. Piña [1]; Reinel Tabares-Soto [3]; Romain Guyot [3]; Gustavo Isaza [2]
[1] Department of Computer Science, Universidad Autónoma de Manizales;Department of Systems and Informatics, Universidad de Caldas;Department of Electronics and Automation, Universidad Autónoma de Manizales;Institut de Recherche pour le Développement
Keywords: Transposable elements; LTR retrotransposons; Plant genomes; Machine learning; Classification; Free-alignment approach; k-mer based method
DOI: 10.7717/peerj.11456
Subject classification: Social Sciences, Humanities and Arts (General)
Source: Inra
【 Abstract 】

More plant genomes become available in public databases every day, and additional massive sequencing projects (i.e., projects that aim to sequence thousands of individuals) are being formulated and released. Nevertheless, there are not enough automatic tools to analyze this large amount of genomic information. LTR retrotransposons are the most frequent repetitive sequences in plant genomes; however, their detection and classification are commonly performed using semi-automatic and time-consuming programs. Although several bioinformatic tools follow different approaches to detect and classify them, none of these tools can obtain accurate results on its own. Here, we used machine learning algorithms based on k-mer counts to distinguish LTR retrotransposons from other genomic sequences and to classify them into lineages/families with an F1-score of 95%, contributing to the development of an alignment-free, automatic method to analyze these sequences.
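
As an illustration of the k-mer counting idea mentioned in the abstract, the following minimal Python sketch turns a DNA sequence into a fixed-length vector of k-mer frequencies that a machine learning classifier could consume. It is not the authors' implementation: the function names `kmer_counts` and `feature_vector`, the default `k_max=3`, and the choice to skip windows containing ambiguous bases are all assumptions made for this example.

```python
from itertools import product


def kmer_counts(sequence, k_max=3, alphabet="ACGT"):
    """Count overlapping k-mers for every k from 1 to k_max (hypothetical helper)."""
    seq = sequence.upper()
    counts = {}
    for k in range(1, k_max + 1):
        # Start every possible k-mer at zero so all sequences share the same features.
        for kmer in product(alphabet, repeat=k):
            counts["".join(kmer)] = 0
        # Slide a window of width k over the sequence, one base at a time.
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            if window in counts:  # assumption: windows with N or other symbols are skipped
                counts[window] += 1
    return counts


def feature_vector(sequence, k_max=3):
    """Return the counts in a fixed order: shorter k-mers first, then alphabetical."""
    counts = kmer_counts(sequence, k_max)
    ordered_keys = sorted(counts, key=lambda kmer: (len(kmer), kmer))
    return [counts[key] for key in ordered_keys]


if __name__ == "__main__":
    vector = feature_vector("ACGTACGTTAGC", k_max=3)
    print(len(vector))  # 4 + 16 + 64 features
```

Because every sequence maps to a vector of the same length regardless of its own length, such vectors can be passed to a standard classifier without aligning the sequences first.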

【 License 】

CC BY   
