Soybean (Glycine max) is an important crop worldwide, valued both as a protein source and as a rotation crop. Soybean variant discovery is often conducted using alignment-based methods, which largely yield short variants such as SNPs and short indels. We conducted variant discovery on 481 soybean lines using both alignment-based and assembly-based methods. We used the Sentieon Haplotyper algorithm for alignment-based variant calling and Cortex-var for assembly-based variant discovery, and found many more short variants than structural variants: medians of 2,728,393 and 3,972 variants, respectively. Additionally, we provide a user-friendly workflow script, together with full documentation, for Cortex-var assembly-based variant discovery.

Post-variant-discovery analysis was also conducted on the same 481 soybean lines, with a focus on transposable element activity. Structural variants were filtered for transposable elements, and the transposable elements identified were compared between elite soybean lines and Glycine soja lines. Transposable element activity in G. soja lines was statistically higher than in the elite soybean lines. Consequently, structural variant information can be used to distinguish elite soybean lines, landraces, and G. soja lines. We also found some transposable elements that may disrupt genes with functions such as carbohydrate metabolism and embryo development, with high activity in the leaves, pods, and pod shells.
Whole genomic structural variant calling in soybean: Analysis on 481 different soybean lines