Journal Article Details
BMC Genomics
Short tandem repeat number estimation from paired-end reads for multiple individuals by considering coalescent tree
Research
Naoki Nariai [1], Takahiro Mimori [2], Kaname Kojima [2], Yosuke Kawai [2], Masao Nagasaki [2], Takanori Hasegawa [2]
[1] Institute for Genomic Medicine, University of California, San Diego, 9500 Gilman Drive #0761, 92093-0761, San Diego, USA
[2] Tohoku Medical Megabank Organization, Tohoku University, 2-1, Seiryo-machi, Aoba-ku, 980-8573, Sendai, Japan
Keywords: High-throughput sequencing; Short tandem repeat; Coalescent theory
DOI: 10.1186/s12864-016-2821-0
Source: Springer
【 Abstract 】

Background: Two types of approaches are mainly considered for estimating repeat numbers in short tandem repeat (STR) regions from high-throughput sequencing data: approaches that directly count the repeat patterns in sequence reads spanning the region, and approaches that detect the difference between the insert size inferred from aligned paired-end reads and the actual insert size. Although repeat numbers estimated with the former approaches are highly accurate, the size of target STR regions is limited to the length of sequence reads. The latter approaches, on the other hand, can handle STR regions longer than the sequence reads, but their repeat number estimates are less accurate than those of the former approaches.

Results: We proposed a new statistical model named coalescentSTR that estimates repeat numbers from paired-end read distances for multiple individuals simultaneously by connecting the read generative model for each individual with their genealogy. In the model, the genealogy is represented by handling coalescent trees as hidden variables, and the summation over the hidden variables is taken over coalescent trees sampled with Markov chain Monte Carlo based on phased genotypes located around a target STR region. In the sampled coalescent trees, repeat number information from insert size data is propagated, and more accurate estimation of repeat numbers is expected for STR regions longer than the length of sequence reads. To find the repeat numbers that maximize the likelihood of the model, we proposed a state-of-the-art belief propagation algorithm on sampled coalescent trees.

Conclusions: We verified the effectiveness of the proposed approach through comparison with existing methods, using simulation datasets and real whole genome and whole exome data for HapMap individuals analyzed in the 1000 Genomes Project.
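
The insert-size idea mentioned in the Background can be illustrated with a minimal sketch. This is not the coalescentSTR model: it uses no genealogy, no coalescent trees, and no belief propagation, and it treats a single haploid allele. It only assumes that a sample carrying more repeat units than the reference yields paired-end reads that map closer together on the reference than the library's true mean insert size. The function name and all parameters (`mapped_distances`, `library_mean_insert`, `ref_repeats`, `motif_len`) are hypothetical and chosen for illustration.

```python
from statistics import mean


def estimate_repeat_number(mapped_distances, library_mean_insert,
                           ref_repeats, motif_len):
    """Naive single-allele repeat number estimate from paired-end distances.

    Assumption (illustrative only): every read pair spans the STR region,
    so the mapped distance on the reference is shortened by
    (sample_repeats - ref_repeats) * motif_len relative to the true insert.
    """
    if not mapped_distances:
        raise ValueError("need at least one read pair spanning the STR")
    if motif_len <= 0:
        raise ValueError("motif_len must be positive")

    # Expansion in the sample shrinks the apparent insert size on the reference.
    shift_bp = library_mean_insert - mean(mapped_distances)

    # Convert the base-pair shift into a whole number of repeat units.
    delta_repeats = round(shift_bp / motif_len)

    # A repeat number cannot be negative.
    return max(0, ref_repeats + delta_repeats)


# Toy data: library mean insert of 400 bp, reference has 10 copies of a 4 bp motif.
distances = [388, 392, 385, 391, 384]  # mean is 388, i.e. 12 bp shorter
estimate = estimate_repeat_number(distances, 400, 10, 4)
print(estimate)  # 13
```

With only a handful of read pairs the mean distance is noisy, which is one way to see why estimates from insert size alone are less accurate than direct counting, and why the abstract's model shares information across individuals.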

【 License 】

CC BY   
© The Author(s) 2016
