Journal Article Details
Genome Biology
Comparison of gene clustering criteria reveals intrinsic uncertainty in pangenome analyses
Research
Sara González-Bodí [1], Jaime Huerta-Cepas [1], Saioa Manzano-Morales [2], Yang Liu [3], Jaime Iranzo [4]
[1] Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
[2] Barcelona Supercomputing Centre (BSC-CNS) - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
[3] Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Centre, South China Agricultural University, Guangzhou, China
[4] Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, Zaragoza, Spain
Keywords: Pangenome; Orthology; Comparative genomics; Homology; Core gene; Accessory genome; Genome plasticity; MAG
DOI: 10.1186/s13059-023-03089-3
Received 2022-12-19; accepted 2023-10-16; published 2023
Source: Springer
【 Abstract 】

Background: A key step for comparative genomics is to group open reading frames into functionally and evolutionarily meaningful gene clusters. Gene clustering is complicated by intraspecific duplications and horizontal gene transfers that are frequent in prokaryotes. In consequence, gene clustering methods must deal with a trade-off between identifying vertically transmitted representatives of multicopy gene families, which are recognizable by synteny conservation, and retrieving complete sets of species-level orthologs. We studied the implications of adopting homology, orthology, or synteny conservation as formal criteria for gene clustering by performing comparative analyses of 125 prokaryotic pangenomes.

Results: Clustering criteria affect pangenome functional characterization, core genome inference, and reconstruction of ancestral gene content to different extents. Species-wise estimates of pangenome and core genome sizes change by the same factor when using different clustering criteria, allowing robust cross-species comparisons regardless of the clustering criterion. However, cross-species comparisons of genome plasticity and functional profiles are substantially affected by inconsistencies among clustering criteria. Such inconsistencies are driven not only by mobile genetic elements, but also by genes involved in defense, secondary metabolism, and other accessory functions. In some pangenome features, the variability attributed to methodological inconsistencies can even exceed the effect sizes of ecological and phylogenetic variables.

Conclusions: Choosing an appropriate criterion for gene clustering is critical to conduct unbiased pangenome analyses. We provide practical guidelines to choose the right method depending on the research goals and the quality of genome assemblies, and a benchmarking dataset to assess the robustness and reproducibility of future comparative studies.

【 License 】

CC BY
© The Author(s) 2023
