Journal article details
PeerJ
The regulatory genome constrains protein sequence evolution: implications for the search for disease-associated genes
article
Patrick Evans [1], Nancy J. Cox [1], Eric R. Gamazon [1]
[1] Division of Genetic Medicine, Vanderbilt University Medical Center; Clare Hall, University of Cambridge; MRC Epidemiology Unit, University of Cambridge; Data Science Institute, Vanderbilt University
Keywords: Evolution; Transcriptome; Genomics; TWAS; GWAS; Developmental disorder; Mendelian disease; Complex traits; PrediXcan; GTEx
DOI: 10.7717/peerj.9554
Subject classification: Social sciences, humanities and arts (general)
Source: Inra
【 Abstract 】

The development of explanatory models of protein sequence evolution has broad implications for our understanding of cellular biology, population history, and disease etiology. Here we analyze the GTEx transcriptome resource to quantify the effect of the transcriptome on protein sequence evolution in a multi-tissue framework. We find substantial variation among the central nervous system tissues in the effect of expression variance on evolutionary rate, with highly variable genes in the cortex showing significantly greater purifying selection than highly variable genes in subcortical regions (Mann–Whitney U p = 1.4 × 10^-4). The remaining tissues cluster in observed expression correlation with evolutionary rate, enabling evolutionary analysis of genes in diverse physiological systems, including digestive, reproductive, and immune systems. Importantly, the tissue in which a gene attains its maximum expression variance significantly varies (p = 5.55 × 10^-284) with evolutionary rate, suggesting a tissue-anchored model of protein sequence evolution. Using a large-scale reference resource, we show that the tissue-anchored model provides a transcriptome-based approach to predicting the primary affected tissue of developmental disorders. Using gradient boosted regression trees to model evolutionary rate under a range of model parameters, selected features explain up to 62% of the variation in evolutionary rate and provide additional support for the tissue model. Finally, we investigate several methodological implications, including the importance of evolutionary-rate-aware gene expression imputation models using genetic data for improved search for disease-associated genes in transcriptome-wide association studies. Collectively, this study presents a comprehensive transcriptome-based analysis of a range of factors that may constrain molecular evolution and proposes a novel framework for the study of gene function and disease mechanism.

【 License 】

CC BY   
