Conference Paper Details
Pacific Symposium on Biocomputing 2004
Accurate Classification of Protein Structural Families Using Coherent Subgraph Analysis
J Huan ; W Wang ; A Washington ; J Prins ; R Shah ; A Tropsha
PID: 13870
Source: CEUR
【 Abstract 】

Protein structural annotation and classification is an important problem in bioinformatics. We report on the development of an efficient subgraph mining technique and its application to finding characteristic substructural patterns within protein structural families. In our method, protein structures are represented by graphs where the nodes are residues and the edges connect residues found within a certain distance from each other. Application of subgraph mining to proteins is challenging for a number of reasons: (1) protein graphs are large and complex, (2) current protein databases are large and continue to grow rapidly, and (3) only a small fraction of the frequent subgraphs among the huge pool of all possible subgraphs could be significant in the context of protein classification. To address these challenges, we have developed an information theoretic model called coherent subgraph mining. From information theory, the entropy of a random variable X measures the information content carried by X and the Mutual Information (MI) between two random variables X and Y measures the correlation between X and Y. We define a subgraph X as coherent if it is strongly correlated with every sufficiently large sub-subgraph Y embedded in it. Based on the MI metric, we have designed a search scheme that only reports coherent subgraphs. To determine the significance of coherent protein subgraphs, we have conducted an experimental study in which all coherent subgraphs were identified in several protein structural families annotated in the SCOP database (Murzin et al, 1995). The Support Vector Machine algorithm was used to classify proteins from different families under the binary classification scheme. We find that this approach identifies spatial motifs unique to individual SCOP families and affords excellent discrimination between families.
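
The coherence criterion in the abstract can be illustrated with a small sketch. The abstract does not say how X and Y are turned into random variables, so this example assumes each one is a binary indicator of whether a pattern occurs in a given graph of the database. The function names, the use of graph ids 0..n-1, and the `threshold` parameter are hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def mutual_information(occ_x, occ_y, n_graphs):
    """Mutual information (in bits) between two binary occurrence indicators.

    occ_x, occ_y: sets of graph ids (assumed to be 0..n_graphs-1) in which
    pattern X and pattern Y occur. This encoding is an assumption.
    """
    mi = 0.0
    for x_present in (True, False):
        for y_present in (True, False):
            # Number of graphs showing this joint presence/absence outcome.
            joint = sum(
                1
                for g in range(n_graphs)
                if ((g in occ_x) == x_present) and ((g in occ_y) == y_present)
            )
            if joint == 0:
                continue  # the term 0 * log(0) is taken as 0
            n_x = len(occ_x) if x_present else n_graphs - len(occ_x)
            n_y = len(occ_y) if y_present else n_graphs - len(occ_y)
            p_xy = joint / n_graphs
            # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with counts.
            mi += p_xy * math.log2(joint * n_graphs / (n_x * n_y))
    return mi


def is_coherent(occ_x, sub_occurrences, n_graphs, threshold):
    """Hypothetical check: X is coherent if its MI with every supplied
    sub-subgraph occurrence set reaches the threshold."""
    return all(
        mutual_information(occ_x, occ_y, n_graphs) >= threshold
        for occ_y in sub_occurrences
    )
```

With four graphs, a pattern and a sub-pattern that occur in exactly the same two graphs share one full bit of information, while two patterns whose occurrences are statistically independent share none. Selecting the "sufficiently large" sub-subgraphs and choosing the threshold are left to the caller here, since the abstract does not specify them.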
