期刊论文详细信息
BMC Bioinformatics
Automatic classification of protein structures relying on similarities between alignments
Research Article
Joël Pothier1  Guillaume Santini2  Henry Soldano3 
[1] UPMC, Université Paris 06, Atelier de BioInformatique, F-75005, Paris, France;Université Paris 13, Sorbonne Paris Cité, Laboratoire d’Informatique de Paris-Nord (LIPN), CNRS(, UMR 7030), F-93430, Villetaneuse, France;Université Paris 13, Sorbonne Paris Cité, Laboratoire d’Informatique de Paris-Nord (LIPN), CNRS(, UMR 7030), F-93430, Villetaneuse, France;UPMC, Université Paris 06, Atelier de BioInformatique, F-75005, Paris, France;
关键词: Line Graph;    Classification Procedure;    Pairwise Similarity;    Pairwise Constraint;    Ternary Relation;   
DOI  :  10.1186/1471-2105-13-233
 received in 2011-11-21, accepted in 2012-08-20,  发布年份 2012
来源: Springer
PDF
【 摘 要 】

BackgroundIdentification of protein structural cores requires isolation of sets of proteins all sharing a same subset of structural motifs. In the context of an ever growing number of available 3D protein structures, standard and automatic clustering algorithms require adaptations so as to allow for efficient identification of such sets of proteins.ResultsWhen considering a pair of 3D structures, they are stated as similar or not according to the local similarities of their matching substructures in a structural alignment. This binary relation can be represented in a graph of similarities where a node represents a 3D protein structure and an edge states that two 3D protein structures are similar. Therefore, classifying proteins into structural families can be viewed as a graph clustering task. Unfortunately, because such a graph encodes only pairwise similarity information, clustering algorithms may include in the same cluster a subset of 3D structures that do not share a common substructure. In order to overcome this drawback we first define a ternary similarity on a triple of 3D structures as a constraint to be satisfied by the graph of similarities. Such a ternary constraint takes into account similarities between pairwise alignments, so as to ensure that the three involved protein structures do have some common substructure. We propose hereunder a modification algorithm that eliminates edges from the original graph of similarities and gives a reduced graph in which no ternary constraints are violated. Our approach is then first to build a graph of similarities, then to reduce the graph according to the modification algorithm, and finally to apply to the reduced graph a standard graph clustering algorithm. Such method was used for classifying ASTRAL-40 non-redundant protein domains, identifying significant pairwise similarities with Yakusa, a program devised for rapid 3D structure alignments.ConclusionsWe show that filtering similarities prior to standard graph based clustering process by applying ternary similarity constraints i) improves the separation of proteins of different classes and consequently ii) improves the classification quality of standard graph based clustering algorithms according to the reference classification SCOP.

【 授权许可】

CC BY   
© Santini et al.; licensee BioMed Central Ltd. 2012

【 预 览 】
附件列表
Files Size Format View
RO202311094120423ZK.pdf 2038KB PDF download
【 参考文献 】
  • [1]
  • [2]
  • [3]
  • [4]
  • [5]
  • [6]
  • [7]
  • [8]
  • [9]
  • [10]
  • [11]
  • [12]
  • [13]
  • [14]
  • [15]
  • [16]
  • [17]
  • [18]
  • [19]
  • [20]
  • [21]
  • [22]
  • [23]
  • [24]
  • [25]
  • [26]
  文献评价指标  
  下载次数:0次 浏览次数:0次