dc.creator: Zintzaras, E. (en)
dc.date.accessioned: 2015-11-23T10:55:05Z
dc.date.available: 2015-11-23T10:55:05Z
dc.date.issued: 2008
dc.identifier: 10.1016/j.compbiomed.2008.01.006
dc.identifier.issn: 0010-4825
dc.identifier.uri: http://hdl.handle.net/11615/34923
dc.description.abstract: A methodology for testing the correlation between the sequence and structure distances of proteins is proposed. Structure distances were derived by applying a forward growing classification tree algorithm to defined physico-chemical and geometrical properties of the structures. The structure distance for every pair of proteins was defined as the number of intermediate nodes between them in the tree. Sequence distances were derived using pairwise sequence alignment. The correlation between the sequence distance matrix and the structure distance matrix was then tested using a Monte Carlo permutation test. The results were compared to those obtained when the double dynamic structure alignment method (SSAP) was applied. The methodology was applied to a data set of 74 proteins belonging to 14 families. The classification tree was able to identify the protein families (the misclassification rate was R = 1.4%), and a 74 × 74 structure distance matrix was produced. For every pair of protein sequences a dissimilarity score was recorded, and a sequence distance matrix was produced. The Monte Carlo permutation test produced a correlation coefficient r = 0.403 (P < 0.001). The SSAP method produced similar results. The proposed methodology may assist in assessing whether protein sequence distances can be predictors of protein structure distances. © 2008 Elsevier Ltd. All rights reserved. (en)
dc.source.uri: <Go to ISI>://WOS:000255450600007
dc.subject: correlation (en)
dc.subject: classification tree (en)
dc.subject: protein sequence (en)
dc.subject: protein structure (en)
dc.subject: distance matrix (en)
dc.subject: prediction (en)
dc.subject: Monte Carlo (en)
dc.subject: permutation test (en)
dc.subject: STRUCTURE ALIGNMENT (en)
dc.subject: STRUCTURE PREDICTION (en)
dc.subject: SAMPLE-SIZE (en)
dc.subject: DATA-BANK (en)
dc.subject: ALGORITHMS (en)
dc.subject: Biology (en)
dc.subject: Computer Science, Interdisciplinary Applications (en)
dc.subject: Engineering, Biomedical (en)
dc.subject: Mathematical & Computational Biology (en)
dc.title: Classification tree based protein structure distances for testing sequence-structure correlation (en)
dc.type: journalArticle (en)
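
The abstract names a Monte Carlo permutation test for the correlation between the sequence and structure distance matrices but does not give its details. The sketch below shows one common way such a test can be set up (a Mantel-style test) and is not taken from the paper: the Pearson statistic, the joint row/column permutation, the one-sided p-value, and all function names and toy data are assumptions made for illustration.

```python
import math
import random


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def permutation_test(seq_dist, struct_dist, n_permutations=999, seed=0):
    """Mantel-style test (assumed design, not the paper's exact procedure).

    Relabels the proteins of one matrix at random and counts how often the
    permuted correlation is at least as large as the observed one.
    """
    rng = random.Random(seed)
    n = len(seq_dist)
    seq_vec = upper_triangle(seq_dist)
    observed = pearson(seq_vec, upper_triangle(struct_dist))

    at_least_as_large = 0
    for _ in range(n_permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Permute rows and columns together so the matrix stays a valid
        # distance matrix for the relabelled proteins.
        permuted = [[struct_dist[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(seq_vec, upper_triangle(permuted)) >= observed:
            at_least_as_large += 1

    # Counting the observed arrangement itself keeps the p-value above zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Toy data (hypothetical): six "proteins" placed on a line, with structure
# distances exactly proportional to sequence distances.
points = [0, 1, 2, 3, 4, 5]
seq = [[abs(a - b) for b in points] for a in points]
struct = [[2 * abs(a - b) for b in points] for a in points]
r, p = permutation_test(seq, struct)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Permuting rows and columns jointly, rather than shuffling individual matrix entries, respects the fact that the pairwise distances of one protein are not independent of each other.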

