Abstract
We derive new quantitative descriptors for the 20 naturally occurring amino acids based on multidimensional scaling of 237 physical-chemical properties. We show that a five-dimensional property space can be constructed such that the amino acids retain a spatial distribution similar to that in the original high-dimensional property space. The properties that correlate well with the five major components are hydrophobicity, size, the preference of amino acids to occur in α-helices, the number of degenerate triplet codons, and the frequency of occurrence of amino acid residues in β-strands. Distances computed for pairs of amino acids in the five-dimensional property space are highly correlated with the corresponding scores from similarity matrices derived from sequence and 3D structure comparison. We used the five-dimensional property distances to cluster the amino acids into groups depending on a cutoff distance. These groups define a reduced amino acid alphabet for protein folding studies. Our descriptors should provide a quantitative means to identify property motifs in sequences of protein families. Electronic supplementary material to this paper can be obtained by using the Springer Link server located at http://dx.doi.org/10.1007/s00894-001-0058-5.
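The clustering step described in the abstract (grouping items by a cutoff on their five-dimensional distances) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not name the clustering algorithm, so single-linkage grouping is assumed here, and the function name `cluster_by_cutoff`, the labels `p1` to `p5`, and all coordinates are hypothetical placeholders rather than the descriptors derived in the paper.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def cluster_by_cutoff(coords, cutoff):
    """Group labels connected by chains of pairwise distances <= cutoff.

    Assumption: single-linkage grouping (connected components), implemented
    with a small union-find. The paper's actual algorithm may differ.
    """
    parent = {label: label for label in coords}

    def find(label):
        # Follow parent links to the group representative (path halving).
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    # Merge the groups of every pair that lies within the cutoff distance.
    for a, b in combinations(coords, 2):
        if dist(coords[a], coords[b]) <= cutoff:
            parent[find(a)] = find(b)

    # Collect members per representative and return them in sorted order.
    groups = {}
    for label in coords:
        groups.setdefault(find(label), []).append(label)
    return sorted(sorted(members) for members in groups.values())


# Placeholder 5-D points: NOT the amino acid descriptors from the paper.
toy = {
    "p1": (0.0, 0.0, 0.0, 0.0, 0.0),
    "p2": (1.0, 0.0, 0.0, 0.0, 0.0),
    "p3": (0.0, 5.0, 0.0, 0.0, 0.0),
    "p4": (0.0, 5.0, 1.0, 0.0, 0.0),
    "p5": (9.0, 9.0, 9.0, 9.0, 9.0),
}

print(cluster_by_cutoff(toy, cutoff=1.5))
print(cluster_by_cutoff(toy, cutoff=5.0))
```

With these placeholder points, a small cutoff keeps the two close pairs apart, while a larger cutoff merges them into one group. This mirrors how the size of a reduced alphabet would depend on the chosen cutoff distance.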
Original language | English (US) |
---|---|
Pages (from-to) | 445-453 |
Number of pages | 9 |
Journal | Journal of Molecular Modeling |
Volume | 7 |
Issue number | 12 |
DOIs | |
State | Published - 2001 |
Externally published | Yes |
Keywords
- Amino acid
- BLOSUM
- Cluster analysis
- Multidimensional scaling
- PAM
- Physical-chemical properties
- Substitution matrices
ASJC Scopus subject areas
- Catalysis
- Computer Science Applications
- Physical and Theoretical Chemistry
- Organic Chemistry
- Inorganic Chemistry
- Computational Theory and Mathematics