The program Align-GVGD (A-GVGD) was used to classify all possible missense substitutions in p53 as functional or non-functional.
A-GVGD scores missense substitutions against the range of variation present at their position in a multiple sequence alignment (MSA). It has previously been applied to the tumor suppressor protein BRCA1 (Tavtigian et al., 2005), where it allowed the identification of 8 previously unclassified neutral mutants. The program uses the MSA and the Grantham matrix to determine the conservation of amino-acid residues in a protein. The Grantham matrix provides a measure of the biochemical distance between amino acids, according to their composition, polarity and volume (Grantham, 1974).
In the A-GVGD program, two different types of conservation scores are calculated: (1) Grantham Variation (GV); (2) Grantham Deviation (GD). Conceptually, all amino acids observed at a given position are plotted on a three-dimensional graph, with their composition (C), polarity (P) and volume (V) as coordinates and with different weights applied to the axes. The cloud of points formed by these amino acids can be enclosed within a box (the GV box), whose opposite corners are given by the minimum and maximum values of C, P and V for the observed amino acids. GV is computed as the Euclidean length of the main diagonal of the box. GV is thus a measure of the amount of observed biochemical variation at a particular position in the alignment. Next, GD is calculated by plotting a given mutation on the same composition-polarity-volume graph and measuring the Euclidean distance between that mutation and the closest point on the GV box. If the substitution lies within the box, then GD = 0. Otherwise, GD is greater than 0. GD is thus a measure of the biochemical difference between the mutant and the observed variation at that position according to the MSA.
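The GV and GD calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the A-GVGD source code. The function names are hypothetical. The property values cover just five residues and are quoted from Grantham (1974) as best recalled, so they should be checked against the original table before any real use. The axis weights and the scaling factor are likewise assumed to be those of the Grantham distance.

```python
import math

# (composition, polarity, volume) for a few amino acids.
# ASSUMPTION: values recalled from Grantham (1974); verify before real use.
PROPS = {
    "L": (0.00, 4.9, 111.0),
    "I": (0.00, 5.2, 111.0),
    "V": (0.00, 5.9, 84.0),
    "M": (0.00, 5.7, 105.0),
    "D": (1.38, 13.0, 54.0),
}

# ASSUMPTION: axis weights and scaling factor of the Grantham distance.
WEIGHTS = (1.833, 0.1018, 0.000399)
SCALE = 50.723


def gv_box(observed):
    """Return (min corner, max corner) of the box enclosing the observed residues."""
    points = [PROPS[aa] for aa in set(observed)]
    lo = tuple(min(p[i] for p in points) for i in range(3))
    hi = tuple(max(p[i] for p in points) for i in range(3))
    return lo, hi


def weighted_length(deltas):
    """Weighted Euclidean length of a (C, P, V) difference vector."""
    return SCALE * math.sqrt(sum(w * d * d for w, d in zip(WEIGHTS, deltas)))


def grantham_variation(observed):
    """GV: length of the main diagonal of the GV box."""
    lo, hi = gv_box(observed)
    return weighted_length([h - l for l, h in zip(lo, hi)])


def grantham_deviation(observed, mutant):
    """GD: distance from the mutant residue to the closest point of the GV box."""
    lo, hi = gv_box(observed)
    point = PROPS[mutant]
    # Per axis: 0 if the mutant lies inside the observed range,
    # otherwise the gap to the nearest face of the box.
    deltas = [max(l - x, 0.0, x - h) for l, h, x in zip(lo, hi, point)]
    return weighted_length(deltas)


if __name__ == "__main__":
    column = "LLIVLIVL"  # hypothetical alignment column
    print("GV:", round(grantham_variation(column), 2))
    print("GD for M:", round(grantham_deviation(column, "M"), 2))
    print("GD for D:", round(grantham_deviation(column, "D"), 2))
```

In this hypothetical column, methionine falls inside the box spanned by leucine, isoleucine and valine, so its GD is 0, whereas aspartate lies outside the box on all three axes and receives a GD greater than 0. For a fully conserved position the box collapses to a single point, so GV = 0 and GD reduces to the weighted distance between the mutant and the conserved residue.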
To classify p53 missense mutations, the following GV/GD cutoff values were applied
(please note that the AGVGDClass has been corrected as of 8 Dec 2005):
If GD = 0 : the composition, polarity and volume of the mutant amino acid fall within the observed range of variation
according to the alignment at that position, so the mutation is predicted as neutral;
Else :
It is important to note that the accuracy of the predictions is highly dependent on the input MSA used to calculate GV and GD.
The classifications presented in the IARC TP53 database are based on an MSA constructed with 3D-Coffee (http://igs-server.cnrs-mrs.fr/Tcoffee/tcoffee_cgi/index.cgi),
using the following 9 sequences: Homo sapiens (sp|P04637), Macaca mulatta (monkey, sp|P56424), Bos taurus (bovine, sp|P67939),
Canis familiaris (dog, sp|Q29537), Mus musculus (mouse, sp|P02340), Rattus norvegicus (rat, sp|P10361),
Gallus gallus (chicken, sp|P10360), Xenopus laevis (frog, tr|P53_XENOPUS), Brachydanio rerio (zebrafish, sp|P79734). Get the FASTA sequences here.
The X-ray structure of the DNA-binding domain of human p53 (PDB [Berman et al., 2000] id 1tsr, chain B) was also used to construct the MSA.
Please click here to view the MSA.
References:
- Mathe E, Olivier M, Kato S, Ishioka C, Hainaut P, Tavtigian SV. 2006. Computational approaches for predicting the biological effect of p53 missense mutations:
a comparison of three sequence analysis based methods. NAR, in press