Novel Integrated Approach to Structuring Databases of Secondary and
3-D Protein Structures: Application to Immunoglobulin Structure Analysis
and Prediction
Israel Gelfand, Alexander Kister
A novel approach based on the extraction of significant features of
protein structure has yielded a striking finding: an invariant system of
coordinates that permits comparison both within and between the data
characterizing variable domains in the proteins. These results follow from
the analysis of 5000 Ig sequences (from the Kabat database), from which the
secondary structures were accurately predicted. The initial structural
analysis of the Ig database has been published in the Proceedings of the
National Academy of Sciences (Gelfand and Kister, November 1995), and the
results for the invariant coordinate system were submitted for publication in
December 1995.
We propose to extend and systematize the analytical approach we have
developed as a means of determining relations between sequence, secondary
structure, and three-dimensional structure for proteins in general.
The major elements of the novel approach are as follows:
- A new database organization of secondary structures for Ig molecules was
created, in which every residue was assigned a position in a strand or loop
and a characteristic position was derived.
- A statistical analysis of the residues was developed based on a novel
decomposition in terms of structural units.
- Examining 60 three-dimensional crystallographic structures of Ig molecules
permitted us to define the different structural roles of the residues in the
positions for folding of the Ig chain. Using this classification, we
identified the positions in the database [PNAS November paper].
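
As a rough illustration of the database organization described above, the sketch below records each residue by the structural unit it belongs to and its position within that unit, then tallies residue frequencies per position. All names (`ResiduePosition`, `position_frequencies`, `conserved_positions`), the unit labels, and the 0.9 threshold are hypothetical choices made for this example; they are not taken from the published database or its statistical analysis.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ResiduePosition:
    unit: str     # hypothetical label of a structural unit, e.g. "strand A"
    index: int    # position of the residue within that unit
    residue: str  # one-letter amino acid code


def position_frequencies(sequences):
    """Count how often each residue occurs at each (unit, index) position."""
    counts = defaultdict(Counter)
    for sequence in sequences:
        for pos in sequence:
            counts[(pos.unit, pos.index)][pos.residue] += 1
    return counts


def conserved_positions(counts, threshold=0.9):
    """Return positions whose most common residue reaches the threshold.

    The threshold is an assumed, illustrative cutoff.
    """
    conserved = {}
    for key, counter in counts.items():
        residue, n = counter.most_common(1)[0]
        if n / sum(counter.values()) >= threshold:
            conserved[key] = residue
    return conserved
```

Indexing residues by structural unit rather than by raw sequence number is what would let sequences of different lengths be compared position by position in such a scheme.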
Following these results, we investigated a new approach for comparing
protein structures, since the earlier analysis did not yield an optimal
superposition of structures based on the inherent geometrical properties of
the molecules.
The result of our most recent research, as mentioned at the beginning of this
proposal, is an invariant system of coordinates for Ig molecules. Comparison of
the 3-D structures of the mouse kappa molecules allows us to find the common
features of protein folding and to determine the conserved C-alpha
framework for the proteins.
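
To make the idea of an orientation-independent comparison concrete, the sketch below compares two C-alpha traces through their pairwise distance matrices, which do not change under rotation or translation. This is a generic, well-known invariant used here only for illustration; it is not the invariant coordinate system proposed in this work, and the function names are hypothetical. It assumes the two traces list equivalent residues in the same order.

```python
import math


def distance_matrix(coords):
    """Pairwise distances between C-alpha atoms given as (x, y, z) tuples."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)]
            for i in range(n)]


def max_distance_deviation(coords_a, coords_b):
    """Largest absolute difference between corresponding pairwise distances.

    A value near zero means the two traces have the same internal geometry,
    regardless of how each structure is placed in space.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("traces must contain the same number of atoms")
    da = distance_matrix(coords_a)
    db = distance_matrix(coords_b)
    n = len(coords_a)
    return max((abs(da[i][j] - db[i][j])
                for i in range(n) for j in range(n)), default=0.0)
```

Positions whose distances to the rest of the trace stay nearly constant across a family of structures would be natural candidates for a common framework under this simplified measure.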
When generalized to other protein structures, this novel method of integrated
analysis could have significant implications. It will allow correlations
between sequence, secondary structure, and 3-D structure that are not
otherwise apparent, and can lead to important insights into protein function,
opening many new possibilities for therapy and drug design.