A databank (3D_ALI) collecting related protein sequences and structures / Pascarella, Stefano; Milpetz, F.; Argos, P. - In: PROTEIN ENGINEERING. - ISSN 0269-2139. - PRINT. - 9:(1996), pp. 249-251. [10.1093/protein/9.3.249]
A databank (3D_ALI) collecting related protein sequences and structures
PASCARELLA, Stefano;
1996
Abstract
An updated version of the 3D_ali databank (Pascarella and Argos, 1992) was constructed to incorporate new protein structural and sequence data acquired since the original release in 1992. The databank has proved useful in many research fields, such as protein sequence and structure analysis and comparison, protein folding, engineering and design, evolution, and the like. The collection enhances present protein structural knowledge by merging information from proteins having a similar main-chain fold with homologous primary structures taken from large databases of known sequences. However, the construction philosophy of the databank has been modified. Originally, the Protein Data Bank (PDB; Bernstein et al., 1977) of known 3-D structures was exhaustively scanned for fold redundancy, and all the possible unique structures were incorporated either in multiple structural alignment files or in those containing only one structure where no relatives could be detected. The tertiary structural superpositioning of the backbones, which yielded spatial and topological Cα atom equivalencing and thus corresponding sequence alignments, was mostly taken from the literature, but was sometimes determined by the authors using the Rossmann-Argos superposition technique (Argos and Rossmann, 1979). In the updated 3D_ali databank, only published alignments based on superpositioning by the authors of the tertiary structures were collected and only folds with more than one sample structure were considered. Different literature alignments were also merged if they included common folds. As in the former release, only full coordinate sets with assigned side chains were included, while NMR structures, excluded in the 1992 release, are now incorporated.