PDB - Protein Data Bank

Quick Look

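Looking for a protein structure, or for the sequence of a protein whose structure has been solved? Then the RCSB Protein Data Bank (PDB) is the website to go to! The PDB holds proteins and other macromolecules whose 3-D structures have been determined experimentally, not every protein that has been sequenced. As of 22 Feb 2000, 11753 structures had been deposited in the database.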

It is the most renowned protein sequence-structure database available on the net. The Protein Data Bank, managed by the Research Collaboratory for Structural Bioinformatics (RCSB) and formerly located at Brookhaven National Laboratory, is "the single international repository for the processing and distribution of 3-D macromolecular structure data primarily determined experimentally by X-ray crystallography and NMR."

PDB-ID

Each structure file in the database is given a PDB-ID, also called a PDB code. This is a four-character alphanumeric code used for accessing the file. The code is made up of digits (0-9) and uppercase letters (A-Z). PDB-IDs are not, however, assigned in any particular order. Where possible, the indexers at the Data Bank chose mnemonic codes so that the files would be easy to remember. Here are some samples:

1MNP - Manganese Peroxidase

2CYP - Cytochrome C Peroxidase

3INS - Insulin
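
The format described above can be checked mechanically. The following is a minimal sketch in Python, assuming only what is stated here: exactly four characters, each a digit (0-9) or an uppercase letter (A-Z). The function name is_valid_pdb_id is made up for this illustration, and the check covers the format only; it does not tell you whether an entry with that code actually exists in the database.

```python
import re

# Assumption: a PDB-ID is exactly four characters, each one a digit (0-9)
# or an uppercase letter (A-Z), as described in the text above.
PDB_ID_PATTERN = re.compile(r"[0-9A-Z]{4}")


def is_valid_pdb_id(code: str) -> bool:
    """Return True if `code` has the PDB-ID format (format check only)."""
    return PDB_ID_PATTERN.fullmatch(code) is not None


# Try the sample codes from the text plus two strings that should fail.
for candidate in ["1MNP", "2CYP", "3INS", "insulin", "1mnp"]:
    print(candidate, is_valid_pdb_id(candidate))
```

Lowercase input is rejected here because the text specifies uppercase letters; a more forgiving version could call code.upper() before matching.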

Obviously, finding a particular protein structure among the thousands stored in the database by PDB-ID alone is tedious and difficult! However, the PDB does have a search engine and an organized record for each file. The consistent format of each structure file also makes searching easier.

How are the structure files encoded/stored?

A PDB Structure file is separated into different sections:

Title Section - contains the following information:

Primary Structure Section - lists the primary sequence, that is, the sequence of residues for each chain of the protein. It also lists non-standard residues and hetero groups such as prosthetic groups, inhibitors, solvents and ions. Additional information includes the name and formula of each hetero group in the macromolecule.

Secondary Structure Section - contains data on the helices, sheets, and turns found in the protein. Each of these is named and numbered, and its position in the chain is given.

Connectivity Annotation Section - contains information on the existence and location of disulfide bonds and other linkages

Miscellaneous Features Section - describes features such as the active site

Crystallographic and Coordinate Transformation Section - contains the geometry of the crystallographic experiment and the coordinate system transformations

Coordinate Section - gives the atomic coordinates

Connectivity Section - gives information on chemical connectivity or how the atoms are connected to each other. Information here includes hydrogen bonds, salt bridges, and links.

Bookkeeping Section - gives final information about the file itself
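
Because every line of a PDB file begins with a record name in its first six columns, the sections listed above can be told apart by program. The sketch below, in Python, groups lines by section. The mapping from record names to sections is an assumption based on the commonly documented record names of the PDB format and is not exhaustive; it follows the grouping used in this article, so hetero-group records are placed under the Primary Structure Section. The function names and the placeholder lines are invented for illustration and are not taken from a real entry.

```python
from collections import Counter

# Assumed (non-exhaustive) mapping of record names to the sections above.
SECTION_RECORDS = {
    "Title": ("HEADER", "TITLE", "COMPND", "SOURCE", "KEYWDS",
              "EXPDTA", "AUTHOR", "REVDAT", "JRNL", "REMARK"),
    "Primary Structure": ("DBREF", "SEQADV", "SEQRES", "MODRES",
                          "HET", "HETNAM", "FORMUL"),
    "Secondary Structure": ("HELIX", "SHEET", "TURN"),
    "Connectivity Annotation": ("SSBOND", "LINK", "CISPEP"),
    "Miscellaneous Features": ("SITE",),
    "Crystallographic and Coordinate Transformation": (
        "CRYST1", "ORIGX1", "ORIGX2", "ORIGX3",
        "SCALE1", "SCALE2", "SCALE3",
        "MTRIX1", "MTRIX2", "MTRIX3"),
    "Coordinate": ("MODEL", "ATOM", "ANISOU", "TER", "HETATM", "ENDMDL"),
    "Connectivity": ("CONECT",),
    "Bookkeeping": ("MASTER", "END"),
}

# Invert the table so a record name can be looked up directly.
RECORD_TO_SECTION = {
    name: section
    for section, names in SECTION_RECORDS.items()
    for name in names
}


def section_of(line):
    """Return the section a line belongs to, or "Unknown"."""
    record_name = line[:6].strip()  # the record name occupies columns 1-6
    return RECORD_TO_SECTION.get(record_name, "Unknown")


def count_sections(lines):
    """Count the non-blank lines that fall into each section."""
    return Counter(section_of(line) for line in lines if line.strip())


# Placeholder lines: only the record names matter for this illustration.
EXAMPLE_LINES = [
    "HEADER    (placeholder text)",
    "SEQRES    (placeholder text)",
    "HELIX     (placeholder text)",
    "SSBOND    (placeholder text)",
    "CRYST1    (placeholder text)",
    "ATOM      (placeholder text)",
    "ATOM      (placeholder text)",
    "CONECT    (placeholder text)",
    "MASTER    (placeholder text)",
    "END",
]

for section, count in count_sections(EXAMPLE_LINES).items():
    print(f"{section}: {count}")
```

To run this on a real entry, pass the lines of a downloaded structure file to count_sections instead of EXAMPLE_LINES.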

Viewing the structure file

Although organized, a PDB file is difficult to read because of its length and the amount of data it contains. Going through the file contents in a simple text viewer is therefore not advisable. Instead, the PDB offers a Structure Explorer, built into the website itself, that displays the content in an easily comprehensible manner. Aside from this, various software packages are available to interpret PDB files, including 3D structure rendering programs such as RasMol, Chime, and Cn3D. The PDB website itself contains links to these programs.
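
For readers curious about what such programs do with the raw file, here is a rough sketch in Python of how a single coordinate line might be read. It assumes the fixed-column layout commonly documented for ATOM and HETATM records (serial number in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, residue number in 23-26, and x, y, z in 31-38, 39-46 and 47-54). The Atom type, the parse_atom_line function and the sample line are all made up for illustration; the coordinates are not from a real entry.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM or HETATM line using the assumed column layout."""
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    # Python slices are 0-based and end-exclusive, so columns 7-11
    # become line[6:11], columns 31-38 become line[30:38], and so on.
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


# Made-up sample line laid out in the assumed columns.
SAMPLE_LINE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67"

atom = parse_atom_line(SAMPLE_LINE)
print(atom.name, atom.res_name, atom.chain_id, atom.res_seq)
print(atom.x, atom.y, atom.z)
```

The fields are sliced by column rather than split on whitespace because neighbouring fields in a real file may touch with no space between them.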

Website Features

The Structure Explorer built into the PDB website is worth a look, for it provides a lot of useful information: