SMRUCC.genomics.Data.RCSB.PDB.PDB

PDB {SMRUCC.genomics.Data.RCSB.PDB}

.NET clr documentation

PDB

Description

The RCSB PDB file format is a standardized text-based format used to represent 3D structural data of biological macromolecules, such as proteins, nucleic acids, and viruses. Managed by the Research Collaboratory for Structural Bioinformatics (RCSB), it serves as the primary format for entries in the Protein Data Bank (PDB), a global repository for experimentally determined structures. Below is a detailed introduction:

### Key Features

1. Text-Based Structure: Plain text file (`.pdb` extension) with a fixed-column format, meaning data is organized into specific columns for consistency. Each line begins with a record type (e.g., `ATOM`, `HETATM`, `HEADER`) that defines the data it contains.
2. Core Components:
   - Atomic Coordinates: Stored in `ATOM` (standard residues) and `HETATM` (heteroatoms, e.g., water, ligands) records.
   - Metadata: Includes details like the title (`TITLE`), experimental method (`EXPDTA`), authors (`AUTHOR`), and biological source (`SOURCE`).
   - Sequence Information: Provided in `SEQRES` lines.
   - Secondary Structure: Annotated in `HELIX`, `SHEET`, and `TURN` records.
   - Connectivity: Bonds between atoms are listed in `CONECT` lines.
   - Crystallographic Data: Unit cell parameters (`CRYST1`), symmetry operations, and resolution.
3. Example ATOM/HETATM Line:

    ATOM   2301  CA  SER A 301      26.417  24.105  34.560  1.00 30.97           C  
    HETATM 9101  O   HOH A 910      10.500  20.100  30.500  1.00 25.00           O

- Columns 1-6: Record type (e.g., `ATOM`).
- Columns 7-11: Atom serial number.
- Columns 13-16: Atom name (e.g., `CA` for alpha carbon).
- Columns 18-20: Residue name (e.g., `SER` for serine). Column 17 holds the alternate location indicator.
- Column 22: Chain identifier (e.g., `A`).
- Columns 23-26: Residue sequence number.
- Columns 31-54: X, Y, Z coordinates (three 8-character fields).
- Columns 55-60: Occupancy.
- Columns 61-66: Temperature factor (B-factor).
- Columns 77-78: Element symbol (e.g., `C`, `O`).
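
The column layout above maps directly onto string slices. The following Python sketch is only an illustration of the fixed-column idea, not the parser used by this class; `AtomRecord` and `parse_atom_line` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class AtomRecord:
    record: str
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float
    element: str


def parse_atom_line(line: str) -> AtomRecord:
    """Slice one ATOM/HETATM line by its fixed columns (hypothetical helper)."""
    # Pad to 80 characters so slicing short lines never runs past the end.
    line = line.ljust(80)
    return AtomRecord(
        record=line[0:6].strip(),     # columns 1-6
        serial=int(line[6:11]),       # columns 7-11
        name=line[12:16].strip(),     # columns 13-16
        res_name=line[17:20].strip(), # columns 18-20
        chain_id=line[21],            # column 22
        res_seq=int(line[22:26]),     # columns 23-26
        x=float(line[30:38]),         # columns 31-38
        y=float(line[38:46]),         # columns 39-46
        z=float(line[46:54]),         # columns 47-54
        occupancy=float(line[54:60]), # columns 55-60
        b_factor=float(line[60:66]),  # columns 61-66
        element=line[76:78].strip(),  # columns 77-78
    )


atom = parse_atom_line(
    "ATOM   2301  CA  SER A 301      26.417  24.105  34.560  1.00 30.97           C"
)
water = parse_atom_line(
    "HETATM 9101  O   HOH A 910      10.500  20.100  30.500  1.00 25.00           O"
)
```

Slicing by position rather than splitting on whitespace matters because adjacent fields may touch when values fill their full width.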

### Common Record Types

| Record | Description |
|---|---|
| `HEADER` | Molecular type, deposition date, and PDB ID (e.g., `1ABC`). |
| `TITLE` | Title of the structure. |
| `COMPND` | Molecular components in the entry (e.g., protein, ligand, ion). |
| `SEQRES` | Amino acid/nucleotide sequence of the macromolecule. |
| `ATOM` | 3D coordinates of standard residues (e.g., amino acids in a protein). |
| `HETATM` | Coordinates of heteroatoms (non-standard residues: ligands, water, ions). |
| `HELIX` | Details of α-helices. |
| `SHEET` | Details of β-sheets. |
| `CONECT` | Bonds between atoms not covered by standard residue templates. |
| `REMARK` | Annotations, experimental details, or warnings. |

### Limitations

- Column Width Restrictions: The legacy format limits data fields (e.g., residue numbers up to 9999, atom serial numbers up to 99,999).
- Sparse Connectivity Data: Bonds are often inferred rather than explicitly listed.
- No Support for Large Structures: Structures that exceed the column limits do not fit in a single legacy file. The format has been superseded by mmCIF/PDBx, which is more flexible and supports larger datasets.
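
The width restriction can be made concrete with a small check. This Python sketch assumes only the field widths stated above (5 characters for the atom serial number, 4 for the residue number); `fits_field` is a hypothetical helper, not part of any library.

```python
# Field widths taken from the fixed-column layout of ATOM/HETATM records.
SERIAL_WIDTH = 5   # atom serial number, columns 7-11
RESSEQ_WIDTH = 4   # residue sequence number, columns 23-26


def fits_field(value: int, width: int) -> bool:
    """Return True if the integer can be written within `width` characters."""
    return len(str(value)) <= width


ok_serial = fits_field(99999, SERIAL_WIDTH)
overflow_serial = fits_field(100000, SERIAL_WIDTH)
ok_resseq = fits_field(9999, RESSEQ_WIDTH)
overflow_resseq = fits_field(10000, RESSEQ_WIDTH)
```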

### Modernization: mmCIF/PDBx Format

The PDB now prioritizes the mmCIF format (Macromolecular Crystallographic Information File), which uses a flexible, key-value-based structure without column limits. Legacy PDB files are automatically converted to mmCIF for archiving.

### Tools for Viewing/Editing

- Visualization: PyMOL, Chimera, VMD, RCSB PDB Viewer.
- Analysis: BioPython, MDAnalysis.
- Database Access: RCSB PDB website (search, download, and explore entries).

### Example PDB File Snippet

 HEADER    HYDROLASE                             15-JUL-98   1ABC              
 TITLE     CRYSTAL STRUCTURE OF EXAMPLE ENZYME                                 
 COMPND    MOL_ID: 1;                                                           
 COMPND   2 MOLECULE: EXAMPLE ENZYME; CHAIN: A;                                 
 SEQRES   1 A  321  SER GLY LEU ARG TYR ...                                      
 ATOM      1  N   SER A   1      10.000  20.000  30.000  1.00 25.00           N  
 ATOM      2  CA  SER A   1      11.000  21.000  31.000  1.00 26.00           C  
 HETATM 1001  O   HOH A 1001     40.000  50.000  60.000  1.00 30.00           O  
 HELIX    1  ALA A 10 THR A 20  1                                            
 CONECT 1001 1002
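
Because every line starts with its record type in the first six columns, a file can be summarized by tallying those prefixes. The Python sketch below does this for a shortened copy of the snippet above; `record_type` and `count_records` are hypothetical helper names used only for illustration.

```python
from collections import Counter


def record_type(line: str) -> str:
    """Return the record name stored in columns 1-6 of a PDB line."""
    return line[:6].strip()


def count_records(text: str) -> Counter:
    """Tally how often each record type occurs, skipping blank lines."""
    return Counter(record_type(l) for l in text.splitlines() if l.strip())


snippet = "\n".join([
    "HEADER    HYDROLASE                             15-JUL-98   1ABC",
    "TITLE     CRYSTAL STRUCTURE OF EXAMPLE ENZYME",
    "COMPND    MOL_ID: 1;",
    "COMPND   2 MOLECULE: EXAMPLE ENZYME; CHAIN: A;",
    "ATOM      1  N   SER A   1      10.000  20.000  30.000  1.00 25.00           N",
    "ATOM      2  CA  SER A   1      11.000  21.000  31.000  1.00 26.00           C",
    "HETATM 1001  O   HOH A 1001     40.000  50.000  60.000  1.00 30.00           O",
    "CONECT 1001 1002",
])

counts = count_records(snippet)
```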

### Use Cases

- Studying protein-ligand interactions.
- Analyzing enzyme active sites.
- Visualizing mutations in diseases.
- Teaching structural biology concepts.

For more details, visit the RCSB PDB and explore entries like 1ATP.

A PDB file holds the structural data of a protein complex. A single PDB file may include data for multiple proteins and metabolite compounds.

Declare

            
# namespace SMRUCC.genomics.Data.RCSB.PDB
export class PDB {
   ANISOU: ANISOU;
   # Iterates over the multiple structure models inside the current pdb data file
   AtomStructures: iterates(Atom);
   Author: Author;
   CAVEAT: CAVEAT;
   CISPEP: CISPEP;
   Compound: Compound;
   Conect: CONECT;
   crystal1: CRYST1;
   DbRef: DbReference;
   Experiment: ExperimentData;
   Formula: Formula;
   Header: Header;
   Helix: Helix;
   Het: Het;
   HetName: HetName;
   HETSYN: HETSYN;
   Journal: Journal;
   Keywords: Keywords;
   Links: Link;
   Master: Master;
   Matrix1: MTRIX123;
   Matrix2: MTRIX123;
   Matrix3: MTRIX123;
   MaxSpace: Point3D;
   MDLTYP: MDLTYP;
   MinSpace: Point3D;
   MODRES: MODRES;
   # number of models inside current pdb file
   NUMMDL: NUMMDL;
   Origin1: ORIGX123;
   Origin2: ORIGX123;
   Origin3: ORIGX123;
   Remark: Remark;
   Revisions: Revision;
   Scale1: SCALE123;
   Scale2: SCALE123;
   Scale3: SCALE123;
   seqadv: SEQADV;
   Sequence: Sequence;
   Sheet: Sheet;
   SIGATM: SIGATM;
   SIGUIJ: SIGUIJ;
   Site: Site;
   Source: Source;
   # the input data text of this pdb object
   SourceText: string;
   SPLIT: SPLIT;
   SPRSDE: SPRSDE;
   SSBOND: SSBOND;
   Title: Title;
}
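
The comments on `AtomStructures` and `NUMMDL` above refer to multiple models in one file. In the PDB text format, models are delimited by `MODEL` and `ENDMDL` records. The Python sketch below shows one way such a file could be split into per-model atom lists. It illustrates the file format only and is not how this class is implemented; `iter_models` is a hypothetical helper.

```python
from typing import Iterable, Iterator, List


def iter_models(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the ATOM/HETATM lines of each model (hypothetical helper).

    A file without MODEL/ENDMDL records is treated as a single model.
    """
    current: List[str] = []
    for line in lines:
        rec = line[:6].strip()
        if rec == "MODEL":
            current = []          # start collecting a new model
        elif rec == "ENDMDL":
            yield current         # close the model that was open
            current = []
        elif rec in ("ATOM", "HETATM"):
            current.append(line)
    if current:
        yield current             # atoms that were never closed by ENDMDL


lines = [
    "MODEL        1",
    "ATOM      1  N   SER A   1      10.000  20.000  30.000  1.00 25.00           N",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  N   SER A   1      10.100  20.100  30.100  1.00 25.00           N",
    "ATOM      2  CA  SER A   1      11.100  21.100  31.100  1.00 26.00           C",
    "ENDMDL",
]
models = list(iter_models(lines))
model_sizes = [len(m) for m in models]
```

Under these assumptions, the number of lists produced corresponds to the model count that a `NUMMDL` record would report.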

.NET clr type reference tree

use by property member ANISOU: ANISOU
use by property member Author: Author
use by property member CAVEAT: CAVEAT
use by property member CISPEP: CISPEP
use by property member Compound: Compound
use by property member Conect: CONECT
use by property member crystal1: CRYST1
use by property member DbRef: DbReference
use by property member Experiment: ExperimentData
use by property member Formula: Formula
use by property member Header: Header
use by property member Helix: Helix
use by property member Het: Het
use by property member HetName: HetName
use by property member HETSYN: HETSYN
use by property member Journal: Journal
use by property member Keywords: Keywords
use by property member Links: Link
use by property member Master: Master
use by property member Matrix1: MTRIX123
use by property member Matrix2: MTRIX123
use by property member Matrix3: MTRIX123
use by property member MaxSpace: Point3D
use by property member MDLTYP: MDLTYP
use by property member MinSpace: Point3D
use by property member MODRES: MODRES
use by property member NUMMDL: NUMMDL
use by property member Origin1: ORIGX123
use by property member Origin2: ORIGX123
use by property member Origin3: ORIGX123
use by property member Remark: Remark
use by property member Revisions: Revision
use by property member Scale1: SCALE123
use by property member Scale2: SCALE123
use by property member Scale3: SCALE123
use by property member seqadv: SEQADV
use by property member Sequence: Sequence
use by property member Sheet: Sheet
use by property member SIGATM: SIGATM
use by property member SIGUIJ: SIGUIJ
use by property member Site: Site
use by property member Source: Source
use by property member SPLIT: SPLIT
use by property member SPRSDE: SPRSDE
use by property member SSBOND: SSBOND
use by property member Title: Title
