🧬 Biotech Institute
Educational Resources

Protein Structure

From amino acid chains to the three-dimensional machines of life. How proteins fold, how we determine and predict their structures, and why structure matters for drug design.

Primary Structure

The primary structure is the linear sequence of amino acids in a polypeptide chain. It is encoded by the gene and determines all higher levels of structure. The sequence is written from the N-terminus (free amino group) to the C-terminus (free carboxyl group), matching the direction of translation.
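As a rough illustration of treating primary structure as a string read from the N-terminus to the C-terminus, the sketch below computes residue composition for a sequence in one-letter code. The function name `composition` and the example peptide are hypothetical, chosen only for illustration.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each residue in a sequence written N- to C-terminus."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - STANDARD_RESIDUES
    if unknown:
        raise ValueError(f"non-standard residue codes: {sorted(unknown)}")
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}

# Hypothetical 10-residue peptide; index 0 is the N-terminal residue.
peptide = "MKTAYIAKQR"
fractions = composition(peptide)
n_terminal, c_terminal = peptide[0], peptide[-1]
```

Because the string is stored in N-to-C order, the first character is the residue with the free amino group and the last is the residue with the free carboxyl group.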

Amino Acids

Sequence Analysis

📏 Think of a protein like a really long LEGO chain. The primary structure is just the order of the LEGO bricks. But here's the cool part: once you connect them all in the right order, the chain automatically folds up into an amazing 3D shape, like origami! The shape is what gives the protein its superpowers.

Secondary Structure

Secondary structure refers to local, regular folding patterns stabilized by hydrogen bonds between backbone atoms (C=O and N-H groups). Linus Pauling and Robert Corey predicted the two major secondary structures in 1951, before the first protein crystal structure was solved.

Primary — Amino acid sequence (1D chain)
Secondary — α-helices and β-sheets (local folding)
Tertiary — Full 3D fold of one chain
Quaternary — Multi-subunit assembly

Alpha Helix

Beta Sheet

Other Elements

Prediction

Secondary structure can be predicted from sequence with ~80-85% accuracy. Prediction methods include PSIPRED (a neural network using PSI-BLAST profiles) and JPred; GOR and Chou-Fasman are early methods with lower accuracy. DSSP is not a predictor: it assigns secondary structure from known 3D coordinates, and its assignments are commonly used as the reference that predictions are compared against. Modern methods use deep learning on multiple sequence alignments.
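The text does not name the accuracy metric. Figures like these are usually reported as three-state per-residue accuracy (Q3), so the sketch below assumes that metric. The label strings are made up for illustration, not real predictor output.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must be the same length")
    if not observed:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

# Hypothetical labels: H = helix, E = strand, C = coil.
observed = "CCHHHHHHCCEEEECC"   # e.g. assigned from 3D coordinates
predicted = "CCHHHHHCCCEEECCC"  # e.g. predicted from sequence
score = q3_accuracy(predicted, observed)
```

Here 14 of the 16 positions agree, so `score` is 0.875. The two mismatches sit at the ends of the helix and the strand, which is where predictions typically disagree with assignments.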

Tertiary Structure

The tertiary structure is the complete 3D arrangement of all atoms in a single polypeptide chain. For most proteins it is the biologically active conformation, determined by interactions among side chains, the backbone, and the solvent (usually water).

Hydrophobic | Dominant force | Nonpolar core
H-bonds | Backbone + side chain | Secondary structure
Disulfide | Covalent S-S | Extracellular

Stabilizing Forces

Structural Motifs

Quaternary Structure

Quaternary structure describes the arrangement of multiple polypeptide chains (subunits) into a multi-subunit complex. Not all proteins have quaternary structure; it applies only to multi-chain assemblies. Subunits are held together by the same non-covalent forces as tertiary structure, plus occasional inter-chain disulfide bonds.

Examples

Symmetry

Protein Folding

How a linear polypeptide chain reaches its native 3D structure is one of the grand challenges in biology. Levinthal's paradox (1969): if each residue can adopt about 3 conformations, a 100-residue protein has ~3^100 (about 5 × 10^47) possible conformations. Sampling one per picosecond, an exhaustive search would take on the order of 10^28 years, vastly longer than the age of the universe. Yet most proteins fold in milliseconds to seconds. The search is not random: folding follows an energy funnel.
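The arithmetic behind Levinthal's estimate can be checked directly. The sketch below uses the figures from the paragraph above (3 conformations per residue, 100 residues, one sample per picosecond). The age of the universe is an assumed round value of about 13.8 billion years.

```python
CONFORMATIONS_PER_RESIDUE = 3
RESIDUES = 100
SAMPLES_PER_SECOND = 1e12            # one conformation per picosecond
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10      # assumed value, roughly 13.8 billion years

# Total conformations if every residue varies independently.
total_conformations = CONFORMATIONS_PER_RESIDUE ** RESIDUES

# Time to visit every conformation once, in years.
search_years = total_conformations / SAMPLES_PER_SECOND / SECONDS_PER_YEAR

# How many times longer than the age of the universe that is.
ratio = search_years / AGE_OF_UNIVERSE_YEARS

print(f"conformations: {total_conformations:.2e}")
print(f"exhaustive search: {search_years:.2e} years")
print(f"multiples of the age of the universe: {ratio:.2e}")
```

The gap of many orders of magnitude is the point of the paradox: no plausible change to the sampling rate rescues a random search, so folding must be guided.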

The Folding Funnel

Chaperones

Misfolding and Disease

AlphaFold

AlphaFold, developed by DeepMind (Google), is an AI system that predicts protein 3D structures from amino acid sequences with near-experimental accuracy. It solved the "protein structure prediction problem" that had been an open challenge for 50 years.

AlphaFold2 (2020)

AlphaFold Protein Structure Database

AlphaFold3 (2024)

Limitations

🤖 AlphaFold is like an AI that solves puzzles. Scientists spent 50 years trying to figure out how proteins fold into their 3D shapes. Then Google's AI cracked it! Now it has predicted the shape of almost every protein known to science: over 200 million of them. Two of the scientists who built it shared the 2024 Nobel Prize in Chemistry!
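AlphaFold predictions carry a per-residue confidence score, pLDDT, on a 0 to 100 scale. The sketch below groups scores into the four confidence bands shown in the AlphaFold database. The thresholds (90, 70, 50) are stated as commonly described and the band labels are approximate; the function names and example scores are hypothetical.

```python
BANDS = ("very high", "high", "low", "very low")

def plddt_band(score: float) -> str:
    """Map a 0-100 pLDDT score to a confidence band (assumed thresholds)."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 90:
        return "very high"
    if score > 70:
        return "high"
    if score > 50:
        return "low"
    return "very low"

def summarize(scores: list[float]) -> dict[str, float]:
    """Fraction of residues falling in each confidence band."""
    bands = [plddt_band(s) for s in scores]
    return {band: bands.count(band) / len(bands) for band in BANDS}

# Made-up per-residue scores for an 8-residue stretch.
scores = [95.2, 91.0, 88.4, 72.1, 65.0, 43.7, 38.9, 92.3]
summary = summarize(scores)
```

A summary like this gives a quick sense of how much of a predicted model is worth interpreting in detail. Long very-low stretches often correspond to flexible or disordered regions rather than a fixed fold.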

The Protein Data Bank (PDB)

The PDB (rcsb.org) is the global repository for experimentally determined 3D structures of biological macromolecules. It was established in 1971 at Brookhaven National Laboratory with 7 structures; as of 2025, it contains over 220,000.
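To give a feel for what a structure file contains, the sketch below reads atom records in the legacy fixed-column PDB text format (the archive now uses PDBx/mmCIF as its primary format, which is laid out differently). The helper `format_atom` and the coordinates are made up for the demo and do not come from any PDB entry.

```python
import math

def format_atom(serial, name, resname, chain, resseq, x, y, z, bfactor, element):
    """Build a fixed-width ATOM line so the demo does not depend on a real entry."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{bfactor:6.2f}          {element:>2s}")

def parse_atom(line):
    """Slice one ATOM/HETATM record by its fixed column positions."""
    return {
        "name": line[12:16].strip(),       # atom name, columns 13-16
        "resname": line[17:20].strip(),    # residue name, columns 18-20
        "chain": line[21],                 # chain identifier, column 22
        "resseq": int(line[22:26]),        # residue number, columns 23-26
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "bfactor": float(line[60:66]),     # temperature factor, columns 61-66
    }

# Two hypothetical alpha carbons placed 3.8 angstroms apart.
lines = [
    format_atom(1, " CA", "ALA", "A", 1, 0.000, 0.000, 0.000, 12.50, "C"),
    format_atom(2, " CA", "GLY", "A", 2, 3.800, 0.000, 0.000, 14.00, "C"),
]
atoms = [parse_atom(l) for l in lines if l.startswith(("ATOM", "HETATM"))]
ca_distance = math.dist(atoms[0]["xyz"], atoms[1]["xyz"])
```

For real work, an established parser is a safer choice than hand-slicing columns. The sketch is only meant to show that each record is a plain line of text holding an atom's identity and its x, y, z position.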

Experimental Methods

Using the PDB

Drug Targets

Protein structure is central to modern drug discovery. Understanding the 3D shape of a target protein, especially its binding sites, enables rational drug design. Approximately 60% of approved drugs target proteins (mostly enzymes, receptors, ion channels, and transporters).
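One simple way to make "binding site" concrete is to list the residues that have any atom within a distance cutoff of a bound ligand. The sketch below assumes a 4 Å cutoff, which is a common but arbitrary choice, and uses made-up coordinates. The function name `binding_site_residues` is hypothetical.

```python
import math

def binding_site_residues(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted residue ids with any atom within `cutoff` angstroms of a ligand atom."""
    site = set()
    for res_id, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            site.add(res_id)
    return sorted(site)

# Each protein atom is ((chain, residue number), (x, y, z)); values are illustrative.
protein_atoms = [
    (("A", 10), (0.0, 0.0, 0.0)),
    (("A", 10), (1.5, 0.0, 0.0)),
    (("A", 11), (5.0, 0.0, 0.0)),
    (("A", 42), (20.0, 0.0, 0.0)),
]
ligand_atoms = [(3.0, 0.0, 0.0), (3.0, 1.2, 0.0)]
site = binding_site_residues(protein_atoms, ligand_atoms)
```

With these coordinates, residues 10 and 11 of chain A fall inside the cutoff and residue 42 does not. The all-pairs loop is fine for a small example; a real structure with thousands of atoms would usually call for a spatial index.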

Structure-Based Drug Design

Notable Drug-Target Successes

Emerging Approaches

Resources

RCSB Protein Data Bank

220,000+ experimentally determined structures. Free search, download, visualization with Mol* viewer. The foundation of structural biology.

RCSB | Free

AlphaFold Protein Structure Database

200+ million predicted protein structures from DeepMind and EMBL-EBI. Per-residue confidence scores. Transformative resource for all biology.

DeepMind / EMBL-EBI | Free

PyMOL

The most widely used molecular visualization tool. Publication-quality images, structural analysis, movie-making. Open-source version available.

Schrödinger | Free (open-source)

UCSF ChimeraX

Next-generation molecular visualization. Excellent for cryo-EM maps, AlphaFold structures, and large complexes. Free for academic use.

UCSF | Free (academic)

PDB-101

Educational resource from the PDB. "Molecule of the Month" articles by David Goodsell. Beautiful illustrations and clear explanations of protein structure and function.

RCSB | Free

Foldit

Citizen science protein folding game. Solve protein structure puzzles using human spatial reasoning. Contributions have led to real scientific discoveries.

UW | Free game