🧬 Biotech Institute
Educational Resources

Molecular Biology

The molecular machinery of life: how DNA stores information, RNA carries messages, and proteins do the work, from replication to translation.

DNA Structure

Deoxyribonucleic acid (DNA) is the molecule that stores genetic information in all cellular life and many viruses. Its structure was determined by James Watson and Francis Crick in 1953, building on X-ray crystallography data from Rosalind Franklin and Maurice Wilkins.

The Double Helix

🧬 Double Helix: antiparallel strands; A-T (2 H-bonds), G-C (3 H-bonds)
📜 Base Pairing: 5' → 3' coding strand; 3' → 5' template strand
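
The pairing rules and antiparallel orientation above can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the names `COMPLEMENT`, `H_BONDS`, `reverse_complement`, and `hydrogen_bonds` are made up for this example, and the input is assumed to contain only A, T, G, and C.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base pair, as listed above (A-T: 2, G-C: 3).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' -> 3'.

    The two strands are antiparallel, so the partner is read in the
    opposite direction: complement each base, then reverse the order.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds between a strand and its perfect partner."""
    return sum(H_BONDS[base] for base in strand.upper())


coding = "ATGGCC"
print(reverse_complement(coding))  # GGCCAT
print(hydrogen_bonds(coding))      # 16
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check on the pairing table.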

Genome Organization

🧬 DNA is like a recipe book for building YOU. It has 3 billion letters that spell out instructions for everything: your eye color, how tall you are, even how your brain works! The letters are A, T, G, and C, and they always pair up: A with T, G with C. It's shaped like a twisted ladder called a double helix.

RNA

Ribonucleic acid (RNA) is chemically similar to DNA but differs in three key ways: it uses ribose sugar (with a 2'-OH group) instead of deoxyribose, it contains uracil (U) instead of thymine (T), and it is usually single-stranded. RNA is also far more versatile than first appreciated: besides carrying genetic messages, RNA molecules can catalyze reactions (ribozymes) and regulate gene expression.

Types of RNA

The Central Dogma

The central dogma was first stated by Francis Crick in 1958 and restated in a 1970 Nature paper. It describes the flow of genetic information: DNA is replicated, DNA is transcribed into RNA, and RNA is translated into protein. Information flows from nucleic acid to protein, but not from protein back to nucleic acid.

DNA (storage) → Transcription → RNA (messenger) → Translation → Protein (function)
Replication: DNA → DNA. Reverse transcription (retroviruses): RNA → DNA.

The Flow

Exceptions and Extensions

DNA Replication

DNA replication is the process of copying the entire genome before cell division. In E. coli, each replication fork advances at ~1,000 nucleotides/second with an error rate of ~1 per 10^9 bases (after proofreading and mismatch repair). Human cells replicate ~6.4 billion base pairs (the diploid genome) in ~8 hours using ~30,000 replication origins.
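
A back-of-the-envelope calculation shows what the figures above imply. The sketch below rests on several assumptions that are not in the text: an E. coli chromosome of about 4.6 million base pairs, bidirectional replication (two forks per origin), the ~1,000 nucleotides/second figure treated as a per-fork rate, and, for the human estimate, every origin firing at the start of S phase. All function and constant names are hypothetical.

```python
# Figures quoted in the text above.
ECOLI_FORK_RATE_NT_S = 1_000     # treated here as a per-fork rate (assumption)
ERROR_RATE_PER_BASE = 1e-9
HUMAN_GENOME_BP = 6.4e9
HUMAN_S_PHASE_HOURS = 8
HUMAN_ORIGINS = 30_000

# Assumptions not taken from the text.
ECOLI_GENOME_BP = 4.6e6          # approximate E. coli chromosome size
FORKS_PER_ORIGIN = 2             # bidirectional replication


def single_origin_minutes(genome_bp: float, fork_rate: float) -> float:
    """Minutes to copy a chromosome from one origin with two forks."""
    return genome_bp / (FORKS_PER_ORIGIN * fork_rate) / 60


def average_fork_rate(genome_bp: float, origins: int, hours: float) -> float:
    """Average nt/s per fork if all origins fire at once and run all S phase."""
    return genome_bp / (origins * FORKS_PER_ORIGIN) / (hours * 3600)


ecoli_minutes = single_origin_minutes(ECOLI_GENOME_BP, ECOLI_FORK_RATE_NT_S)
ecoli_errors = ECOLI_GENOME_BP * ERROR_RATE_PER_BASE
human_rate = average_fork_rate(HUMAN_GENOME_BP, HUMAN_ORIGINS, HUMAN_S_PHASE_HOURS)

print(f"E. coli copy time: {ecoli_minutes:.1f} min")             # 38.3 min
print(f"E. coli expected errors per copy: {ecoli_errors:.4f}")   # 0.0046
print(f"Human average fork rate: {human_rate:.1f} nt/s")         # 3.7 nt/s
```

The human figure is only a lower bound under the "all origins fire at once" assumption. If origins fire at staggered times during S phase, individual forks must move faster than this average.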

Key Steps

Telomeres

Linear chromosomes have an "end-replication problem": the lagging strand cannot be fully replicated at the chromosome ends. Telomeres, repetitive sequences (TTAGGG in humans, repeated 1,000-2,000 times), protect chromosome ends from erosion. Telomerase (a reverse transcriptase with an RNA template) extends telomeres in stem cells and germ cells. In somatic cells, telomeres shorten with each division (~50-100 bp per division), contributing to cellular aging (Hayflick limit). Cancer cells typically reactivate telomerase for unlimited replication.
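
The numbers above allow a rough estimate of how many divisions a telomere could absorb. The sketch below uses a deliberately simple linear model: a fixed loss per division and no telomerase activity. The `critical_length_bp` parameter is a hypothetical threshold added for illustration and does not come from the text.

```python
TELOMERE_REPEAT = "TTAGGG"  # human telomere repeat unit


def telomere_length_bp(repeat_count: int) -> int:
    """Total telomere length for a given number of repeats."""
    return repeat_count * len(TELOMERE_REPEAT)


def divisions_remaining(length_bp: int, loss_per_division_bp: int,
                        critical_length_bp: int = 0) -> int:
    """Whole divisions before the telomere falls below critical_length_bp.

    Linear model only: constant loss per division, no telomerase.
    """
    usable = max(0, length_bp - critical_length_bp)
    return usable // loss_per_division_bp


shortest = telomere_length_bp(1_000)
longest = telomere_length_bp(2_000)
fewest = divisions_remaining(shortest, loss_per_division_bp=100)
most = divisions_remaining(longest, loss_per_division_bp=50)

print(shortest, longest)  # 6000 12000
print(fewest, most)       # 60 240
```

With no threshold, the model gives 60 to 240 divisions. The Hayflick limit is usually quoted as roughly 40 to 60 divisions, which suggests cells stop dividing before their telomeres are fully eroded. Setting a nonzero `critical_length_bp` is one simple way to model that.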

Transcription

Transcription is the synthesis of RNA from a DNA template by RNA polymerase. The enzyme reads the template strand 3' to 5' and synthesizes the RNA transcript 5' to 3'. Unlike DNA replication, transcription does not require a primer.
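
The strand directions described above can be made concrete with a short sketch. It assumes sequences contain only A, T, G, and C, and it ignores promoters, terminators, and RNA processing. The function names are illustrative, not from any library.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5to3: str) -> str:
    """Return the template strand written 3' -> 5'.

    Written in this orientation, each template base sits directly
    opposite its partner on the coding strand, so no reversal is needed.
    """
    return "".join(DNA_COMPLEMENT[base] for base in coding_5to3.upper())


def transcribe(template_3to5: str) -> str:
    """Read the template 3' -> 5' and build the RNA 5' -> 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3to5.upper())


coding = "ATGGCC"
template = template_from_coding(coding)
rna = transcribe(template)

print(template)  # TACCGG
print(rna)       # AUGGCC
```

The transcript matches the coding strand with U in place of T, which is why the coding strand is the one usually written out when a gene sequence is reported.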

In Prokaryotes

In Eukaryotes

Translation

Translation is the process of synthesizing a polypeptide chain from an mRNA template. It occurs on ribosomes, large ribonucleoprotein complexes (2.5-4 MDa) composed of two subunits. In E. coli, translation elongation proceeds at ~20 amino acids per second.

The Ribosome

Steps

Post-Translational Modifications

The Genetic Code

The genetic code has 64 codons (4^3 triplets): 61 specify the 20 standard amino acids and 3 are stop signals. It was deciphered between 1961 and 1966, chiefly by Marshall Nirenberg and Har Gobind Khorana; Robert Holley, who determined the structure of tRNA, shared the 1968 Nobel Prize with them. The code is nearly universal across all life, from bacteria to humans, with minor variations in mitochondria and some organisms.
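
The codon counts above can be checked by building the standard code table directly. The sketch below uses the common compact layout of the standard code (bases ordered U, C, A, G, with `*` marking stop codons). The `translate` helper is a simplified illustration: it starts at the first AUG and stops at the first in-frame stop codon, with none of the context rules real ribosomes use.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

STOP_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(len(CODON_TABLE))                # 64
print(STOP_CODONS)                     # ['UAA', 'UAG', 'UGA']
print(len(set(AMINO_ACIDS) - {"*"}))   # 20
print(translate("GGAUGGCCUUUUAAGG"))   # MAF
```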

Properties

Codon Usage Bias

Although the code is degenerate, organisms preferentially use certain codons over synonymous alternatives. E. coli strongly favors certain codons (e.g., CGU for arginine over CGG), matching the abundance of corresponding tRNAs. Codon optimization, adjusting a gene's codons to match the host's preference, is essential for heterologous protein expression. The BioNTech/Pfizer COVID-19 vaccine used codon-optimized mRNA with N1-methylpseudouridine to enhance expression and reduce innate immune activation.
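
Codon bias can be measured by counting how often each synonymous codon appears in a coding sequence. The sketch below does this for the six arginine codons and then applies a toy "optimization" that swaps every arginine codon for CGU. The sequence, the `PREFER_CGU` mapping, and the function names are all hypothetical. Real codon optimization uses genome-wide usage tables and weighs other constraints such as mRNA structure.

```python
from collections import Counter

ARG_CODONS = ("CGU", "CGC", "CGA", "CGG", "AGA", "AGG")

# Illustrative mapping only: send every other arginine codon to CGU.
PREFER_CGU = {codon: "CGU" for codon in ARG_CODONS if codon != "CGU"}


def split_codons(cds: str) -> list:
    """Split a coding sequence into in-frame codons (RNA alphabet)."""
    cds = cds.upper().replace("T", "U")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def synonymous_usage(cds: str, codons: tuple) -> dict:
    """Fraction of each synonymous codon among those present in cds."""
    counts = Counter(split_codons(cds))
    total = sum(counts[c] for c in codons)
    if total == 0:
        return {c: 0.0 for c in codons}
    return {c: counts[c] / total for c in codons}


def recode(cds: str, preferred: dict) -> str:
    """Replace codons with their preferred synonyms; others are unchanged."""
    return "".join(preferred.get(c, c) for c in split_codons(cds))


toy_cds = "AUGCGUCGGCGUAGACGUUAA"  # made-up sequence: Met, 5 x Arg, stop
usage = synonymous_usage(toy_cds, ARG_CODONS)

print(usage["CGU"], usage["CGG"], usage["AGA"])  # 0.6 0.2 0.2
print(recode(toy_cds, PREFER_CGU))               # AUGCGUCGUCGUCGUCGUUAA
```

Because every codon in `PREFER_CGU` encodes arginine, the recoded sequence specifies the same protein as the original.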

Proteins

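Proteins are the functional workhorses of the cell. Built from 20 standard amino acids linked by peptide bonds, they fold into specific 3D structures that determine their function. The human body contains an estimated 80,000-400,000 distinct proteins; the count exceeds the roughly 20,000 protein-coding genes because of alternative splicing and post-translational modification.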

Amino Acid Properties

Protein Functions

⚙️ Proteins are tiny machines inside your body. DNA is the blueprint, but proteins do the actual work: they digest food, fight germs, carry oxygen, and build muscles. Your body has over 80,000 different kinds of protein machines, each folded into a special shape that lets it do its job!

Resources

Molecular Biology of the Cell (Alberts et al.)

The gold-standard textbook. NCBI Bookshelf has older editions free online. Covers everything from DNA to cellular organization.

NCBI Bookshelf | Free (older editions)

MIT 7.013: Introductory Biology

MIT OpenCourseWare. Covers molecular biology, genetics, gene regulation, and biotechnology. Full lecture videos and problem sets.

MIT OCW | Free

Khan Academy: AP Biology

Clear, visual explanations of DNA, RNA, protein synthesis, gene regulation, and cell biology. Excellent for beginners.

Khan Academy | Free

RCSB Protein Data Bank

Repository of 3D structures of proteins, nucleic acids, and complexes. Over 220,000 structures. Free to search, download, and visualize.

Database | Free

GenBank (NCBI)

NIH genetic sequence database. All publicly available DNA/RNA sequences. ~250 million sequences. Foundation of bioinformatics research.

NCBI | Free

Nature Scitable: Molecular Biology

Peer-reviewed educational articles from Nature. DNA structure, gene expression, mutation, and regulation. Written for students by experts.

Nature Education | Free