How Proteins Are Made

Every protein in the human body is built according to instructions stored in DNA. The journey from a gene to a functioning protein involves a series of precisely coordinated molecular events occurring inside and outside the cell nucleus. This process — known collectively as gene expression — is one of the most fundamental operations in all of biology.

The Central Dogma

The flow of biological information follows a principle known as the central dogma of molecular biology, first articulated by Francis Crick in 1958. In its simplest form: DNA is transcribed into RNA, and RNA is translated into protein. Each step is carried out by dedicated molecular machinery, and each introduces opportunities for regulation, error-checking, and modification.

DNA itself is never directly read to make a protein. Instead, it serves as a stable master copy, while a temporary working copy — messenger RNA (mRNA) — is produced and used as the actual template for protein synthesis. This separation protects the genome from damage that could otherwise occur during the intensive process of protein production.

Step 1: Transcription — Reading the Gene

Transcription takes place in the cell nucleus. The enzyme RNA polymerase II binds to the promoter, a region of DNA just upstream of the gene. Once bound, it unwinds the double helix and reads one strand of DNA (the template strand) in the 3′ to 5′ direction, assembling a complementary strand of mRNA one nucleotide at a time. The resulting mRNA therefore has the same sequence as the other DNA strand, the coding strand, except that thymine (T) is replaced by uracil (U).

Transcription proceeds in three stages:

Initiation — Transcription factors recruit RNA polymerase to the promoter and help open the DNA double helix.
Elongation — RNA polymerase moves along the DNA template, synthesising the mRNA strand in the 5′ to 3′ direction at a rate of roughly 20–80 nucleotides per second.
Termination — RNA polymerase reaches a termination signal in the DNA and releases the newly formed mRNA transcript.
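
The base-pairing logic of transcription can be sketched in a few lines of Python. This is an illustrative toy, not a model of RNA polymerase: the 9-nucleotide sequences and the function names are hypothetical, and the timing helper simply assumes a constant elongation rate within the 20–80 nucleotides per second range quoted above.

```python
# Base-pairing rules used when RNA polymerase reads the DNA template strand.
TEMPLATE_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5'->3') complementary to a template strand given 3'->5'."""
    return "".join(TEMPLATE_TO_MRNA[base] for base in template_3_to_5.upper())


def coding_to_mrna(coding_5_to_3: str) -> str:
    """Shortcut: the mRNA matches the coding strand with T replaced by U."""
    return coding_5_to_3.upper().replace("T", "U")


def elongation_seconds(length_nt: int, rate_nt_per_s: float) -> float:
    """Time to synthesise a transcript at a constant elongation rate."""
    return length_nt / rate_nt_per_s


# Hypothetical 9-nucleotide gene fragment (both strands of the same duplex).
template = "TACGGCATT"  # written 3'->5'
coding = "ATGCCGTAA"    # written 5'->3'

mrna = transcribe(template)
print(mrna)                            # AUGCCGUAA
print(mrna == coding_to_mrna(coding))  # True

# A 10,000-nucleotide transcript at the slow and fast ends of the quoted range.
print(elongation_seconds(10_000, 20), elongation_seconds(10_000, 80))  # 500.0 125.0
```

Both routes give the same mRNA, which is why sequence databases usually list the coding strand: it can be read as the message directly once T is swapped for U.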

Step 2: RNA Processing — Preparing the Message

In eukaryotes such as humans, the freshly transcribed mRNA — called pre-mRNA — is not yet ready for translation. It must be processed before leaving the nucleus:

5′ Capping — A modified guanine nucleotide (the 5′ cap) is added to the start of the mRNA. This protects the transcript from degradation and helps ribosomes recognise it.
Polyadenylation — A string of roughly 200 adenine nucleotides (the poly-A tail) is added to the 3′ end, further stabilising the mRNA and aiding its export from the nucleus.
Splicing — Human genes contain non-coding sequences called introns interspersed among coding sequences called exons. A large molecular complex called the spliceosome removes the introns and joins the exons together. Crucially, different combinations of exons can be joined — a process called alternative splicing — allowing a single gene to encode multiple distinct proteins.

Once processed, the mature mRNA is exported through nuclear pore complexes into the cytoplasm, where translation takes place.
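
As a rough illustration of the three processing steps, the sketch below treats a pre-mRNA as an ordered list of labelled segments. Everything here is an assumption made for the example: the sequences are invented, the "m7G-" prefix is only a text marker standing in for the 5′ cap, and exon skipping is used as one simple form of alternative splicing.

```python
# Hypothetical pre-mRNA: exons and introns in the order they were transcribed.
pre_mrna = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGUCCAG"),
    ("exon", "GAUCUG"),
    ("intron", "GUGAGUUUAG"),
    ("exon", "AAAUAA"),
]


def splice(segments, skip_exons=()):
    """Drop all introns and join the exons, optionally skipping some (1-based)."""
    exon_number = 0
    kept = []
    for kind, sequence in segments:
        if kind != "exon":
            continue  # introns are always removed
        exon_number += 1
        if exon_number not in skip_exons:
            kept.append(sequence)
    return "".join(kept)


def process(segments, skip_exons=(), poly_a_length=200):
    """Cap, splice, and polyadenylate a toy pre-mRNA."""
    return "m7G-" + splice(segments, skip_exons) + "A" * poly_a_length


print(splice(pre_mrna))                  # AUGGCUGAUCUGAAAUAA
print(splice(pre_mrna, skip_exons={2}))  # AUGGCUAAAUAA
print(len(process(pre_mrna)))            # 222
```

Skipping exon 2 yields a shorter message from the same gene, which is the essence of how one gene can encode more than one protein.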

Step 3: Translation — Building the Protein

Translation is the process by which the mRNA sequence is decoded to produce a chain of amino acids. It occurs at ribosomes — large molecular machines made of ribosomal RNA (rRNA) and proteins. Human cells contain millions of ribosomes, found both floating freely in the cytoplasm and attached to the endoplasmic reticulum.

The mRNA is read in triplets called codons, each of which specifies one of the 20 standard amino acids (or a stop signal). The correspondence between codons and amino acids is defined by the genetic code, which is nearly universal across all life on Earth. Each amino acid is delivered to the ribosome by a specific transfer RNA (tRNA) molecule bearing a complementary anticodon sequence.
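
The triplet reading frame and codon-anticodon pairing can be made concrete with a short sketch. The helper names are hypothetical, and the anticodon is computed by strict Watson-Crick pairing only; real tRNAs also tolerate non-standard "wobble" pairing at the third codon position, which this toy ignores.

```python
from itertools import product

RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def split_codons(mrna: str) -> list:
    """Split an mRNA into consecutive triplets, ignoring any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def anticodon(codon: str) -> str:
    """Strictly complementary anticodon, written 5'->3' like the codon."""
    return "".join(RNA_PAIR[base] for base in reversed(codon))


print(split_codons("AUGGCUUAA"))  # ['AUG', 'GCU', 'UAA']
print(anticodon("AUG"))           # CAU

# Four bases read three at a time give 4 ** 3 possible codons.
all_codons = ["".join(triplet) for triplet in product("UCAG", repeat=3)]
print(len(all_codons))            # 64
```

With 64 codons and only 20 standard amino acids plus stop signals, most amino acids are specified by more than one codon.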

Translation proceeds in three stages:

Initiation — The small ribosomal subunit binds to the 5′ cap of the mRNA and scans for the start codon (AUG), which codes for methionine. The large subunit then joins, forming a complete ribosome ready to begin synthesis.
Elongation — The ribosome moves along the mRNA codon by codon. At each step, the correct aminoacyl-tRNA docks into the ribosome's A site, and a peptide bond is formed between the incoming amino acid and the growing chain. This reaction is catalysed by the ribosome itself (specifically its rRNA component). The ribosome then translocates one codon along the mRNA, and the cycle repeats. Bacterial ribosomes can form roughly 15–20 peptide bonds per second; eukaryotic ribosomes, including human ones, are generally slower, typically estimated at around 5–6 per second.
Termination — When the ribosome reaches a stop codon (UAA, UAG, or UGA), no tRNA matches it. Instead, release factors enter the ribosome, triggering the completed polypeptide chain to be released.
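
The three stages above map onto a simple decoding loop. The sketch below is a minimal illustration under stated simplifications: it uses the standard genetic code, treats the first AUG in the string as the start codon (real start-site selection depends on sequence context), and uses a made-up mRNA. The function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in UCAG order; '*' marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Return the peptide (one-letter codes) encoded from the first AUG onward."""
    start = mrna.find("AUG")  # initiation: locate the start codon
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):  # elongation: one codon per step
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # termination: stop codon reached
            break
        peptide.append(amino_acid)
    return "".join(peptide)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
print(stop_codons)  # ['UAA', 'UAG', 'UGA']

# Hypothetical mRNA: short 5' leader, coding region, stop codon, trailing bases.
print(translate("GGCACCAUGGCUGAUCUGAAAUAAGGG"))  # MADLK
```

The leading methionine (M) comes from the start codon itself, and nothing after the stop codon contributes to the peptide.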

Step 4: Protein Folding

A newly synthesised polypeptide chain is not yet a functional protein. It must fold into a precise three-dimensional shape, its native conformation, to carry out its role. Folding often begins co-translationally, meaning the chain starts to fold while it is still being synthesised.

The folding process is guided by the physical and chemical properties of the amino acid sequence itself, but it is also assisted by proteins called molecular chaperones (such as Hsp70 and Hsp90). Chaperones prevent newly made or partially folded proteins from sticking together inappropriately, and they provide a protected environment in which folding can occur correctly. If a protein fails to fold properly, it is typically targeted for degradation by the cell's quality-control machinery.

Step 5: Post-Translational Modification

During or after folding, many proteins are further modified to regulate their activity, stability, localisation, or interactions. These chemical changes, collectively called post-translational modifications (PTMs), vastly expand the functional diversity of the proteome. Common PTMs include:

Phosphorylation — Addition of a phosphate group, typically to serine, threonine, or tyrosine residues, by enzymes called kinases. This is a key switch in cell signalling.
Glycosylation — Attachment of sugar chains, common in proteins destined for the cell surface or secretion. Glycosylation affects protein folding, stability, and cell recognition.
Ubiquitination — Attachment of the small protein ubiquitin, which often tags proteins for degradation by the proteasome — the cell's protein recycling machinery.
Acetylation and methylation — Often occur on histones (proteins that package DNA) and regulate gene expression.
Cleavage — Some proteins are synthesised as inactive precursors (zymogens or proproteins) that are activated by proteolytic cleavage. Insulin, for example, is cleaved from a larger precursor called proinsulin.
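
To make the phosphorylation entry concrete, the sketch below lists every serine, threonine, and tyrosine in a protein sequence as a possible phosphate acceptor. It is deliberately naive: real kinases recognise the surrounding sequence and structure, so most residues flagged this way are never modified. The example sequence and the function name are hypothetical.

```python
# One-letter codes for the residues that typically accept a phosphate group.
PHOSPHO_ACCEPTORS = {"S": "serine", "T": "threonine", "Y": "tyrosine"}


def candidate_phosphosites(protein: str) -> list:
    """Return (1-based position, residue name) for every S, T, or Y."""
    return [
        (position, PHOSPHO_ACCEPTORS[residue])
        for position, residue in enumerate(protein.upper(), start=1)
        if residue in PHOSPHO_ACCEPTORS
    ]


# Hypothetical 8-residue peptide.
print(candidate_phosphosites("MASKTYLE"))
# [(3, 'serine'), (5, 'threonine'), (6, 'tyrosine')]
```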

Trafficking and Secretion

Proteins are not always used where they are made. Many must be transported to specific locations — the nucleus, mitochondria, cell membrane, or outside the cell entirely. Proteins destined for secretion or insertion into membranes are synthesised on ribosomes attached to the rough endoplasmic reticulum (ER). They enter the ER lumen, where further folding and glycosylation occur, then travel in vesicles to the Golgi apparatus for sorting and further modification, before being dispatched to their final destination.

Proteins destined for the nucleus carry a nuclear localisation signal — a short amino acid sequence recognised by import proteins called importins. Mitochondrial proteins carry analogous targeting sequences that direct them to be imported across the mitochondrial membranes.
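
A targeting signal is, at heart, a short pattern in the amino acid sequence, so a first-pass search can be written as a regular expression. The pattern below is an assumption brought in for illustration, not something stated in this text: it encodes a commonly cited consensus for classical single-cluster signals (lysine, then lysine or arginine, then any residue, then lysine or arginine). Matches are only candidates; many functional signals do not fit it and many matches are not functional. The test sequence is hypothetical.

```python
import re

# Assumed consensus for a classical nuclear localisation signal: K-(K/R)-X-(K/R).
NLS_PATTERN = re.compile(r"K[KR].[KR]")


def find_nls_candidates(protein: str) -> list:
    """Return (1-based start, matched motif) for each non-overlapping match."""
    return [
        (match.start() + 1, match.group())
        for match in NLS_PATTERN.finditer(protein.upper())
    ]


print(find_nls_candidates("MAPKKKRKVED"))  # [(4, 'KKKR')]
print(find_nls_candidates("MAGSLLEDV"))    # []
```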

Regulation of Protein Production

Cells produce only the proteins they need, at the right times and in the right amounts. Gene expression is regulated at every stage of the process:

Level	Mechanism	Example
Transcriptional	Transcription factors bind DNA to activate or repress genes	p53 activates DNA repair genes in response to damage
Epigenetic	DNA methylation and histone modification alter gene accessibility	Silencing of genes during cell differentiation
Post-transcriptional	MicroRNAs (miRNAs) bind mRNA and block translation or trigger degradation	miR-21 suppresses tumour suppressor genes in some cancers
Translational	RNA-binding proteins or ribosome availability control translation rates	Ferritin mRNA translation blocked when iron is low
Post-translational	Modifications alter activity, stability, or localisation	Phosphorylation activates or deactivates enzymes
Protein degradation	Ubiquitin–proteasome system degrades unwanted proteins	Cyclin degradation drives cell cycle progression
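
How these levels combine to set the amount of a protein can be sketched with a simple two-step model, assuming constant rates: mRNA is made at rate k_tx and degraded at rate d_m per molecule, and protein is made at rate k_tl per mRNA and degraded at rate d_p per molecule. At steady state, production balances degradation, giving mRNA = k_tx / d_m and protein = k_tl × mRNA / d_p. The parameter names and values are hypothetical and in arbitrary units.

```python
def steady_state(k_tx: float, d_m: float, k_tl: float, d_p: float) -> tuple:
    """Steady-state (mRNA, protein) levels for constant synthesis and decay rates."""
    mrna = k_tx / d_m            # transcription balanced by mRNA decay
    protein = k_tl * mrna / d_p  # translation balanced by protein decay
    return mrna, protein


baseline = steady_state(k_tx=2.0, d_m=0.25, k_tl=5.0, d_p=0.5)
print(baseline)  # (8.0, 80.0)

# Faster protein degradation (for example, more ubiquitination) halves the protein level.
print(steady_state(k_tx=2.0, d_m=0.25, k_tl=5.0, d_p=1.0))  # (8.0, 40.0)

# Blocking half of translation (for example, by a microRNA) has the same effect.
print(steady_state(k_tx=2.0, d_m=0.25, k_tl=2.5, d_p=0.5))  # (8.0, 40.0)
```

In this simplified picture, very different mechanisms can produce the same final protein level, which is one reason regulation at several stages gives cells fine control.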

When Things Go Wrong

Errors at any stage of protein production can have serious consequences. Mutations in DNA can alter the amino acid sequence, producing a non-functional or harmful protein — as seen in sickle cell anaemia, where a single nucleotide change causes haemoglobin to polymerise under low-oxygen conditions. Errors in splicing can produce aberrant proteins implicated in cancer. Misfolded proteins that escape quality control and aggregate into insoluble clumps underlie neurodegenerative diseases such as Alzheimer's (amyloid-beta and tau), Parkinson's (alpha-synuclein), and prion diseases.
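
The sickle cell example can be traced through the genetic code directly. The sketch below assumes the sequence shown is the first nine codons of the human beta-globin (HBB) coding sequence and that the disease-causing change is an A to U substitution at position 20 of the message; both details should be checked against a reference sequence record before being relied on. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in UCAG order; '*' marks stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_fragment(mrna: str) -> str:
    """Translate an in-frame fragment codon by codon (one-letter codes)."""
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))


def substitute(mrna: str, position: int, new_base: str) -> str:
    """Replace the base at a 1-based position with a different base."""
    index = position - 1
    return mrna[:index] + new_base + mrna[index + 1:]


# Assumed start of the beta-globin message (see the caveat above).
normal = "AUGGUGCAUCUGACUCCUGAGGAGAAG"
sickle = substitute(normal, 20, "U")  # GAG becomes GUG

print(translate_fragment(normal))  # MVHLTPEEK
print(translate_fragment(sickle))  # MVHLTPVEK
```

A single base change swaps one glutamate (E) for a valine (V), leaving every other residue in the fragment untouched.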

Understanding precisely how proteins are made has opened vast therapeutic possibilities. Synthetic mRNA — delivered into cells — can instruct them to produce therapeutic proteins directly, a strategy famously employed in the mRNA vaccines developed against SARS-CoV-2. Antisense oligonucleotides can block faulty mRNA before it is translated. Small-molecule drugs can inhibit specific enzymes or block aberrant post-translational modifications. The machinery of protein synthesis is not only fundamental to life — it is one of medicine's most promising frontiers.

This document provides a general scientific overview of protein synthesis for educational purposes.