DNA Sequencing: Sanger sequencing method

Introduction to Sanger sequencing

Sanger sequencing, developed by Frederick Sanger and colleagues in 1977, is a pioneering method for DNA sequencing. The technique relies on the random incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication, followed by separation of the resulting fragments by electrophoresis. It was the predominant sequencing method for roughly four decades after its introduction and was first commercialized by Applied Biosystems in 1986.

The first direct DNA sequencing method
Sequencing by synthesis
Dideoxy termination method
DNA polymerase + pool of unmodified dNTPs + pool of dideoxy NTPs (ddNTPs)

Methodology

Sanger sequencing uses the chain-termination method, which requires a single-stranded DNA template, a DNA primer, a DNA polymerase, normal deoxynucleotide triphosphates (dNTPs), and modified dideoxynucleotide triphosphates (ddNTPs), the latter of which terminate DNA strand elongation. These chain-terminating nucleotides lack the 3′-OH group required for the formation of a phosphodiester bond between two nucleotides, so DNA polymerase stops extending the strand once a ddNTP is incorporated. The ddNTPs may be radioactively or fluorescently labelled for detection in automated sequencing machines.

What is the difference between a deoxyribonucleotide and a dideoxyribonucleotide? Why are dideoxyribonucleotides used in Sanger’s method of DNA sequencing? What happens if a very high or a very low amount of ddNTPs is used in Sanger’s method of DNA sequencing? Discuss.

The key difference between a deoxyribonucleotide and a dideoxyribonucleotide lies in their chemical structure. A deoxyribonucleotide contains a deoxyribose sugar with a free 3′ hydroxyl group, a nitrogenous base (adenine, guanine, cytosine, or thymine), and a phosphate group. A dideoxyribonucleotide, on the other hand, lacks the 3′ hydroxyl group on the sugar, so no further phosphodiester bond can be formed once it has been added to a strand.

Dideoxyribonucleotides are used in Sanger’s method of DNA sequencing as chain terminators. When a dideoxyribonucleotide is incorporated into a growing DNA strand during the sequencing reaction, it prevents further elongation of that strand, resulting in DNA fragments of varying lengths.

Property	Deoxyribonucleotide	Dideoxyribonucleotide
3′ Hydroxyl Group	Present	Absent
Forms Phosphodiester Bonds	Yes	No
Function in Sanger Sequencing	Allows strand elongation	Terminates strand elongation

If very high amounts of ddNTPs (dideoxyribonucleotides) are used in Sanger’s method:

More DNA strands will terminate prematurely
Resulting in shorter fragment lengths
Potentially leading to incomplete or inaccurate sequence information

If very low amounts of ddNTPs are used:

Fewer DNA strands will terminate
Resulting in longer fragment lengths
Potentially making it difficult to resolve and distinguish individual fragments
Leading to incomplete or inaccurate sequence information
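
The effect of the ddNTP amount can be illustrated with a simple probability model. The sketch below is not part of the original method description. It assumes that, at each position where the terminating base is required, the polymerase picks a ddNTP with a fixed probability equal to the ddNTP share of the nucleotide pool (dd_fraction, a hypothetical parameter), and it ignores any polymerase preference for dNTPs over ddNTPs. Under that assumption, termination follows a geometric distribution.

```python
def termination_distribution(dd_fraction: float, n_sites: int) -> list[float]:
    """Probability that a strand terminates at the k-th eligible site, k = 1..n_sites.

    An "eligible site" is a position where the base matching the ddNTP is needed.
    Assumes independent incorporation events with a constant ddNTP probability.
    """
    return [dd_fraction * (1 - dd_fraction) ** (k - 1) for k in range(1, n_sites + 1)]


def fraction_read_through(dd_fraction: float, n_sites: int) -> float:
    """Share of strands that pass all n_sites without incorporating a ddNTP."""
    return (1 - dd_fraction) ** n_sites


if __name__ == "__main__":
    n_sites = 200  # hypothetical number of eligible sites on the template
    for dd_fraction in (0.5, 0.01, 0.0001):  # very high, moderate, very low
        dist = termination_distribution(dd_fraction, n_sites)
        print(
            f"ddNTP share {dd_fraction}: "
            f"P(stop at site 1) = {dist[0]:.3g}, "
            f"P(stop at site 100) = {dist[99]:.3g}, "
            f"read-through = {fraction_read_through(dd_fraction, n_sites):.3g}"
        )
```

In this model a very high ddNTP share stops almost every strand within the first few eligible sites, so distant positions yield essentially no fragments. A very low share lets most strands run past every site, so each individual fragment length is represented by only a small fraction of the strands.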

The Sanger (chain-termination) method for DNA sequencing.

By Estevezj – Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=23264166

The DNA sample is divided into four separate sequencing reactions, containing all four of the standard deoxynucleotides (dATP, dGTP, dCTP and dTTP) and the DNA polymerase.
To each reaction is added only one of the four dideoxynucleotides (ddATP, ddGTP, ddCTP, or ddTTP), while the other added nucleotides are ordinary ones.
The deoxynucleotide concentration should be approximately 100-fold higher than that of the corresponding dideoxynucleotide (e.g. 0.5 mM dTTP : 0.005 mM ddTTP) so that enough fragments are produced while extension still covers the complete sequence.
Following rounds of template DNA extension from the bound primer, the resulting DNA fragments are heat denatured and separated by size using gel electrophoresis.
The DNA bands can be visualized by autoradiography or UV light, and the DNA sequence can be directly read off the X-ray film or gel image.
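
The steps above can be sketched as a small simulation. This is an illustrative toy model, not a description of any real instrument or software: the function names, the toy template, and the parameters dd_fraction, n_strands and seed are all hypothetical. The template is written 3′ to 5′ so that the new strand grows 5′ to 3′, fragment lengths are counted from the end of the primer, and the default dd_fraction of 0.01 only approximates a 100:1 dNTP:ddNTP ratio under the assumption that the polymerase does not discriminate between the two.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def run_reaction(template_3to5, dd_base, dd_fraction, n_strands, rng):
    """Simulate one reaction containing a single ddNTP (dd_base).

    Returns the set of fragment lengths (bases added after the primer)
    at which at least one strand was terminated.
    """
    lengths = set()
    for _ in range(n_strands):
        for position, template_base in enumerate(template_3to5, start=1):
            incoming = COMPLEMENT[template_base]  # base added to the new strand
            # Termination is only possible where the new strand needs dd_base.
            if incoming == dd_base and rng.random() < dd_fraction:
                lengths.add(position)
                break  # no 3'-OH: this strand cannot be extended further
    return lengths


def run_four_reactions(template_3to5, dd_fraction=0.01, n_strands=5000, seed=0):
    """Run the four separate reactions, one per ddNTP, and return one lane each."""
    rng = random.Random(seed)
    return {
        base: run_reaction(template_3to5, base, dd_fraction, n_strands, rng)
        for base in "ACGT"
    }


if __name__ == "__main__":
    lanes = run_four_reactions("TACGGATC")  # toy template, written 3'->5'
    for base, lengths in lanes.items():
        print(f"dd{base}TP lane: fragment lengths {sorted(lengths)}")
```

Each lane ends up holding exactly the fragment lengths at which its base occurs in the newly synthesized strand, which is the information the gel separates by size.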

DNA Sequence Analysis by Autoradiograph

After the completion of each reaction, polyacrylamide gel electrophoresis (PAGE) is performed. Each reaction mix is loaded in a separate lane. The running conditions must be carefully controlled so that strands differing by just a single nucleotide are separated. PAGE is done under denaturing conditions in the presence of urea or, less frequently, formamide. Urea and formamide lower the melting temperature of the DNA and denature it by disrupting hydrogen bonds, so the newly synthesized strand separates from the template strand.

Electrophoresis is carried out at high voltage; the heat generated in the gel helps prevent renaturation of the DNA.

After the run is complete, the gel is transferred onto a nitrocellulose filter and autoradiography is performed, so that only fragments carrying the 5′ radiolabel are visible as bands. In PAGE the shortest fragments migrate fastest, so the bottom-most band corresponds to the first dideoxynucleotide whose incorporation stopped chain elongation, and it is therefore the first nucleotide of the sequence. Reading the bands from bottom to top across all four lanes gives the combined DNA sequence of the query (Fig. 4).

A schematic diagram of the Sanger DNA sequencing method with an autoradiograph. The sequence is read in the 5′ to 3′ direction and is the complement of the query.
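
The bottom-up reading of the gel can be expressed in a few lines of code. The sketch below is a minimal illustration under stated assumptions: each lane is represented as a set of fragment lengths (a hypothetical data layout, with length 1 being the bottom-most band), every position gives a band in exactly one lane, and the function names are invented for this example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def read_gel(lanes: dict[str, set[int]]) -> str:
    """Read the synthesized strand 5'->3' from the bands of the four lanes.

    lanes maps the ddNTP base of each lane to the fragment lengths seen in it.
    The shortest fragment (bottom of the gel) gives the first base.
    """
    band_to_base = {}
    for base, lengths in lanes.items():
        for length in lengths:
            if length in band_to_base:
                raise ValueError(f"band of length {length} appears in two lanes")
            band_to_base[length] = base
    # Sorting by length corresponds to reading the gel from bottom to top.
    return "".join(band_to_base[length] for length in sorted(band_to_base))


def reverse_complement(sequence_5to3: str) -> str:
    """Return the complementary strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence_5to3))


if __name__ == "__main__":
    gel = {"A": {1, 7}, "T": {2, 6}, "G": {3, 8}, "C": {4, 5}}  # toy band pattern
    read = read_gel(gel)
    print(read)                      # ATGCCTAG
    print(reverse_complement(read))  # CTAGGCAT
```

The first printed string is the newly synthesized strand as read from the gel; the second is the template (query) strand it was copied from, written 5′ to 3′.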

Applications of Sanger Sequencing

Sanger sequencing is still widely used for smaller-scale projects, where its accuracy and longer read lengths are beneficial.
It is employed for the validation of results obtained from deep sequencing technologies.
It is actively used in public health initiatives, such as sequencing the spike protein from SARS-CoV-2.
It is used by the CDC’s CaliciNet surveillance network for monitoring norovirus outbreaks.

Key differences between Sanger sequencing and NGS

Feature	Sanger Sequencing	Next-Generation Sequencing (NGS)
Principle	Chain-termination method	Parallel sequencing of millions of DNA fragments
Development	Introduced in 1977 by Frederick Sanger	Emerged in the mid-2000s
Read length	> 500 nucleotides (long reads)	Short reads (100s of base pairs), with some platforms achieving longer reads
Accuracy	Very high (around 99.99%)	High, but may vary between platforms
Speed	Slower	Faster, capable of high-throughput sequencing
Cost per base	Higher cost per base	Lower cost per base
Applications	Often used for smaller-scale projects, validation of results	Commonly used for large-scale genome analysis, whole-genome sequencing, transcriptomics, etc.
Instrumentation	Requires capillary electrophoresis machines	Requires specialized NGS platforms
Labor intensity	Labor-intensive, manual processes	Less labor-intensive due to automation
Error rate	Low error rate (high accuracy)	Low, but may vary by platform and read length
Use in public health	Active role in initiatives like sequencing the spike protein of SARS-CoV-2, surveillance of norovirus outbreaks through CDC’s CaliciNet	Commonly used in various public health initiatives due to high-throughput capabilities and rapid results

Human Genome Sequencing Project

Too expensive and time consuming!
Completed in 2003, took ~13 years
Cost: USD 3 billion

Drawbacks of Sanger sequencing method

Read Length Limitation
Sanger sequencing is limited in its ability to generate long reads compared to some modern sequencing technologies. Read lengths are typically limited to around 500-800 nucleotides.
Low Throughput
Sanger sequencing is a relatively low-throughput method. It involves separate reactions for each DNA fragment, making it less suitable for large-scale or high-throughput sequencing projects.
Labor-Intensive
The process of Sanger sequencing involves manual steps, such as gel preparation, loading, and analysis. This makes it more labor-intensive compared to automated, high-throughput methods like NGS.
Cost-Per-Base
Sanger sequencing can be more expensive on a per-base basis, especially for longer sequences. The cost per base can become a limiting factor for large-scale sequencing projects.
Limited Multiplexing
Multiplexing, the simultaneous sequencing of multiple samples in a single run, is limited in Sanger sequencing. NGS technologies offer much higher multiplexing capabilities.
Inability to Discriminate Homopolymeric Regions
Sanger sequencing may have challenges in accurately determining the length of homopolymeric regions (repeating nucleotides) due to the difficulty in resolving them on the sequencing gel.
Not Suitable for Metagenomics
Sanger sequencing is less suitable for metagenomic studies, where the goal is to sequence genetic material from multiple species within a complex sample. NGS is more adept at handling such diversity.
Slower Turnaround Time
Sanger sequencing, with its manual steps and longer individual reaction times, usually has a slower turnaround time compared to NGS methods.
Limited Dynamic Range
Sanger sequencing may face challenges in accurately quantifying variations in DNA abundance over a wide dynamic range, especially in complex samples.

While Sanger sequencing continues to be valuable for certain applications, especially those requiring high accuracy for short sequences, its limitations have led to the widespread adoption of NGS technologies for large-scale genomics projects.

Emergence of Next Generation Sequencing (NGS) Technologies

Needed:

Direct sequencing method
High-throughput, accurate and reproducible
Cost-effective

Target: Human genome sequencing for $1000
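
To put the Human Genome Project cost and the $1000 target in perspective, the following back-of-the-envelope calculation compares them per base. The genome size of about 3.1 billion base pairs is an assumption that is not stated in the text, and the calculation ignores sequencing coverage (each base is normally read many times), so the numbers are only indicative.

```python
GENOME_SIZE_BP = 3.1e9      # assumption: approximate size of the human genome in base pairs
HGP_COST_USD = 3e9          # Human Genome Project cost quoted in the text
TARGET_COST_USD = 1000      # NGS target cost per genome quoted in the text

hgp_cost_per_base = HGP_COST_USD / GENOME_SIZE_BP
target_cost_per_base = TARGET_COST_USD / GENOME_SIZE_BP
fold_reduction = HGP_COST_USD / TARGET_COST_USD

print(f"Human Genome Project: about {hgp_cost_per_base:.2f} USD per base")
print(f"$1000 genome target:  about {target_cost_per_base:.1e} USD per base")
print(f"Required cost reduction: {fold_reduction:,.0f}-fold")
```

Under these assumptions the target corresponds to a cost reduction of about three million-fold relative to the original project.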

Benefits of NGS Technologies

Ability to sequence thousands of genes or genomic regions simultaneously
Ability to directly sequence unknown genomic fragments or genomes
Capability to sequence a large number of samples in a short time
More power to detect low frequency variants
Cost-effective for processing a large number of samples
