What is single-cell sequencing and why is it important?

Since the first human genome was sequenced over 20 years ago, sequencing technologies and the sample preparation methods associated with them have evolved rapidly. Early sample prep methods required DNA and/or RNA to be extracted from large numbers of cells from a section of tissue or a cell culture. The sequencing data produced was therefore an average of what was happening in all of the cells in that sample, which meant we couldn’t see how any single cell behaves.

However, over the last ten years, major developments in sample prep techniques have enabled us to extract and sequence the genetic material of a single cell. Single-cell sequencing is now becoming an essential part of the biologist’s toolkit. It is used by researchers at the Babraham Institute and others worldwide to further our understanding of embryo development, immunology and much more.

In this blog, we explore:

  • What sequencing is
  • Why single-cell sequencing is so important
  • How single-cell sequencing works

What is sequencing?

Sequencing is a technology that allows us to read the sequence of DNA or RNA. Studying DNA and RNA sequences allows us to understand which sections are needed for a stem cell to become a neuron or for the body to respond to viral infection, for example.

One of the first sequencing methods was developed by Frederick Sanger in the late 1970s. The method was gel-based and used DNA polymerase (an enzyme that catalyses the synthesis of DNA) with a mixture of chain-terminating nucleotides (ddNTPs) and standard nucleotides (dNTPs). Using a mix of ddNTPs and dNTPs causes DNA synthesis to stop early at random during the sequencing reaction, producing DNA fragments of different lengths. Four reactions are run, one for each of the bases that make up DNA, and the DNA fragments produced can be visualised using gel electrophoresis. Fragments run on a gel are separated by length, which means we can read the DNA sequence base by base.
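To make the gel-reading idea concrete, here is a minimal Python sketch. It is a toy model rather than a simulation of the real chemistry: it assumes every possible stopping point is represented, and the strand `GATTACA` and the function names are made up for illustration.

```python
# Toy model of reading a Sanger gel; not a simulation of the real chemistry.

def sanger_fragments(new_strand):
    """Return {base: [fragment lengths]} for the four termination reactions.

    In the reaction containing the chain-terminating version of a base,
    synthesis can stop wherever the new strand has that base, so the
    fragment lengths in that lane mark the positions of that base.
    """
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Read the sequence by walking the bands from shortest to longest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


new_strand = "GATTACA"  # hypothetical newly synthesised strand
lanes = sanger_fragments(new_strand)
print(lanes)            # fragment lengths seen in each of the four lanes
print(read_gel(lanes))  # sequence recovered from the band order
```

Because each lane only tells us where one base occurs, it is the ordering of bands by length across all four lanes that recovers the full sequence.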

By the 1980s, Sanger’s original method had been optimised, automated and made commercially available. This allowed scientists to feed a prepared DNA sample into a machine and view the gel results. Sanger sequencing was at the core of the Human Genome Project and is still used today for some experiments. During the 2000s, considerable effort went into reducing the cost of sequencing and developing high-throughput methods that would allow scientists to perform large sequencing projects.

Today, there are several different approaches to sequencing, and one of the most widely used is Illumina sequencing. There are four main stages to the Illumina sequencing workflow: library preparation, cluster generation, sequencing and data analysis.

  1. Library preparation: Library prep converts the RNA or DNA samples being studied into a library that is compatible with the sequencing instrument. If studying RNA, it must first be converted into double-stranded complementary DNA (cDNA). Next, the cDNA or DNA is fragmented before the addition of adapters, short pieces of DNA that are designed to be compatible with the sequencer. Finally, the adapter-ligated fragments are amplified using PCR.
  2. Cluster generation: The prepared library is loaded onto a flow cell. The surface of the flow cell is covered by a lawn of primers (short fragments of DNA) that are complementary to the adapters added during library prep. The primer lawn captures the library fragments. Each captured library fragment is amplified by bridge amplification to create “clusters” of a single library fragment. Each cluster contains thousands of copies of the same library fragment and there are tens of millions of different clusters per flow cell. When cluster generation is finished, the library fragments are ready to be sequenced.
  3. Sequencing: Sequencing by synthesis (SBS) uses four fluorescently labelled nucleotides. Using the clustered library fragments as a template, a nucleic acid chain is synthesised using the labelled bases. During each sequencing cycle, a single labelled base is incorporated into the growing chain. The labelled base blocks further synthesis, which prevents more than one base being added per cycle. After each base is incorporated, the fluorescent label is imaged to identify which base was added. The label is then removed so the next base can be added.
  4. Data analysis: The sequences produced from sequencing can be compared against reference genomes (like the human genome) to identify which bits of DNA or RNA were present in the original sample. After this initial alignment, many other data analysis techniques can be applied to answer research questions to better understand phenomena such as antimicrobial resistance and cancer.
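The alignment step in stage 4 can be illustrated with a deliberately naive Python sketch. It assumes reads match the reference exactly and uses a made-up reference and reads; real aligners tolerate mismatches and use indexed data structures to cope with genome-sized references, so treat this only as a picture of the idea.

```python
def align_reads(reference, reads):
    """Map each read to the 0-based position of its first exact match.

    Reads that do not occur in the reference map to None.
    """
    positions = {}
    for read in reads:
        index = reference.find(read)  # str.find returns -1 when there is no match
        positions[read] = index if index != -1 else None
    return positions


reference = "ACGTTAGCCGATAGGCTA"       # hypothetical reference sequence
reads = ["TTAGC", "GATAGG", "CCCCC"]   # hypothetical sequencing reads

hits = align_reads(reference, reads)
for read, position in hits.items():
    print(read, "->", position)
```

Reads that find a position tell us which parts of the reference were present in the sample; reads with no match would need a more forgiving method or a different reference.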

Put simply, sequencing helps increase our understanding of the world around us and supports the development of things like new therapeutics that can benefit the global community. The next evolution of sequencing, single-cell sequencing, is already unlocking knowledge that could lead to exciting developments.

Why sequence single cells?

The human body is made up of over 200 different cell types that work together in complex biological systems. Almost all the cells in the body share the same DNA, but the ‘marks’ that regulate how the DNA is read by the cell, and the RNA present, differ between cells in a sample.

“Bulk” sequencing methods (where we extract DNA/RNA from a large number of cells) were a major development in the early 2000s. They are very useful for comparisons between different species, e.g. how similar the brains of mice and humans are, and for comparing the levels of RNA in cells from patients with different conditions against those from healthy individuals. The data produced by bulk methods is an average of what is happening in all the cells in the sample. This means bulk data is not ideal for looking at systems made up of many different specialised cells, like the brain, or complex systems such as developing embryos.
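A small Python sketch shows why an average can mislead. The expression values and cell labels below are invented for illustration and do not come from any real dataset.

```python
from statistics import mean

# Hypothetical expression of one gene in six cells from a mixed sample.
cells = {
    "neuron_1": 10, "neuron_2": 12, "neuron_3": 11,
    "glia_1": 0, "glia_2": 0, "glia_3": 0,
}

# A bulk experiment only sees the average across every cell.
bulk_signal = mean(cells.values())

# With single-cell data we can group cells by type before averaging.
by_type = {}
for name, value in cells.items():
    cell_type = name.split("_")[0]
    by_type.setdefault(cell_type, []).append(value)
type_means = {cell_type: mean(values) for cell_type, values in by_type.items()}

print("bulk average:", bulk_signal)
print("per cell type:", type_means)
```

In this toy sample the bulk value sits between the two groups and describes no actual cell, whereas the per-cell view shows one cell type using the gene and the other not using it at all.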

Single-cell sequencing is a relatively new technology that allows sequencing data to be linked back to an individual cell in a sample. This means we can now answer questions where cell-specific differences are important. An example of this is research by the Reik group into early embryo development.

[Figure: Sequencing experiment set-up]

How do we sequence single cells?

The key to sequencing DNA and RNA from single cells is library preparation methods that have been developed to work with the tiny amounts of DNA and RNA in a single cell – the actual sequencing stage is the same as for bulk samples.

A few strategies for single-cell library prep have evolved, but they share common features. For example, tissue samples like tumours must be broken up into a single-cell suspension so cells can be more easily isolated from one another. The downside of this is that we lose information about where the cell was located in the original sample. Another common feature is that the RNA, DNA or cDNA from each cell is barcoded in some way. These barcodes allow us to identify which cell the DNA/RNA/cDNA belongs to. Unlike the barcodes you’re used to seeing in a supermarket, these barcodes are short bits of DNA with a known sequence that can be used during data analysis to group all the sequences from one cell together. Frequently used methods include:

  • Isolating cells in 96-well plates – individual cells can be dispensed into wells of a 96-well plate using flow cytometry or automated methods like the Fluidigm C1. Methods to amplify the whole transcriptome (RNA) or genome (DNA) can then be used to generate enough cDNA/DNA for library preparation, and the libraries are prepared in the wells of the plate. Barcodes are added during library prep, so one library is generated per cell.
  • Isolating cells in microdroplets or microwells – cells can be partitioned alongside barcoded gel beads in nanolitre scale oil droplets (10x Genomics) or in microwell chips (BD Rhapsody/Takara iCELL8). Barcodes supplied by the gel beads are added to the cDNA/DNA in the droplets or microwells. The cDNA/DNA from all of the cells is then pooled and a library is prepared from this pool. During data analysis the barcodes are used to tell which DNA or RNA came from which cells.

There are pros and cons of both approaches but having different techniques available allows researchers to pick the one most suited to answering their question.
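The barcode-based grouping described above can be sketched in a few lines of Python. This is a simplified illustration, assuming each pooled read starts with a fixed-length cell barcode and contains no sequencing errors; the 4-base barcode length, the reads and the function name are all made up (real barcodes are longer and real pipelines correct for errors).

```python
from collections import defaultdict

BARCODE_LENGTH = 4  # assumption for this toy example; real barcodes are longer


def group_by_cell(pooled_reads, barcode_length=BARCODE_LENGTH):
    """Group pooled reads by the cell barcode at the start of each read."""
    cells = defaultdict(list)
    for read in pooled_reads:
        barcode = read[:barcode_length]   # short known sequence identifying the cell
        insert = read[barcode_length:]    # the cell's own DNA/cDNA sequence
        cells[barcode].append(insert)
    return dict(cells)


# Hypothetical pooled reads from two cells, barcoded AAGT and CCTA.
pooled = ["AAGTGGCATTC", "CCTATTGACAG", "AAGTCCGATAA", "CCTAGGGTTTC"]

per_cell = group_by_cell(pooled)
for barcode, inserts in per_cell.items():
    print(barcode, inserts)
```

Once reads are grouped like this, each group can be analysed as the data from one cell, even though all the material was pooled before sequencing.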

The power of single-cell sequencing

Single-cell sequencing is allowing scientists to answer questions where the differences between individual cells in a system are vitally important. At the Institute, it is helping scientists in the Reik group to investigate how the cells of an embryo divide and become the 200 different cell types in the human body, and it is also used by scientists in the Liston and Turner labs, to name a few. Single-cell sequencing is also at the core of the Human Cell Atlas, an international effort to create a comprehensive map of all the cells in the human body. By understanding the location and function of all cells in the body, we will be able to unlock key insights into human health that will enable scientists to develop new preventatives and therapeutics.

Sequencing at the Babraham Institute

At the Institute, we have eight cutting-edge facilities, including one dedicated to sequencing. Our Genomics facility offers single-cell sequencing using 10x Genomics Chromium and SMART-seq v4 on Illumina sequencers. We also support Babraham Institute researchers with their custom sequencing projects, and the facility is open to external users on a fee-for-service basis.