Cell Ranger DNA, printed on 09/27/2020
Genomic heterogeneity is the hallmark of many complex diseases, including cancer, and is characterized by cellular subpopulations evolving distinct genotypes and phenotypes. Traditional bulk methods, such as whole-genome sequencing and microarrays, limit our ability to accurately assess and investigate the clonal makeup in these complex diseases. Single-cell DNA sequencing is a powerful method to resolve genomic heterogeneity by building deep profiles of individual cells that otherwise get masked.
The Chromium Single Cell CNV (scCNV) Solution provides a comprehensive, scalable approach to profile hundreds to thousands of cells in a single library. This page is designed to help you succeed with your scCNV experiments by addressing some commonly observed challenges. It also provides best-practice guidance on sample preparation, library construction, sequencing, and data analysis. It is not intended to replace the User Guide; rather, it should be viewed as a complementary document.
Your success is important to us. If you have any questions or concerns please email [email protected].
The Chromium Single Cell CNV Solution is designed to align single cell DNA data to a reference genome and detect regional variations in copy number (CNVs), that is, the number of copies of a given genomic region (2 Mb to whole chromosome level) that are present in a sample. See Velazquez-Villarreal et al. (2019) as an example.
At the single cell level, this solution can resolve copy number events as small as 2 Mb. Resolution improves to as fine as 200 kb when single cell data are hierarchically clustered together. It is possible to further improve the event resolution with higher sequencing depth.
Calling CNVs as described above should not be confused with calling absolute mean ploidy, or the mean estimated copy number over all primary contigs for a given cell or group of cells (e.g., calling whole genome duplications of 2N, 4N, 8N). The solution is not designed to distinguish cell populations that differ in their absolute mean ploidy (although we do estimate a scale factor for each cell and node in the tree to make CNV calls). For example, a tetraploid (4N) cell in the G1 phase of the cell cycle will have much the same signal as a diploid (2N) cell in the G2 phase. (If most of the genome is the same copy number, we assume it is diploid, although this can be adjusted with the --soft-max-avg-ploidy and --soft-min-avg-ploidy options).
For more details on the algorithms underlying CNV calling and scaling, see the CNV calling algorithms page.
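To see why read depth alone cannot separate cells that differ only in absolute ploidy, consider the following minimal Python sketch. It assumes an idealized, noise-free model in which the expected reads in each bin are proportional to that bin's copy number; the function name, the five-bin genome, and the read total are illustrative assumptions and are not part of Cell Ranger DNA.

```python
def expected_bin_reads(copy_numbers, total_reads):
    """Split total_reads across bins in proportion to copy number (idealized model)."""
    total_copies = sum(copy_numbers)
    return [total_reads * cn / total_copies for cn in copy_numbers]

# Hypothetical 5-bin genome: a diploid G1 cell with a single-copy gain in bin 3.
diploid_g1 = [2, 2, 3, 2, 2]
# The same cell after DNA replication (G2): every bin is doubled.
diploid_g2 = [2 * cn for cn in diploid_g1]
# A tetraploid G1 cell carrying the equivalent gain.
tetraploid_g1 = [4, 4, 6, 4, 4]

reads = 110_000  # arbitrary read total, identical for all three cells
profile_2n_g1 = expected_bin_reads(diploid_g1, reads)
profile_2n_g2 = expected_bin_reads(diploid_g2, reads)
profile_4n_g1 = expected_bin_reads(tetraploid_g1, reads)

# All three cells produce the same relative depth profile.
print(profile_2n_g1 == profile_2n_g2 == profile_4n_g1)
```

Under this model, multiplying every copy number by a constant leaves the profile unchanged, which is why a per-cell scale factor has to be estimated rather than read directly from the data.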
We offer two pre-built human genome references (GRCh38 and GRCh37) and one mouse (GRCm38) reference which are available for download and fully supported.
Yes, we provide a mkref tool in the Cell Ranger DNA pipeline that can be used to customize the human or mouse references, or prepare a reference genome for another species.
Successful experiments with other species depend on several factors including:
We have tested this solution in-house on human and mouse samples. Some of our customers have had success generating scCNV data for invertebrate species, but most of these datasets are not published yet. Check our scCNV publications page for more information.
Multi-species (“barnyard”) references are not supported at this time. However, it is possible to run a xenograft sample against the human reference alone.
The solution requires a fully dissociated, single cell or nuclei suspension. The Single Cell CNV workflow is not currently supported for fixed tissues. Fresh and frozen samples are supported.
We have successfully tested nuclei and cells with a cell diameter of up to 30 microns (μm) in suspension. There is no minimum cell size.
We have tested our solution primarily on cells and nuclei using normal tissues, tumor tissues, and cell lines from human and mouse. Tissues we have analyzed include liver, breast, prostate, kidney, brain, colon, and lung.
For other cell types, modifications to the demonstrated protocol(s) may be required. Areas for optimization and consideration include tissue homogenization, centrifugation speed and time, and filtration steps.
There are a few factors to consider when determining whether to isolate cells or nuclei. These include cell size, whether you have fresh or frozen tissue, and whether you have experience with isolating cells or nuclei from your specific sample type.
Cells with diameter ≥ 30 microns increase the risk of clogging Chip C and should be avoided. For these larger cells, we recommend isolating nuclei.
For frozen tissue, we recommend isolating nuclei directly. High starting sample quality is important. Specifically, how the tissue was dissected and frozen, and how long it was thawed beforehand, both affect data quality.
A demonstrated protocol specific to the scCNV solution for the isolation of high quality nuclei from snap-frozen tissue is available: Isolation of Nuclei for Single Cell DNA Sequencing.
We have not yet developed any scCNV-specific protocols for preparing cell suspensions, but you can adapt demonstrated protocols from our 3' and 5' Single Cell Gene Expression Solutions: Which 10x Demonstrated Protocols are compatible with the single cell DNA workflow?
While genomic DNA is more robust than messenger RNA, it is important to treat cells gently to minimize cell lysis and loss. DNA from both viable and non-viable cells can be profiled, so pay extra attention to time-sensitive steps in the demonstrated protocol and User Guide. For more information, see Do my cells need to be viable for Single Cell DNA analysis?
Accurate cell counting is critical to your success with this application. Stain cells with trypan blue and count the cell concentration and viability using an automated cell counter. Unlysed, viable cells will not take up the stain while nuclei will stain blue and be counted as dead. Cell suspensions containing very small cells, highly variable cell sizes, or cell aggregation may require alternative counting methods such as a hemocytometer.
For assessing viability of nuclei, we recommend using a microscope. See: How can I assess the quality of my nuclei for single cell ATAC sequencing? (This article was written for our scATAC assay, but also applies to scCNV).
For counting nuclei, ethidium homodimer-1 or other fluorescent dyes may help distinguish nuclei from debris for more accurate quantitation.
You can target up to 5,000 cells in a single chip channel. Higher cell counts increase the probability of observing rare populations, but also increase sequencing costs. Cell Ranger DNA can analyze up to 20,000 cells.
You can expect up to 15% cell recovery efficiency after sequencing. To this end, you can load 1,600 to 31,300 cells for a target of 250 to 5,000 cells recovered. See page 20 of the User Guide (Chromium Single Cell DNA Reagent Kits; CG000153 Rev C) for more details. Identifying the right starting cell count is particularly important for rare or precious cell types.
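The relationship between loaded and recovered cells is simple arithmetic, sketched below in Python. The function names and the default efficiency of 0.15 are assumptions taken from the figures quoted above; the loading table in the User Guide remains the authoritative source.

```python
import math

def implied_efficiency(cells_recovered, cells_loaded):
    """Recovery efficiency implied by a (recovered, loaded) pair."""
    return cells_recovered / cells_loaded

def cells_to_load(target_recovered, recovery_efficiency=0.15):
    """Estimate how many cells to load for a target number of recovered cells.

    recovery_efficiency defaults to the 15% figure quoted in the text;
    it is an assumption and will vary with sample type and handling.
    """
    if not 0 < recovery_efficiency <= 1:
        raise ValueError("recovery_efficiency must be in (0, 1]")
    return math.ceil(target_recovered / recovery_efficiency)

# Efficiency implied by the two endpoints quoted above.
low_end = implied_efficiency(250, 1_600)
high_end = implied_efficiency(5_000, 31_300)
print(round(low_end, 3), round(high_end, 3))

# Rough loading estimate for a target of 1,000 recovered cells.
print(cells_to_load(1_000))
```

The endpoints quoted above imply an efficiency slightly above 15%, so an estimate made with the default errs toward loading a few more cells than the User Guide table suggests.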
It is important to follow the User Guide exactly. We recommend watching the How-to-videos before getting started in the lab. Familiarize yourself with Tips and Best Practices and the Troubleshooting section (7.1-7.3) of the User Guide. Follow the step-by-step instructions along with reagent handling preparation. It is important to load your cells into the chip as soon as possible (ideally < 30 minutes) after preparation and counting.
Some of the critical steps are highlighted below:
Step 1.1 C – Preparing Cell Bead Mix. During step iii, ensure Activation Reagent is fully mixed before proceeding, and avoid introducing air bubbles. During steps iv-vi, do not mix, and avoid touching the interface with the pipette tip while layering reagents.
Step 1.1 D – Load Row Labeled 1. Slowly pipette the Cell Bead Mix until it is fully resuspended (this will take some time). Ensure the cell matrix is not still clumped on the tube bottom. Do not proceed until the mix is completely homogeneous and free of bubbles. The volume of Cell Bead Mix is limiting, so use the same pipette tips for mixing the Cell Bead Mix and for loading row 1 of the chip, or you may come up short. Avoid introducing air bubbles when mixing and loading the chip. Accurate Cell Bead Mix volume is essential to achieve optimal performance.
Step 1.2 H. Slowly aspirate the remaining Partitioning oil and Cell Beads from the Chip C recovery well using wide bore pipette tips (not cut tips). Slowly dispense onto sidewalls of 8-tube strip. Seal 8-strip tightly. Transfer cell beads immediately after recovery onto a shaking thermomixer set at 21 °C, 1000 rpm, for 16-24 hours. Data quality will be severely affected if Cell Beads are not recovered and placed on shaker within 5 minutes of the Chromium run.
Our assay is compatible with Illumina sequencers. Please see our supported sequencers and requirements here.
Yes, in Cell Ranger DNA 1.1 we updated our mappability threshold and added mappability normalization, in part to improve event contiguity when using 2x50bp sequencing configurations. Consult the release notes for further details. Our recommendation continues to be 2x100bp as mapping rates are generally better with longer reads. Single-end reads are not supported.
10x scCNV libraries are single indexed, so single indexing is the only supported and recommended method. If you choose to use a dual-index configuration, use the --use-bases-mask and --ignore-dual-index options in cellranger-dna mkfastq. We are unable to guarantee the quality of data from dual-indexed runs at this time and accordingly will not be able to provide any further guidance or troubleshooting support.
The sequencing depth should be informed by the total number of cells you wish to profile and the size of CNV events that you wish to detect at the single cell level. A high cell count allows for the detection of rare populations. The sequencing depth per cell determines the size of CNVs that can be reliably detected; with a higher sequencing depth, you can detect smaller events. We recommend a sequencing depth in the range of 50K-750K read pairs per cell, which corresponds to single cell CNV event sizes of roughly 13 Mb down to 1.4 Mb. Refer to the table below:
|Read pairs per cell|Single cell CNV resolution (Mb)|
|50K|13 ± 4|
|100K|7 ± 2|
|150K|5 ± 2|
|300K|2.5 ± 0.7|
|500K|1.8 ± 0.5|
|750K|1.4 ± 0.3|
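As a planning aid, the table can be turned into a small lookup, sketched below in Python. The function names are illustrative assumptions; the sketch uses only the nominal resolution values from the table and ignores the quoted uncertainty ranges, so treat its output as a starting point rather than a guarantee.

```python
# Nominal single cell CNV resolution (Mb) by read pairs per cell, from the table above.
RESOLUTION_MB = {
    50_000: 13.0,
    100_000: 7.0,
    150_000: 5.0,
    300_000: 2.5,
    500_000: 1.8,
    750_000: 1.4,
}

def min_depth_for_resolution(target_mb):
    """Return the lowest tabulated depth whose nominal resolution is <= target_mb."""
    for depth in sorted(RESOLUTION_MB):
        if RESOLUTION_MB[depth] <= target_mb:
            return depth
    raise ValueError(f"No tabulated depth reaches {target_mb} Mb resolution")

def total_read_pairs(n_cells, read_pairs_per_cell):
    """Total read pairs to request from the sequencer for one library."""
    return n_cells * read_pairs_per_cell

# Example: 1,000 cells with events of about 5 Mb as the detection goal.
depth = min_depth_for_resolution(5.0)
print(depth, total_read_pairs(1_000, depth))
```

Because depth per cell multiplies with cell count, the same sequencing budget can be spent on more cells at coarser resolution or on fewer cells at finer resolution.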
Yes, we offer Cell Ranger DNA, a set of analysis pipelines that perform sample demultiplexing, barcode processing, read alignment, and copy number estimation on single cells and groups of cells defined by hierarchical clustering. See What is Cell Ranger DNA? for more information.
The new version of Cell Ranger DNA offers additional features:
The software is free. The open source code for cellranger-dna 1.0.0 is available on GitHub. We are planning to release the source code for Cell Ranger DNA 1.1.0 on GitHub in the near future, but it is not available yet. Check back soon. The source code for Loupe scDNA Browser and the format of the .dloupe files are proprietary. The license agreement for our software is displayed when you first download through the Software Downloads page.
See the system requirements page for details on cores, RAM, user limits, and disk space. Cell Ranger DNA is very sensitive to I/O speed so we highly recommend solid-state disks or RAID arrays.
HPC cluster or cloud computing is recommended for large-scale experiments (see below).
Cell Ranger DNA, like all 10x pipelines, requires a Linux operating system and is not offered for Mac or Windows. It may be possible to work around this with a Docker container or virtual machine, but this is not supported (and the required CPUs, RAM, and disk space exceed the capabilities of many laptops and desktop PCs).
We want to make our pipelines as easy to install as possible. All you need to do is download the tarball, unpack it, add it to your PATH, and you are on your way. Please see our installation instructions here.
There are no dependencies other than Illumina’s bcl2fastq, which is used for demultiplexing only. (If you are starting with FASTQ files from your core, bcl2fastq is not needed).
We do not offer SaaS at this time. We are very interested to hear what your bioinformatics needs are. Send your feature requests to [email protected].
While we have high computing requirements to run Cell Ranger DNA, we have attempted to make the pipeline as easy as possible to install and run, so you may not need a dedicated bioinformatician if you have access to appropriate resources and modest experience with Linux.
Yes, we recommend using our cellranger-dna mkfastq pipeline, which is a thin wrapper around Illumina’s bcl2fastq. Using bcl2fastq directly is an equally valid option. We don’t recommend demultiplexing directly with BaseSpace as the default adapter trimming options can destroy the 10x barcode (first 16 bp of Read 1).
Actual wall times vary widely based on the hardware and data (number of reads, reference genome, I/O hardware performance). See our empirical performance results at the bottom of the system requirements page.
Please note that the Cell Ranger DNA pipeline for scCNV data is more computationally demanding than Cell Ranger for single cell gene expression data. This is largely due to the increased complexity of the biology involved. In our single cell gene expression assays, we are targeting mRNA molecules, resulting in reduced representation libraries (transcriptome only). By contrast, in the scCNV assay, reads are generated genome-wide (the average scDNA library contains 8 million unique molecules per cell, corresponding to roughly 0.27X human genome coverage), so the datasets are inherently much larger.
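The coverage figure above can be reproduced with a back-of-the-envelope calculation, sketched below in Python. The values of 100 bases per unique molecule and a 3.0 Gb genome are assumptions chosen because they reproduce the quoted 0.27X; they are not stated in the text, and actual values depend on read length and reference.

```python
def fold_coverage(unique_molecules, bases_per_molecule, genome_size_bp):
    """Approximate fold coverage as total sequenced bases divided by genome size."""
    return unique_molecules * bases_per_molecule / genome_size_bp

# 8 million unique molecules per cell (from the text).
# 100 bases per molecule and a 3.0 Gb genome are assumptions.
coverage = fold_coverage(8_000_000, 100, 3_000_000_000)
print(round(coverage, 2))
```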
If you have an LSF or SGE batch-queue system on your cluster, you can set up cluster mode, which allows parallel stages to utilize hundreds or thousands of cores concurrently, dramatically reducing the time required to process the data. We also offer cluster mode templates for Slurm and PBS Pro clusters as part of the Cell Ranger DNA package, although we don’t support these.
The drawback is that setting up and troubleshooting cluster mode can be more challenging, since a single pipeline instance (pipestance) is spread out over multiple jobs. We recommend first making sure everything is working in local mode (the default) and consulting with your cluster administrator before setting up cluster mode. See How do I run and troubleshoot 10x pipelines on a HPC cluster?
See Understanding Output for details on the .dloupe, web_summary.html, CSV, BED, BAM, and H5 files.
We introduced a specific aggr pipeline for this in Cell Ranger DNA v1.1. See Aggregating Multiple GEM groups with cellranger-dna aggr.
Noisy cells are those in which the ploidy_confidence from CNV calling is low or in which the DIMAPD value is an outlier compared to the rest of the sample population. These are often replicating cells in the S phase of the cell cycle, but there can also be technical reasons. See Interpreting Output Metrics.
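The two criteria can be illustrated with a minimal Python sketch. This is not the rule Cell Ranger DNA applies: the confidence cutoff, the median-absolute-deviation outlier test, the one-sided check for high DIMAPD, the dictionary keys, and the example values are all hypothetical and serve only to show the idea of combining a confidence threshold with an outlier test.

```python
from statistics import median

def flag_noisy_cells(cells, min_confidence=2, mad_multiplier=3.0):
    """Return barcodes whose ploidy confidence is low or whose DIMAPD is unusually high.

    Both thresholds are hypothetical, not the pipeline's actual values.
    """
    dimapd_values = [cell["dimapd"] for cell in cells]
    center = median(dimapd_values)
    # Median absolute deviation: a robust measure of spread.
    mad = median([abs(value - center) for value in dimapd_values])
    upper_limit = center + mad_multiplier * mad
    return [
        cell["barcode"]
        for cell in cells
        if cell["ploidy_confidence"] < min_confidence or cell["dimapd"] > upper_limit
    ]

# Made-up per-cell metrics for illustration only.
cells = [
    {"barcode": "cell_a", "ploidy_confidence": 10, "dimapd": 1.00},
    {"barcode": "cell_b", "ploidy_confidence": 12, "dimapd": 1.05},
    {"barcode": "cell_c", "ploidy_confidence": 1, "dimapd": 1.02},
    {"barcode": "cell_d", "ploidy_confidence": 11, "dimapd": 1.90},
    {"barcode": "cell_e", "ploidy_confidence": 9, "dimapd": 0.98},
]
noisy = flag_noisy_cells(cells)
print(noisy)
```

In this made-up data, cell_c is flagged for low confidence and cell_d for an outlying DIMAPD value.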
In v1.1 of Cell Ranger DNA we introduced the cellranger-dna reanalyze pipeline, which you can use to prune your data of unwanted cells and optionally impose constraints on the hierarchical clustering. See Customized Secondary Analysis using cellranger-dna reanalyze.
Single cell SNV calling (not to be confused with CNV calling) is generally not possible due to the low depth of coverage on a per cell basis. However, it is possible to combine reads from cells belonging to the same clone and perform “pseudo-bulk” SNV calling using a third-party tool. In v1.1 of Cell Ranger DNA we introduced the cellranger-dna bamslice pipeline for preparing data for this use case. Note: Third-party SNV callers are not supported by 10x Genomics.
Your success is important to us and we are happy to help. Send your questions, concerns, and feature requests to [email protected].
Check our scCNV publications page frequently for updates.
You can find our how-to videos here.