10x Genomics
Chromium Single Cell CNV

Cell Ranger DNA 1.1, printed on 04/02/2025

Customized Secondary Analysis using cellranger-dna reanalyze

Analysis software for the 10x Genomics single cell DNA product is no longer supported. Raw data processing pipelines and visualization tools are available for download and can be used for analyzing legacy data from 10x Genomics kits in accordance with our end user licensing agreement without support.

The cellranger-dna reanalyze command re-runs copy number variation analysis on the read counts per bin per cell matrix, optionally with different parameters, a subset of cells, or structured by a user-provided Newick-formatted tree.

cellranger-dna reanalyze only works with cnv_data.h5 files generated by Cell Ranger DNA v1.1.

Command Line Interface

These are the most common command line arguments (run cellranger-dna reanalyze --help for a full list):

Argument	Description
`--id=ID`	A unique run ID string: e.g. `AGG123_reanalysis`
`--cnv-data=H5`	Path of `cnv_data.h5` from a previous `cellranger-dna` invocation (`cellranger-dna cnv`, `cellranger-dna reanalyze`, or `cellranger-dna aggr`).
`--reference=PATH`	Path to a Cell Ranger DNA reference.
`--csv=CSV`	Path of CSV file containing barcode subset definitions (see Configuration).
`--description=TEXT`	(optional) More detailed sample description.
`--tree=NEWICK`	(optional) Path to a Newick format tree file that defines the new tree structure. If this flag is not set, the data will be clustered as-is. Each leaf of this tree must correspond to a row in the CSV passed to --csv.
`--soft-min-avg-ploidy=FLOAT`	(optional) Use a known lower limit on the average ploidy of the sample.
`--soft-max-avg-ploidy=FLOAT`	(optional) Use a known upper limit on the average ploidy of the sample.

After specifying these input arguments, run cellranger-dna reanalyze. In this example, we're reanalyzing the results of an aggregation named AGG123:

$ cd /home/jdoe/runs
$ ls -1 AGG123/outs/*.h5 # verify the input file exists
AGG123/outs/cnv_data.h5
$ cellranger-dna reanalyze --id=AGG123_reanalysis \
                       --cnv-data=AGG123/outs/cnv_data.h5 \
                       --csv=AGG123_reanalysis.csv \
                       --reference=/home/jdoe/refs/GRCh37

The pipeline will begin to run, creating a new folder named with the specified reanalysis ID (e.g. /home/jdoe/runs/AGG123_reanalysis). If this folder already exists, cellranger-dna will assume it is an existing pipestance and attempt to resume running it.
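Because an existing folder is treated as a pipestance to resume, it can help to check the inputs and the target folder before launching. The following is a minimal sketch and not part of Cell Ranger DNA: the function name `preflight` is hypothetical, and the paths in the usage comment are the ones from the example above.

```shell
# Hypothetical helper: check reanalyze inputs before launching.
# Usage: preflight RUN_ID CNV_DATA_H5 CONFIG_CSV
preflight() {
  run_id=$1; cnv_data=$2; csv=$3
  status=0
  for f in "$cnv_data" "$csv"; do
    if [ ! -f "$f" ]; then
      echo "missing input file: $f" >&2
      status=1
    fi
  done
  # An existing folder with this name would be resumed, not started fresh.
  if [ -e "$run_id" ]; then
    echo "folder '$run_id' already exists; the run would resume it" >&2
    status=1
  fi
  return "$status"
}

# Launch only if the checks pass (left as a comment so the sketch has no side effects):
# preflight AGG123_reanalysis AGG123/outs/cnv_data.h5 AGG123_reanalysis.csv &&
#   cellranger-dna reanalyze --id=AGG123_reanalysis \
#                            --cnv-data=AGG123/outs/cnv_data.h5 \
#                            --csv=AGG123_reanalysis.csv \
#                            --reference=/home/jdoe/refs/GRCh37
```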

Pipeline Outputs

A successful run should conclude with a message similar to this:

2019-05-06 21:40:29 [runtime] (run:local)       ID.AGG123_reanalysis.CNV_REANALYZER_CS.DLOUPE_PREPROCESS.fork0.join
2019-05-06 21:40:31 [runtime] (chunks_complete) ID.AGG123_reanalysis.CNV_REANALYZER_CS._POSTPROCESSING.MAKE_WEBSUMMARY
2019-05-06 21:40:37 [runtime] (join_complete)   ID.AGG123_reanalysis.CNV_REANALYZER_CS.DLOUPE_PREPROCESS
 
Outputs:
- Run alerts:                       /home/jdoe/runs/AGG123_reanalysis/outs/alarms_summary.txt
- HDF5 file with CNV data:          /home/jdoe/runs/AGG123_reanalysis/outs/cnv_data.h5
- Loupe visualization file:         /home/jdoe/runs/AGG123_reanalysis/outs/dloupe.dloupe
- CNV calls with imputation:        /home/jdoe/runs/AGG123_reanalysis/outs/node_cnv_calls.bed
- CNV calls without imputation:     /home/jdoe/runs/AGG123_reanalysis/outs/node_unmerged_cnv_calls.bed
- Per-cell summary metrics:         /home/jdoe/runs/AGG123_reanalysis/outs/per_cell_summary_metrics.csv
- Reanalyze specification:          /home/jdoe/runs/AGG123_reanalysis/outs/reanalyze.csv
- Analysis summary metrics:         /home/jdoe/runs/AGG123_reanalysis/outs/summary.csv
- Newick guide-tree for clustering: null
 
Pipestance completed successfully!

Refer to the Analysis page for an explanation of the output.

Configuration

Selecting Cells Using a List of Cell Barcodes

You may select your barcodes of interest for each group directly, using a separate barcodes file for each group.

A text editor or Excel may be used to construct the configuration CSV. Your spreadsheet may look something like this:

	A	B
1	library_id	barcodes_csv
2	normal	/home/jdoe/normal_barcodes.csv
3	tumor_primary	/home/jdoe/tumor_primary_barcodes.csv
4	tumor_metastases	/home/jdoe/tumor_metastases_barcodes.csv

When you save this to CSV, it will look something like this:

library_id,barcodes_csv
normal,/home/jdoe/normal_barcodes.csv
tumor_primary,/home/jdoe/tumor_primary_barcodes.csv
tumor_metastases,/home/jdoe/tumor_metastases_barcodes.csv

The barcodes CSV files will each have one barcode entry per line, including the GEM well suffix (see GEM wells). Each such file will look something like:

AAACGGGTCAAAGTGA-1
AAAGATGCAATGGGAC-1
...
TTTGTCATCCGCACGA-1
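A barcodes file exported from a spreadsheet can easily pick up a header row or lose its GEM well suffix. The sketch below lists lines that do not look like the entries shown above. The pattern (a run of A/C/G/T, a hyphen, then a numeric suffix) is inferred from those examples and is an assumption, not a published specification; the function name `bad_barcode_lines` is hypothetical.

```shell
# Hypothetical check: print (with line numbers) every line that does not look
# like BARCODE-N, where BARCODE is a run of A/C/G/T and N is the numeric
# GEM well suffix. Returns success when at least one such line is found.
bad_barcode_lines() {
  tr -d '\r' < "$1" | grep -n -v -E '^[ACGT]+-[0-9]+$'
}

# Example (path taken from the configuration above):
# bad_barcode_lines /home/jdoe/normal_barcodes.csv && echo "fix the lines listed above"
```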

Selecting the Cells of a Group of Interest

You may use a group ID (entered in the node_id column of the CSV), found for example by exploring the data in Loupe scDNA Browser, as a proxy for a list of barcodes. In this case, all constituent cells of that group will be included.

As before, a text editor or Excel may be used to construct the CSV. Your spreadsheet may look something like this:

	A	B
1	library_id	node_id
2	normal	842
3	tumor_primary	912
4	tumor_metastases	919

When you save this to CSV, it will look something like this:

library_id,node_id
normal,842
tumor_primary,912
tumor_metastases,919

Guiding Clustering with a Custom Newick Tree

If you define more than one group in the configuration CSV, you must also guide clustering by providing a Newick-formatted tree. For instance, if you have defined three groups in the CSV (normal, tumor_primary, and tumor_metastases), you may force the pipeline to arrange them with the normal tissue as an outgroup:

(normal,(tumor_primary,tumor_metastases));

cellranger-dna reanalyze requires a binary tree structure for any Newick input file. Polytomies (nodes with more than two children) are not supported.
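Both requirements (a binary tree, and leaves that correspond to rows of the configuration CSV) can be checked before a run. The following is a rough sketch under stated assumptions: it handles plain Newick without quoted labels or labeled internal nodes, and all function names are hypothetical helpers, not part of Cell Ranger DNA.

```shell
# Succeeds only if every parenthesized group has exactly two children.
newick_is_binary() {
  tr -d ' \t\r\n' < "$1" | awk '
    { s = s $0 }
    END {
      depth = 0; groups = 0
      for (i = 1; i <= length(s); i++) {
        c = substr(s, i, 1)
        if (c == "(") { depth++; commas[depth] = 0; groups++ }
        else if (c == ",") { commas[depth]++ }
        else if (c == ")") {
          # Exactly one comma inside a group means exactly two children.
          if (depth < 1 || commas[depth] != 1) exit 1
          depth--
        }
      }
      if (depth != 0 || groups == 0) exit 1
    }'
}

# Prints leaf names, one per line, sorted (branch lengths are stripped).
newick_leaves() {
  tr -d ' \t\r\n' < "$1" | sed 's/:[^,();]*//g' |
    tr '(),;' '\n\n\n\n' | sed '/^$/d' | sort
}

# Prints the first column of the configuration CSV, without the header, sorted.
csv_library_ids() {
  tr -d '\r' < "$1" | tail -n +2 | cut -d, -f1 | sed '/^$/d' | sort
}

# Usage: tree_matches_csv TREE_FILE CONFIG_CSV
tree_matches_csv() {
  newick_is_binary "$1" &&
    [ "$(newick_leaves "$1")" = "$(csv_library_ids "$2")" ]
}
```

If the check passes, the tree file can be supplied with the --tree argument listed under Command Line Interface.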

Common Use Cases

These examples illustrate how you may use cellranger-dna reanalyze in some common situations.

1. Impose Upper or Lower Limits on the Copy Number Variation Analysis

When the outputs of a cellranger-dna cnv run suggest that an upper or lower limit should be imposed on the average ploidy, cellranger-dna reanalyze is the most suitable way to apply it, because it avoids the significant computational overhead of the read processing stages of cellranger-dna cnv.
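As an illustration, the command from the Command Line Interface section can be extended with the two ploidy arguments. This is a sketch only: the run ID and the bounds of 1.9 and 2.1 are illustrative values, not recommendations, and the `run` wrapper is a hypothetical convenience that prints the command instead of executing it while DRY_RUN is 1.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

DRY_RUN=1   # set to 0 to launch the pipeline for real

# The run ID and the ploidy bounds below are illustrative values only.
run cellranger-dna reanalyze --id=AGG123_ploidy \
    --cnv-data=AGG123/outs/cnv_data.h5 \
    --csv=AGG123_reanalysis.csv \
    --reference=/home/jdoe/refs/GRCh37 \
    --soft-min-avg-ploidy=1.9 \
    --soft-max-avg-ploidy=2.1
```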

2. Omit Replicating or Noisy Cells from the Analysis

You may wish to omit noisy or replicating cells from the outputs of the analysis. These cells can be identified from the per_cell_summary_metrics.csv or from exploration in the Loupe scDNA Browser. From these sources, a barcodes.csv file (with one cell barcode per line) may be constructed containing only the cell barcodes of interest, which when employed with cellranger-dna reanalyze will produce outputs with only those cells.
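One possible way to build such a barcodes file from per_cell_summary_metrics.csv is sketched below. It assumes the file has a header row with columns named `barcode` and `is_noisy`, and that `is_noisy` is encoded as 0 or 1; check the header of your own file before relying on this. The function name `quiet_barcodes` and the group name `quiet` are hypothetical.

```shell
# Hypothetical helper: print barcodes of cells not flagged as noisy.
# Columns are looked up by header name, so their order does not matter.
quiet_barcodes() {
  awk -F, '
    { sub(/\r$/, "") }                      # tolerate Windows line endings
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i
      if (!("barcode" in col) || !("is_noisy" in col)) exit 1
      next
    }
    $(col["is_noisy"]) == 0 { print $(col["barcode"]) }
  ' "$1"
}

# Example: write the barcodes file and a one-group configuration CSV.
# quiet_barcodes AGG123/outs/per_cell_summary_metrics.csv > quiet_barcodes.csv
# printf 'library_id,barcodes_csv\nquiet,%s\n' "$PWD/quiet_barcodes.csv" > quiet.csv
```

Replicating cells are not covered by this sketch; they would need to be identified separately, for example in Loupe scDNA Browser.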

3. Impose a Structure on the Clustering

As shown in the example above, you may impose a self-chosen tree structure on the clustering using a Newick-formatted tree.
