
Cell Ranger


10x Genomics
Chromium Single Cell Gene Expression

Running cellranger multi

This tutorial introduces the cellranger multi pipeline for Cell Ranger 6.0+ (we recommend completing the other tutorials in this series first).

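In this tutorial, you will learn how to download the example dataset, create a multi config CSV, set up and run cellranger multi, and explore its output.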

Note: This tutorial is written with Cell Ranger v6.1.2. Commands are compatible with previous versions, unless noted otherwise.

Get data

In this tutorial, we will analyze a Cell Multiplexing dataset that consists of two cell lines, Jurkat and Raji, multiplexed at equal proportions with one CMO per cell line, resulting in a pooled sample labeled with two CMOs. Gene Expression (GEX) and Cell Multiplexing libraries were prepared with the Chromium Next GEM Single Cell 3ʹ Reagent Kits v3.1 (Dual Index) with Feature Barcode technology.

Use wget to download the FASTQ data (about 44 GB):

wget https://cf.10xgenomics.com/samples/cell-exp/6.0.0/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_Multiplex/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_Multiplex_fastqs.tar

Download and untar the 2020 reference, if you have not already done so in the count tutorial:

wget https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-2020-A.tar.gz
tar -xf refdata-gex-GRCh38-2020-A.tar.gz

Untar the FASTQ files:

tar -xf SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_Multiplex_fastqs.tar

Navigate to the FASTQ files and observe their filenames. One directory contains the FASTQ files for the GEX library. Two directories contain FASTQ files for the Cell Multiplexing Capture library, because the same physical library was sequenced twice for this dataset: first for a preliminary sample quality check, and then for the actual analysis.

The simplest scenario is to analyze one Gene Expression and one Multiplexing Capture library, which we will demonstrate using the FASTQ files in the "..._1_gex" and "..._1_multiplexing_capture" directories.

├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L001_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L001_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L001_R1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L001_R2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L002_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L002_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L002_R1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L002_R2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L003_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L003_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L003_R1_001.fastq.gz
│   └── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L003_R2_001.fastq.gz
├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L001_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L001_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L001_R1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L001_R2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L002_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L002_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L002_R1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L002_R2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L003_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L003_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L003_R1_001.fastq.gz
│   └── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L003_R2_001.fastq.gz
└── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture
    ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture_S1_L001_I1_001.fastq.gz
    ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture_S1_L001_I2_001.fastq.gz
    ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture_S1_L001_R1_001.fastq.gz
    └── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture_S1_L001_R2_001.fastq.gz
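To confirm that the archive extracted completely, you can count the FASTQ files in each library directory and compare the numbers with the listing above (12, 12, and 4 files). The following is a minimal sketch; the function name count_fastqs_per_dir is our own and is not part of Cell Ranger.

```shell
# Count the FASTQ files in each library directory under a parent directory.
# Usage: count_fastqs_per_dir /path/to/extracted_fastq_parent
count_fastqs_per_dir() {
    parent=$1
    for dir in "$parent"/*/; do
        # Skip the unexpanded glob when the parent has no subdirectories.
        [ -d "$dir" ] || continue
        # tr strips the padding that some wc implementations add.
        n=$(find "$dir" -maxdepth 1 -type f -name '*.fastq.gz' | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$(basename "$dir")" "$n"
    done
}
```

Call it with the directory that contains the three library directories. A count lower than expected usually means the download or the untar step was interrupted.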

Create multi config CSV

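The cellranger multi pipeline has two required inputs: --id, a unique run ID that is also used as the output folder name, and --csv, the path to the multi config CSV that lists the input libraries and analysis parameters.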

In this tutorial, you only need to edit a few lines in a pre-made CSV using a text editor of your choice, in this example with nano:

nano SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K.csv

Copy and paste the code block below into your text editor. Note that there are three sections: gene-expression, libraries, and samples. Important: before saving the CSV file, replace the /path/to/ text with the full paths to the reference (in the [gene-expression] section) and to the FASTQ files (in the [libraries] section) that you downloaded.

For a full list of the different sections, fields, and optional parameters, please see the running cellranger multi page.

[gene-expression]
reference,/path/to/refdata-gex-GRCh38-2020-A

[libraries]
fastq_id,fastqs,lanes,feature_types
SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex,/path/to/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex,any,Gene Expression
SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture,/path/to/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture,any,Multiplexing Capture

[samples]
sample_id,cmo_ids,description
Jurkat,CMO301,Jurkat
Raji,CMO302,Raji
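If you prefer not to edit the file by hand, the config can also be written from the shell with the paths substituted in. This is a sketch only: write_multi_config is a helper name we made up, and it assumes the three-section layout described above, with the reference given by a `reference` row in the [gene-expression] section.

```shell
# Write the multi config CSV for this tutorial without opening an editor.
# Arguments: <reference_dir> <fastq_parent_dir> <output_csv>
write_multi_config() {
    ref_dir=$1
    fastq_dir=$2
    out_csv=$3
    prefix=SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K
    # The unquoted EOF delimiter lets the shell expand the variables below.
    cat > "$out_csv" <<EOF
[gene-expression]
reference,${ref_dir}

[libraries]
fastq_id,fastqs,lanes,feature_types
${prefix}_1_gex,${fastq_dir}/${prefix}_1_gex,any,Gene Expression
${prefix}_1_multiplexing_capture,${fastq_dir}/${prefix}_1_multiplexing_capture,any,Multiplexing Capture

[samples]
sample_id,cmo_ids,description
Jurkat,CMO301,Jurkat
Raji,CMO302,Raji
EOF
}
```

For example, `write_multi_config /path/to/refdata-gex-GRCh38-2020-A /path/to/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K.csv` produces the file used in the rest of this tutorial, once the /path/to/ placeholders are replaced with real full paths.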

Set up the command for cellranger multi

Next, run the cellranger multi command with --help to get the usage and a full list of modifiable parameters.

cellranger multi --help

The output looks similar to this:

Analyze multiplexed data or combined gene expression/immune profiling/feature
barcode data
    cellranger multi [FLAGS] [OPTIONS] --id <ID> --csv <CSV>
        --dry            Do not execute the pipeline. Generate a pipeline
                         invocation (.mro) file and stop
        --disable-ui     Do not serve the web UI
        --noexit         Keep web UI running after pipestance completes or fails
        --nopreflight    Skip preflight checks
    -h, --help           Prints help information
        --id <ID>               A unique run id and output folder name
                                [a-zA-Z0-9_-]+
        --description <TEXT>    Sample description to embed in output files
                                [default: ]
        --csv <CSV>             Path of CSV file enumerating input libraries and
                                analysis parameters
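Because the file passed to --csv points at other files on disk, a quick check that every listed path exists can catch a leftover /path/to/ placeholder before the pipeline starts. The sketch below uses a made-up helper, missing_config_paths, and assumes the config has a `reference` row under [gene-expression] and the FASTQ directory in the second column of [libraries].

```shell
# Print every path listed in a multi config CSV that does not exist on disk.
# Usage: missing_config_paths SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K.csv
missing_config_paths() {
    awk -F, '
        /^\[/ { section = $0; header_seen = 0; next }    # section header line
        NF == 0 { next }                                   # skip blank lines
        section == "[gene-expression]" && $1 == "reference" { print $2 }
        section == "[libraries]" {
            if (!header_seen) { header_seen = 1; next }    # skip column names
            print $2                                       # fastqs column
        }
    ' "$1" | while IFS= read -r p; do
        [ -e "$p" ] || printf 'missing: %s\n' "$p"
    done
}
```

No output means every referenced path was found. This only checks that the paths exist; it does not replace the preflight checks that cellranger multi runs itself.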

Run cellranger multi

To run cellranger multi, enter a command such as:

cellranger multi --id=Jurkat_Raji_10K --csv=SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K.csv

Cell Ranger 6.0+ should start with a message like this:

Martian Runtime - v4.0.2-15-g22715e4
Running preflight checks (please wait)... 

Depending on your computational resources, it may take some time for the pipeline to complete. When it does, it should conclude with a message like this:

2021-02-13 15:24:18 [perform] Serializing pipestance performance data.
Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
2021-02-13 15:24:24 Shutting down.
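For long or unattended runs, it can be convenient to test for the completion message in a script rather than by eye. The sketch below assumes the terminal output was saved to a log file; Cell Ranger normally also writes a _log file inside the run directory (here Jurkat_Raji_10K/_log), but treat that location as an assumption and adjust the path to your setup. The function name pipestance_succeeded is ours.

```shell
# Return success (exit status 0) if the given log contains the completion message.
# Usage: pipestance_succeeded Jurkat_Raji_10K/_log
pipestance_succeeded() {
    grep -q 'Pipestance completed successfully!' "$1"
}

# Example: report the status of this tutorial's run, if its log is present.
if [ -f Jurkat_Raji_10K/_log ]; then
    if pipestance_succeeded Jurkat_Raji_10K/_log; then
        echo "cellranger multi finished successfully"
    else
        echo "cellranger multi has not finished, or it failed"
    fi
fi
```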

Explore the output of cellranger multi

Next, examine the output files using the tree command:

cd Jurkat_Raji_10K/outs
tree

The tree command will list 73 directories with 96 files. (If you are used to a cellranger count run, recall that multiplexing two samples doubles the per-sample outputs, and these numbers grow correspondingly as more samples are multiplexed into a single GEM well.) Additionally, some output files are general to the entire experiment rather than specific to one CMO.
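If tree is not installed on your system, find can produce a comparable summary. This is a sketch with a helper name of our own (count_tree); the totals may differ slightly from what tree reports, for example in how symbolic links are counted.

```shell
# Summarize a directory as "<N> directories, <M> files", similar to the
# last line that tree prints. The starting directory itself is not counted.
# Usage: count_tree [directory]   (defaults to the current directory)
count_tree() {
    root=${1:-.}
    dirs=$(find "$root" -mindepth 1 -type d | wc -l | tr -d ' ')
    files=$(find "$root" -mindepth 1 ! -type d | wc -l | tr -d ' ')
    printf '%s directories, %s files\n' "$dirs" "$files"
}
```

Running `count_tree .` from inside Jurkat_Raji_10K/outs gives the counts for this run.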

The first section of the outputs contains the config.csv file, a duplicate of the input config CSV (SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K.csv). The multi directory contains a count directory and a multiplexing_analysis directory:

├── config.csv
└── multi
   ├── count
   │   ├── feature_reference.csv
   │   ├── raw_cloupe.cloupe
   │   ├── raw_feature_bc_matrix
   │   │   ├── barcodes.tsv.gz
   │   │   ├── features.tsv.gz
   │   │   └── matrix.mtx.gz
   │   ├── raw_feature_bc_matrix.h5
   │   ├── raw_molecule_info.h5
   │   ├── unassigned_alignments.bam
   │   └── unassigned_alignments.bam.bai
   └── multiplexing_analysis
       ├── assignment_confidence_table.csv
       ├── cells_per_tag.json
       ├── tag_calls_per_cell.csv
       └── tag_calls_summary.csv

For more information on these files, see Cell Multiplexing Outputs.
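To get a quick feel for the tag assignments, you can tally how many barcodes received each call in tag_calls_per_cell.csv. The sketch below looks a column up by its header name, so it works on any simple CSV without quoted commas; the helper name tally_column is ours, and the column name feature_call used in the example is an assumption that you should check against the header of your own file.

```shell
# Count how often each value occurs in one named column of a CSV file.
# Usage: tally_column <file.csv> <column_name>
tally_column() {
    awk -F, -v col="$2" '
        NR == 1 {
            # Locate the requested column in the header row.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        { count[$idx]++ }
        END { for (v in count) print v "\t" count[v] }
    ' "$1" | sort
}
```

For example, `tally_column multi/multiplexing_analysis/tag_calls_per_cell.csv feature_call`, run from the outs directory, prints one line per distinct call with its barcode count.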

The per_sample_outs directory contains two directories, one for Jurkat and one for Raji. For brevity, only the Jurkat outputs are shown here.

In the Jurkat/count/analysis directory, the clustering directory contains CSV files with the results of graph-based clustering and of K-means clustering for K = 2 through 10:

└─ clustering
  ├── graphclust
  │   └── clusters.csv
  ├── kmeans_10_clusters
  │   └── clusters.csv
  ├── kmeans_2_clusters
  │   └── clusters.csv
  ├── kmeans_3_clusters
  │   └── clusters.csv
  ├── kmeans_4_clusters
  │   └── clusters.csv
  ├── kmeans_5_clusters
  │   └── clusters.csv
  ├── kmeans_6_clusters
  │   └── clusters.csv
  ├── kmeans_7_clusters
  │   └── clusters.csv
  ├── kmeans_8_clusters
  │   └── clusters.csv
  └── kmeans_9_clusters
      └── clusters.csv
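To compare the clusterings at a glance, you can print the number of distinct clusters and the number of cells in each clusters.csv file. This sketch assumes each file has one header row followed by barcode,cluster rows, with the cluster label in the second column; the helper name summarize_clusterings is ours.

```shell
# For each <clustering>/clusters.csv under a directory, print:
#   <clustering name> <TAB> <distinct clusters> <TAB> <cells>
# Usage: summarize_clusterings /path/to/analysis/clustering
summarize_clusterings() {
    for f in "$1"/*/clusters.csv; do
        [ -f "$f" ] || continue
        # Skip the header row, then count unique labels in column 2.
        k=$(tail -n +2 "$f" | cut -d, -f2 | sort -u | wc -l | tr -d ' ')
        cells=$(tail -n +2 "$f" | wc -l | tr -d ' ')
        printf '%s\t%s\t%s\n' "$(basename "$(dirname "$f")")" "$k" "$cells"
    done
}
```

Every row should report the same number of cells, since each clustering covers the same set of barcodes for the sample.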

The diffexp directory likewise contains CSV files with the results of differential expression analysis between the clusters reported above:

└─ diffexp
  ├── graphclust
  │   └── differential_expression.csv
  ├── kmeans_10_clusters
  │   └── differential_expression.csv
  ├── kmeans_2_clusters
  │   └── differential_expression.csv
  ├── kmeans_3_clusters
  │   └── differential_expression.csv
  ├── kmeans_4_clusters
  │   └── differential_expression.csv
  ├── kmeans_5_clusters
  │   └── differential_expression.csv
  ├── kmeans_6_clusters
  │   └── differential_expression.csv
  ├── kmeans_7_clusters
  │   └── differential_expression.csv
  ├── kmeans_8_clusters
  │   └── differential_expression.csv
  └── kmeans_9_clusters
      └── differential_expression.csv

The pca, tsne, and umap directories contain CSV files for dimensionality reduction:

├── pca
│   └── 10_components
│       ├── components.csv
│       ├── dispersion.csv
│       ├── features_selected.csv
│       ├── projection.csv
│       └── variance.csv
├── tsne
│   ├── 2_components
│   │   └── projection.csv
│   └── multiplexing_capture_2_components
│       └── projection.csv
└── umap
    ├── 2_components
    │   └── projection.csv
    └── multiplexing_capture_2_components
        └── projection.csv

The remaining per_sample_outs are essentially the same as from cellranger count, and are described in the Understanding Outputs multi section.

├── cloupe.cloupe
├── feature_reference.csv
├── sample_alignments.bam
├── sample_alignments.bam.bai
├── sample_barcodes.csv
├── sample_feature_bc_matrix
│   ├── barcodes.tsv.gz
│   ├── features.tsv.gz
│   └── matrix.mtx.gz
├── sample_feature_bc_matrix.h5
├── sample_molecule_info.h5
├── metrics_summary.csv
└── web_summary.html
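A simple sanity check on the per-sample matrices is to count the barcodes assigned to each sample, since barcodes.tsv.gz holds one barcode per line. The sketch below takes the file path as an argument so that it makes no assumption about where the matrix directory sits; the helper name count_barcodes is ours.

```shell
# Print the number of barcodes (lines) in a gzip-compressed barcodes file.
# Usage: count_barcodes /path/to/sample_feature_bc_matrix/barcodes.tsv.gz
count_barcodes() {
    gzip -cd "$1" | wc -l | tr -d ' '
}

# Example: report every per-sample barcodes file found below per_sample_outs.
if [ -d per_sample_outs ]; then
    find per_sample_outs -name barcodes.tsv.gz | sort | while IFS= read -r f; do
        printf '%s\t%s\n' "$f" "$(count_barcodes "$f")"
    done
fi
```

With two cell lines multiplexed at equal proportions, the two samples would be expected to have counts of a similar order of magnitude.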
