10x Genomics
Chromium Single Cell Gene Expression

Cell Ranger 7.1, printed on 03/29/2025

Running cellranger aggr

In this tutorial, you will learn how to:

Get data
Create aggregation CSV
Set up the command for cellranger aggr
Run cellranger aggr
Explore the output of cellranger aggr

The cellranger aggr pipeline is optional. It aggregates, or combines, the outputs of two or more cellranger count runs. In experiments that involve multiple samples and multiple 10x Chromium GEM wells, each library must first be processed in a separate run of cellranger count.

To compare samples to each other for differential expression analysis, cellranger aggr combines the output files from each run of cellranger count to produce a single feature-barcode matrix and a .cloupe file for visualizing with Loupe Browser.

This tutorial is written with Cell Ranger v6.1.2. Commands are compatible with other versions of Cell Ranger, unless noted otherwise.

Get data

This tutorial uses two publicly available molecule_info.h5 files: pbmc_1k_v3_molecule_info.h5 and pbmc_10k_v3_molecule_info.h5.

Start by making a directory to run the aggr pipeline in:

mkdir run_cellranger_aggr
cd run_cellranger_aggr

Next, download the data files.

wget https://cf.10xgenomics.com/samples/cell-exp/3.0.0/pbmc_1k_v3/pbmc_1k_v3_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/3.0.0/pbmc_10k_v3/pbmc_10k_v3_molecule_info.h5

These files are small, less than 1 GB each, and usually take less than one minute to download.
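
Before moving on, it can help to confirm that both downloads finished. The following is a minimal sketch, not part of Cell Ranger: check_downloads is a hypothetical helper name, and the check only tests that each file exists and is not empty. It does not verify the file contents.

```shell
# Hypothetical helper (not a Cell Ranger command): report whether each
# file given as an argument exists and has a size greater than zero.
check_downloads() {
    dl_status=0
    for f in "$@"; do
        if [ -s "$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f" >&2
            dl_status=1
        fi
    done
    return "$dl_status"
}

# Usage, from inside run_cellranger_aggr:
# check_downloads pbmc_1k_v3_molecule_info.h5 pbmc_10k_v3_molecule_info.h5
```

A non-zero return status means at least one file should be downloaded again.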

Create aggregation CSV

The next step is to build the aggregation CSV file. CSV stands for comma-separated values. For specific instructions on creating this CSV, see the cellranger aggr page.

The CSV file is a two-column file. The first column is for the sample id. This id name can be anything you want. Choose descriptive ids since they are used later in the analysis. The second column contains the paths to the molecule_info.h5 output files from the cellranger count pipelines.

For Cell Ranger v6.0+ and Loupe Browser v5.1.0+, the libraries CSV header should be sample_id,molecule_h5. For prior software versions, it should be library_id,molecule_h5.

From the same directory where the HDF5 files were downloaded, use the pwd command to print out the path:

pwd

The output is similar to the following:

path/to/run_cellranger_aggr

Copy the path to make the CSV file. Use the text editor of your choice to make this file. This example uses nano.

nano pbmc_aggr.csv

This opens the nano text editor.

sample_id,molecule_h5
1k_pbmcs,path/to/run_cellranger_aggr/pbmc_1k_v3_molecule_info.h5
10k_pbmcs,path/to/run_cellranger_aggr/pbmc_10k_v3_molecule_info.h5

Paste the text above into the editor. Edit the path/to/ part for each molecule_info.h5 file so it matches the path of the file on your system.

Exit the nano text editor by pressing Ctrl+X, and then press Y for "Yes" to save the file.

Save modified buffer (ANSWERING "No" WILL DESTROY CHANGES) ?
 Y Yes
 N No           ^C Cancel

Nano then asks you:

File Name to Write: pbmc_aggr.csv

Press the Enter key to confirm the filename and save the file. You are then returned to the command prompt.

We have now saved our Linux-formatted CSV file and exited out of the nano text editor.
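
As an alternative to editing by hand, the same CSV can be written from the command line. This is a minimal sketch under stated assumptions: write_aggr_csv is a hypothetical helper name, the sample ids and file names are the ones used in this tutorial, and both molecule_info.h5 files are assumed to sit in the directory passed as the first argument.

```shell
# Hypothetical helper (not a Cell Ranger command): write the two-sample
# aggregation CSV used in this tutorial.
#   $1 = directory that holds the molecule_info.h5 files
#   $2 = path of the CSV file to create
write_aggr_csv() {
    # Resolve the directory to an absolute path, as pwd would print it.
    data_dir=$(cd "$1" && pwd) || return 1
    {
        printf 'sample_id,molecule_h5\n'
        printf '1k_pbmcs,%s/pbmc_1k_v3_molecule_info.h5\n' "$data_dir"
        printf '10k_pbmcs,%s/pbmc_10k_v3_molecule_info.h5\n' "$data_dir"
    } > "$2"
}

# Usage, from inside run_cellranger_aggr:
# write_aggr_csv . pbmc_aggr.csv && cat pbmc_aggr.csv
```

Because the path is taken from pwd, there is no path/to/ placeholder to edit afterwards.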

Set up the command for cellranger aggr

Run cellranger aggr with the --help option to print the usage statement and view the input requirements.

cellranger aggr --help

This command prints the following:

cellranger-aggr
Aggregate data from multiple Cell Ranger runs
 
USAGE:
    cellranger aggr [FLAGS] [OPTIONS] --id <ID> --csv <CSV>
 
FLAGS:
        --nosecondary    Disable secondary analysis, e.g. clustering
        --dry            Do not execute the pipeline. Generate a pipeline invocation (.mro) file and stop
        --disable-ui     Do not serve the web UI
        --noexit         Keep web UI running after pipestance completes or fails
        --nopreflight    Skip preflight checks
    -h, --help           Prints help information
 
OPTIONS:
        --id <ID>           A unique run id and output folder name [a-zA-Z0-9_-]+
...

This pipeline has two required inputs:

--id is used to name the output directory that the pipeline runs in.
--csv takes a CSV file that points to the outputs from the cellranger count pipeline.
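
Since --csv only points to files, a wrong path in the second column is an easy mistake to make. The sketch below checks the CSV before the pipeline is launched. It is an illustration, not part of Cell Ranger: check_aggr_csv is a hypothetical helper name, and the check assumes a header line followed by rows whose second column is a file path without embedded commas.

```shell
# Hypothetical helper (not a Cell Ranger command): verify that every path
# in the second column of the aggregation CSV points to an existing file.
check_aggr_csv() {
    csv_missing=0
    # Skip the header, drop any Windows carriage returns, then read the
    # sample id and path from each row; extra columns go into csv_rest.
    while IFS=, read -r csv_sample csv_path csv_rest; do
        [ -n "$csv_sample" ] || continue
        if [ ! -f "$csv_path" ]; then
            echo "not found for $csv_sample: $csv_path" >&2
            csv_missing=1
        fi
    done <<EOF
$(tail -n +2 "$1" | tr -d '\r')
EOF
    return "$csv_missing"
}

# Usage:
# check_aggr_csv pbmc_aggr.csv && echo "all paths found"
```

The here-document keeps the loop in the current shell, so the status set inside the loop is still available when the function returns.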

Run cellranger aggr

Next, build the command line and run it.

cellranger aggr --id=1k_10k_pbmc_aggr --csv=pbmc_aggr.csv

The output is similar to the following:

2021-10-28 19:59:07 [perform] Serializing pipestance performance data.
Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
 
2021-10-28 19:59:13 Shutting down.

Explore the output of cellranger aggr

Just like the other pipelines, when you see “Pipestance completed successfully!” the job is done, and the pipeline outputs are in the pipestance directory in the outs/ folder. List the contents of this directory:

ls -1 1k_10k_pbmc_aggr/outs/

The directory contents, expanded here as a tree, are similar to the following:

├── aggregation.csv
├── count
│   ├── analysis
│   │   ├── clustering
│   │   ├── diffexp
│   │   ├── pca
│   │   ├── tsne
│   │   └── umap
│   ├── cloupe.cloupe
│   ├── filtered_feature_bc_matrix
│   │   ├── barcodes.tsv.gz
│   │   ├── features.tsv.gz
│   │   └── matrix.mtx.gz
│   ├── filtered_feature_bc_matrix.h5
│   └── summary.json
└── web_summary.html

The outputs are similar to those from the cellranger count pipeline, with the exception of the BAM files and molecule_info.h5 files. More information about outputs is available in the Understanding Outputs section.
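
One quick check of the aggregated matrix is to count how many barcodes came from each input run. The sketch below assumes that each barcode in the aggregated barcodes.tsv.gz ends with a numeric suffix (for example -1 or -2) that follows the order of the rows in the aggregation CSV. count_cells_per_sample is a hypothetical helper name, not a Cell Ranger command.

```shell
# Hypothetical helper (not a Cell Ranger command): count barcodes per
# numeric suffix in a gzipped barcodes file.
count_cells_per_sample() {
    # Split each barcode on "-" and tally the last field (the suffix).
    gzip -dc "$1" |
        awk -F- '{ n[$NF]++ } END { for (s in n) print s, n[s] }' |
        sort -n
}

# Usage:
# count_cells_per_sample 1k_10k_pbmc_aggr/outs/count/filtered_feature_bc_matrix/barcodes.tsv.gz
```

Each output line gives a suffix followed by the number of barcodes that carry it. Under the assumption above, suffix 1 corresponds to 1k_pbmcs and suffix 2 to 10k_pbmcs in this tutorial.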
