
# Running cellranger aggr


The cellranger aggr pipeline is optional. It aggregates, or combines, the outputs of two or more cellranger count runs. In experiments with multiple samples and multiple 10x Chromium GEM wells, each library must be processed in a separate run of cellranger count.

To compare samples to each other for differential expression analysis, cellranger aggr combines the output files from each run of cellranger count to produce a single feature-barcode matrix and a .cloupe file for visualizing with Loupe Browser.

# Overview of cellranger aggr

First, run the cellranger aggr pipeline with the --help option to print the usage statement and view the input requirements.

cellranger aggr --help


The output is similar to the following:

/mnt/home/user.name/yard/apps/cellranger-3.1.0/cellranger-cs/3.1.0/bin
cellranger aggr (3.1.0)
-------------------------------------------------------------------------------
...
The commands below should be preceded by 'cellranger':
Usage:
aggr
--id=ID
--csv=CSV
[options]
aggr   [options]
aggr -h | --help | --version
...


This pipeline has two inputs:

• --id is used to name the output directory that the pipeline runs in.
• --csv takes a CSV file that points to the outputs from the cellranger count pipeline.

CSV stands for comma-separated values. For specific instructions on creating this CSV, see the cellranger aggr page.

The CSV file has two columns. The first column holds the library ID. This ID can be anything you want, but choose descriptive IDs because they are used later in the analysis. The second column contains the paths to the molecule_info.h5 output files from the cellranger count pipelines.

# Get Data

This tutorial uses two publicly available molecule_info.h5 files, from the 1k and 10k PBMC v3 datasets.

Start by making a directory to run the aggr pipeline in, then download both files into it with wget.

mkdir ~/yard/run_cellranger_aggr
cd ~/yard/run_cellranger_aggr


wget https://cf.10xgenomics.com/samples/cell-exp/3.0.0/pbmc_1k_v3/pbmc_1k_v3_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/3.0.0/pbmc_10k_v3/pbmc_10k_v3_molecule_info.h5


These are small files, less than 1 GB each, and usually take less than one minute to download.

The next step is to build the CSV file, which needs the path to the HDF5 files just downloaded. From the directory where the files were downloaded, use the pwd command to find the path.

pwd


The output is similar to the following:

/mnt/home/user.name/yard/run_cellranger_aggr


Copy the path to make the CSV file. Use the text editor of your choice to make this file. This example uses nano.

nano pbmc_aggr.csv


This opens nano, a command-line text editor for Linux. To return to the command prompt, you must exit the editor.

library_id,molecule_h5
1k_pbmcs,/mnt/home/user.name/yard/run_cellranger_aggr/pbmc_1k_v3_molecule_info.h5
10k_pbmcs,/mnt/home/user.name/yard/run_cellranger_aggr/pbmc_10k_v3_molecule_info.h5


Paste the text above into the editor. Edit the path to each molecule_info.h5 file so that it matches the path copied above and points to the location of the file on your system.

Exit the nano text editor by pressing Ctrl-x and answering Y to save the file.

Save modified buffer (ANSWERING "No" WILL DESTROY CHANGES) ?
Y Yes
N No           ^C Cancel


File Name to Write: pbmc_aggr.csv


Press Enter to keep this filename and save the file. This exits nano and returns you to the command prompt.

The saved file, pbmc_aggr.csv, is a Linux-formatted CSV file.
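
As an alternative to a text editor, the same CSV can be written directly from the shell. The following is a minimal sketch, assuming both molecule_info.h5 files sit in one directory. The function name write_aggr_csv and its two arguments are hypothetical and not part of Cell Ranger.

```shell
# Hypothetical helper (not part of Cell Ranger): write the two-column
# aggregation CSV for the two PBMC files used in this tutorial.
# $1 = directory holding the molecule_info.h5 files (default: current directory)
# $2 = name of the CSV file to create (default: pbmc_aggr.csv)
write_aggr_csv() {
    local data_dir="${1:-$PWD}"
    local csv="${2:-pbmc_aggr.csv}"
    {
        # Header row, then one row per library: library_id,molecule_h5
        printf 'library_id,molecule_h5\n'
        printf '1k_pbmcs,%s/pbmc_1k_v3_molecule_info.h5\n' "$data_dir"
        printf '10k_pbmcs,%s/pbmc_10k_v3_molecule_info.h5\n' "$data_dir"
    } > "$csv"
}
```

Calling write_aggr_csv with no arguments from the download directory should produce a pbmc_aggr.csv equivalent to the one created with nano, with the paths filled in from the current directory.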

# Set Up the Command for cellranger aggr

Next, build the command line and run it.

cellranger aggr --id=1k_10K_pbmc_aggr --csv=pbmc_aggr.csv
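
Before launching the run, it can be useful to confirm that the CSV has the expected header and that every path in it points to an existing file. The following is a minimal sketch of such a check, assuming the two-column layout used in this tutorial. The function name check_aggr_csv is hypothetical and not part of Cell Ranger.

```shell
# Hypothetical pre-flight check (not part of Cell Ranger).
# $1 = path to the aggregation CSV.
# Returns 0 if the header matches and every listed file exists, 1 otherwise.
check_aggr_csv() {
    local csv="$1"
    local header missing
    # Compare the first line with the header used in this tutorial,
    # dropping any Windows carriage return first.
    header=$(head -n 1 "$csv" | tr -d '\r')
    if [ "$header" != "library_id,molecule_h5" ]; then
        echo "unexpected header: $header" >&2
        return 1
    fi
    # Collect every path in the second column that is not an existing file.
    missing=$(tail -n +2 "$csv" | tr -d '\r' |
        while IFS=, read -r library_id molecule_h5 || [ -n "$library_id" ]; do
            [ -n "$library_id" ] || continue    # skip blank lines
            [ -f "$molecule_h5" ] || printf '%s\n' "$molecule_h5"
        done)
    if [ -n "$missing" ]; then
        echo "missing files:" >&2
        printf '%s\n' "$missing" >&2
        return 1
    fi
    return 0
}
```

For example, check_aggr_csv pbmc_aggr.csv && echo "CSV looks OK" prints the message only when both checks pass.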


# Run cellranger aggr

The output is similar to the following:

/mnt/home/user.name/yard/apps/cellranger-3.1.0/cellranger-cs/3.1.0/bin
cellranger aggr (3.1.0)
-------------------------------------------------------------------------------

...

Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!

2019-09-13 18:22:07 Shutting down.
Saving pipestance info to 1k_10K_pbmc_aggr/1k_10K_pbmc_aggr.mri.tgz
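
If the terminal output was saved to a file, for example by piping the command through tee, the success message can be checked without scrolling back. The following is a minimal sketch. The function name pipestance_succeeded and the file name aggr_run.log are hypothetical and not part of Cell Ranger.

```shell
# Hypothetical helper (not part of Cell Ranger): report whether a saved copy
# of the terminal output contains the success message shown above.
# $1 = path to a text file holding the captured output.
# Returns 0 if the message is present, non-zero otherwise.
pipestance_succeeded() {
    grep -qF 'Pipestance completed successfully!' "$1"
}

# Example usage (aggr_run.log is a hypothetical file name):
#   cellranger aggr --id=1k_10K_pbmc_aggr --csv=pbmc_aggr.csv 2>&1 | tee aggr_run.log
#   pipestance_succeeded aggr_run.log && echo "aggr finished"
```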


# Explore the Output of cellranger aggr

Just like the other pipelines, when you see “Pipestance completed successfully!” the job is done, and the pipeline outputs are in the pipestance directory in the outs folder. List the contents of this directory:

ls -1 1k_10K_pbmc_aggr/outs/


The output is similar to the following:

aggregation.csv
analysis
cloupe.cloupe
filtered_feature_bc_matrix
filtered_feature_bc_matrix.h5
raw_feature_bc_matrix
raw_feature_bc_matrix.h5
summary.json
web_summary.html


The outputs are similar to those from the cellranger count pipeline, except that cellranger aggr does not produce BAM files or a molecule_info.h5 file.
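
To confirm that a run produced everything in the listing above, the outs folder can be checked entry by entry. The following is a minimal sketch, assuming the file names shown in this tutorial for Cell Ranger 3.1.0; other versions may produce a different set. The function name check_aggr_outs is hypothetical and not part of Cell Ranger.

```shell
# Hypothetical helper (not part of Cell Ranger): confirm that an aggr outs
# folder contains every entry from the listing in this tutorial.
# $1 = path to the outs folder. Returns 0 if nothing is missing, 1 otherwise.
check_aggr_outs() {
    local outs="$1"
    local status=0
    local name
    for name in aggregation.csv analysis cloupe.cloupe \
                filtered_feature_bc_matrix filtered_feature_bc_matrix.h5 \
                raw_feature_bc_matrix raw_feature_bc_matrix.h5 \
                summary.json web_summary.html; do
        # -e accepts both regular files and directories
        if [ ! -e "$outs/$name" ]; then
            echo "missing: $outs/$name" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, check_aggr_outs 1k_10K_pbmc_aggr/outs prints the name of each missing entry and returns a non-zero status if any are absent.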