Cell Ranger 4.0, printed on 12/04/2024
In this tutorial, you will learn how to rerun secondary analysis with custom parameters using cellranger reanalyze.
The cellranger reanalyze pipeline is optional. It lets you rerun the secondary analysis for a completed cellranger count or aggr run with different parameters. It is faster than rerunning the whole cellranger count pipeline because it starts from the feature-barcode matrix rather than from FASTQs, so all of the alignment and UMI counting is already done.
Start by making a directory.
mkdir ~/yard/run_cellranger_reanalyze
cd ~/yard/run_cellranger_reanalyze
Next, run the cellranger reanalyze command with --help to see the usage and a full list of modifiable parameters.
cellranger reanalyze --help
The output looks similar to this:
/mnt/home/user.name/yard/apps/cellranger-3.1.0/cellranger-cs/3.1.0/bin
cellranger reanalyze (3.1.0)
Copyright (c) 2019 10x Genomics, Inc. All rights reserved.
-------------------------------------------------------------------------------
...
The commands below should be preceded by 'cellranger':

Usage:
    reanalyze --id=ID --matrix=MATRIX_H5 [options]
    reanalyze [options]
    reanalyze -h | --help | --version
From this we can see that we need a matrix H5 file and a parameters CSV file. All of the modifiable parameters are listed on the Customized Secondary Analysis using cellranger reanalyze page.
One of the more common reanalysis combinations is to increase the number of principal components used in clustering while also increasing the maximum number of clusters used in the k-means algorithm. If we use one of the publicly available PBMC data sets, we might want to increase the number of principal components and clusters to see if we can better separate out some of the rarer T-cell populations, such as T-regs. With this as our aim, we will use a 10,000 PBMC data set. For this run we only need to download the matrix in H5 format.
wget https://cf.10xgenomics.com/samples/cell-exp/3.0.0/pbmc_10k_v3/pbmc_10k_v3_filtered_feature_bc_matrix.h5
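Before moving on, it can be worth confirming that the download really is an HDF5 file and not, for example, an HTML error page saved under the same name. The sketch below is one way to do that. The function name is_hdf5 is hypothetical and not part of Cell Ranger. It assumes the file starts with the standard 8-byte HDF5 signature (89 48 44 46 0d 0a 1a 0a) at offset 0, which is the usual case. It does not verify that the file is complete.

```shell
# Hypothetical helper (not part of Cell Ranger): succeed if the file
# starts with the 8-byte HDF5 signature at offset 0.
is_hdf5() {
    # Read the first 8 bytes and render them as one lowercase hex string.
    sig=$(head -c 8 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "894844460d0a1a0a" ]
}

# Example usage:
# is_hdf5 pbmc_10k_v3_filtered_feature_bc_matrix.h5 && echo "looks like HDF5"
```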
Next, make the parameters CSV file. Here we use nano, but you can use any text editor.
nano reanalyze_10k_pbmcs.csv
Paste the following into your text file:
num_principal_comps,14
max_clusters,15
Save the file as reanalyze_10k_pbmcs.csv.
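If you prefer not to open an interactive editor, the same file can be written directly from the shell. This is a minimal sketch using a here-document. The file name and the two parameter values are the ones used in this tutorial.

```shell
# Write the two-line parameters CSV without opening an editor.
# The quoted 'EOF' keeps the shell from expanding anything in the body.
cat > reanalyze_10k_pbmcs.csv <<'EOF'
num_principal_comps,14
max_clusters,15
EOF

# Print the file to confirm its contents.
cat reanalyze_10k_pbmcs.csv
```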
Next you build the command.
cellranger reanalyze --id=10k_pbmc_reanalyze_pc_clust --matrix=pbmc_10k_v3_filtered_feature_bc_matrix.h5 --params=reanalyze_10k_pbmcs.csv
The output looks similar to this:
/mnt/home/user.name/yard/apps/cellranger-3.1.0/cellranger-cs/3.1.0/bin
cellranger reanalyze (3.1.0)
Copyright (c) 2019 10x Genomics, Inc. All rights reserved.
-------------------------------------------------------------------------------
...
Pipestance completed successfully!

2019-09-13 18:51:37 Shutting down.
Saving pipestance info to 10k_pbmc_reanalyze_pc_clust/10k_pbmc_reanalyze_pc_clust.mri.tgz
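When the run is launched from a script, you may want to test for the success message rather than read the output by eye. The sketch below assumes you captured the console output to a file yourself. The file name reanalyze.log and the helper name run_succeeded are hypothetical and not part of Cell Ranger.

```shell
# Hypothetical helper: succeed if a captured log contains the success
# message that the pipeline prints at the end of a run.
run_succeeded() {
    grep -q 'Pipestance completed successfully!' "$1"
}

# Example usage (reanalyze.log is an assumed name, captured with tee):
# cellranger reanalyze --id=10k_pbmc_reanalyze_pc_clust \
#     --matrix=pbmc_10k_v3_filtered_feature_bc_matrix.h5 \
#     --params=reanalyze_10k_pbmcs.csv 2>&1 | tee reanalyze.log
# run_succeeded reanalyze.log && echo "run finished"
```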
Now that the cellranger reanalyze pipeline is finished, look at the output.
ls -1 10k_pbmc_reanalyze_pc_clust/outs/
The output looks similar to this:
analysis
cloupe.cloupe
params.csv
web_summary.html
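The same check can be scripted. The sketch below looks for the four entries shown in the listing above and reports any that are absent. The helper name check_outs is hypothetical, and the list of expected names is taken only from this tutorial's output, so other versions or run types may differ.

```shell
# Hypothetical helper: report any of the four expected entries that are
# missing from an outs/ directory; succeed only if all are present.
check_outs() {
    status=0
    for name in analysis cloupe.cloupe params.csv web_summary.html; do
        if [ ! -e "$1/$name" ]; then
            echo "missing: $name" >&2
            status=1
        fi
    done
    return $status
}

# Example usage:
# check_outs 10k_pbmc_reanalyze_pc_clust/outs && echo "all outputs present"
```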
By listing the contents of the clustering folder inside the analysis folder, you can see that the pipeline produced k-means results for every cluster count from 2 to 15.
ls -1 10k_pbmc_reanalyze_pc_clust/outs/analysis/clustering/
graphclust
kmeans_10_clusters
kmeans_11_clusters
kmeans_12_clusters
kmeans_13_clusters
kmeans_14_clusters
kmeans_15_clusters
kmeans_2_clusters
kmeans_3_clusters
kmeans_4_clusters
kmeans_5_clusters
kmeans_6_clusters
kmeans_7_clusters
kmeans_8_clusters
kmeans_9_clusters
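Plain ls sorts names lexically, which is why kmeans_10_clusters appears before kmeans_2_clusters in the listing. As a small sketch, the listing can instead be sorted on its numeric field, and the largest k extracted to confirm it matches max_clusters. The helper name max_kmeans_k is hypothetical and not part of Cell Ranger.

```shell
# Hypothetical helper: print the largest k among the kmeans_<k>_clusters
# entries in the given clustering directory.
max_kmeans_k() {
    ls -1 "$1" \
        | sed -n 's/^kmeans_\([0-9][0-9]*\)_clusters$/\1/p' \
        | sort -n \
        | tail -n 1
}

# Example usage:
# List entries in numeric order of k (graphclust has no number and sorts first).
# ls -1 10k_pbmc_reanalyze_pc_clust/outs/analysis/clustering/ | sort -t_ -k2,2n
# Print the largest k.
# max_kmeans_k 10k_pbmc_reanalyze_pc_clust/outs/analysis/clustering/
```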
From here, explore the data further using the Loupe Browser or a number of other publicly available tools.