Cell Ranger 2.1, printed on 11/23/2024
As mentioned in the workflow overview, Cell Ranger can be used to analyze multiple libraries. Depending on your exact scenario, the approach may be different. Here are three circumstances, from most to least common:
You created one library, but sequenced it more than once. This might have been to increase coverage depth, or for some other reason. You might have sequenced the library across different lanes of the same flowcell, or across multiple flowcells. In any of these cases, provided that all the data comes from a single library, you can analyze these runs as a single sample using cellranger vdj, as described in Specifying Input FASTQs.
You created distinct libraries. If you have two libraries of different types, for instance one that has been enriched for T cells and one for B cells, you can analyze them separately and then compare the output, for instance by opening them both in Loupe VDJ Browser together. This does not require doing anything different in the Cell Ranger pipeline; each library will be analyzed separately.
You created multiple compatible libraries. If you need to analyze these together, rather than analyzing them separately and comparing the output, you can do that. It requires learning a little more about how Cell Ranger works, and writing a customized pipeline configuration file, called an MRO. This page covers this scenario.
Cell Ranger uses a pipeline management framework called Martian. The pipeline and each stage in it are specified by a configuration file called an MRO file. Usually, the cellranger commands create the appropriate MRO files for you, but if you want to do something outside the normal workflows, you can create a custom MRO file to directly exercise the full range of features.
In the example below we describe how to construct an MRO file to specify multiple libraries as well as multiple flow cells (since most often the multiple libraries will have been sequenced on different runs).
The easiest way to write your own MRO is to start with the MRO from a previous pipeline. Assuming you have already run a single-flowcell sample (e.g., sample345), examine the _invocation file contained in its output directory.
Note: this example assumes that the input flowcells were processed with cellranger mkfastq.
$ cat sample345/_invocation
@include "sc_vdj_assembler_cs.mro"

call SC_VDJ_ASSEMBLER_CS(
    sample_id = "sample345",
    sample_def = [
        {
            "fastq_mode": "ILMN_BCL2FASTQ",
            "gem_group": null,
            "lanes": null,
            "read_path": "/home/jdoe/runs/HBA2TADXX",
            "sample_indices": [ "any" ],
            "sample_names": [ "Sample1" ]
        }
    ],
    sample_desc = "",
    vdj_reference_path = "/opt/refdata-cellranger-vdj-GRCh38-alts-ensembl-2.0.0",
    force_cells = null,
    no_secondary_analysis = false,
)
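The sample_def argument controls the parameters used to define this sample and is a JSON-encoded list of maps. Each map describes one set of input FASTQs through the fields shown above: fastq_mode, gem_group, lanes, read_path, sample_indices, and sample_names.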
Make a copy of this _invocation file; this will be the MRO from which we will build our multi-library invocation MRO.
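Continuing with the example MRO above, we would make the following changes: rename sample_id, give sample_def one map per library, set each map's read_path to the flowcell that library was sequenced on, and assign each library its own gem_group number (1 and 2 here) in place of null.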
$ cp sample345/_invocation sample345-multi.mro
$ nano sample345-multi.mro
...
$ cat sample345-multi.mro
@include "sc_vdj_assembler_cs.mro"

call SC_VDJ_ASSEMBLER_CS(
    sample_id = "sample345-multi",
    sample_def = [
        {
            "fastq_mode": "ILMN_BCL2FASTQ",
            "gem_group": 1,
            "lanes": null,
            "read_path": "/home/jdoe/runs/HAWT7ADXX",
            "sample_indices": [ "any" ],
            "sample_names": [ "Sample1" ]
        },
        {
            "fastq_mode": "ILMN_BCL2FASTQ",
            "gem_group": 2,
            "lanes": null,
            "read_path": "/home/jdoe/runs/HAWPUADXX",
            "sample_indices": [ "any" ],
            "sample_names": [ "Sample1" ]
        }
    ],
    sample_desc = "",
    vdj_reference_path = "/opt/refdata-cellranger-vdj-GRCh38-alts-ensembl-2.0.0",
    force_cells = null,
    no_secondary_analysis = false,
)
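If you have more than a couple of libraries, writing the sample_def entries by hand becomes error-prone. The following is a minimal Python sketch, not part of Cell Ranger, that generates the JSON list for pasting into the MRO file. The helper name build_sample_def is hypothetical, and the sketch assumes that every read path belongs to a distinct library, so each one receives its own gem_group; paths that hold additional sequencing of the same library would need to share a gem_group instead.

```python
import json


def build_sample_def(read_paths, sample_name="Sample1"):
    """Return one sample_def map per library, numbering gem groups from 1.

    Hypothetical helper: assumes each read path is a separate library.
    """
    entries = []
    for gem_group, read_path in enumerate(read_paths, start=1):
        entries.append({
            "fastq_mode": "ILMN_BCL2FASTQ",
            "gem_group": gem_group,
            "lanes": None,  # serialized as JSON null
            "read_path": read_path,
            "sample_indices": ["any"],
            "sample_names": [sample_name],
        })
    return entries


sample_def = build_sample_def([
    "/home/jdoe/runs/HAWT7ADXX",
    "/home/jdoe/runs/HAWPUADXX",
])

# Text to paste after "sample_def = " in the MRO file.
sample_def_json = json.dumps(sample_def, indent=4)
print(sample_def_json)
```

The printed list uses the same field names and values as the hand-edited example above; only the whitespace differs.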
The cellular barcode sequences will include suffixes from the different gem groups, i.e. libraries.
AGAATGGTCTGCAT-1
CTGATCGATATCGA-1
GTAGCAACGTCGTA-2
AGAATGGTCTGCAT-2
This is how cellranger prevents the same barcode from different cells in different libraries from being erroneously combined into a single cell based only on the barcode sequence.
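To see why the suffix matters, consider the barcodes listed above. The Python sketch below is an illustration only, not Cell Ranger's implementation; the function name tag_barcodes is hypothetical. It shows that the sequence AGAATGGTCTGCAT appears in both libraries and would collapse into one cell if only the raw sequence were used as the key.

```python
def tag_barcodes(barcodes, gem_group):
    """Append the gem group suffix to each barcode sequence (illustrative)."""
    return [f"{barcode}-{gem_group}" for barcode in barcodes]


# Raw barcode sequences observed in each library (taken from the example above).
library1 = ["AGAATGGTCTGCAT", "CTGATCGATATCGA"]
library2 = ["GTAGCAACGTCGTA", "AGAATGGTCTGCAT"]

tagged = tag_barcodes(library1, 1) + tag_barcodes(library2, 2)

# Without suffixes, the barcode shared by both libraries merges into one key.
raw_unique = set(library1) | set(library2)
tagged_unique = set(tagged)

print(len(raw_unique), "distinct keys without suffixes")
print(len(tagged_unique), "distinct keys with suffixes")
```

Four cells are present across the two libraries, and only the suffixed keys keep all four apart.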
Once you have this single-sample, multi-library, multi-flowcell MRO, confirm that its syntax is valid with cellranger mrc, the MRO compiler included with Cell Ranger:
$ cellranger mrc sample345-multi.mro
Successfully compiled 1 mro files.
An MRO ParseError: unexpected token error suggests that your sample_def entry is not valid JSON. Check that a comma follows every item except the last one in each list and map, and that no comma follows the last item.
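When the compiler reports a parse error, it can help to check the sample_def list on its own with a JSON parser, which usually points at the offending position. The sketch below is a hedged pre-check in Python, not a replacement for cellranger mrc. The function names are hypothetical, and it assumes the first occurrence of sample_def in the file is the argument itself and that the list is delimited by matching square brackets.

```python
import json


def extract_sample_def(mro_text):
    """Return the text of the sample_def list, located by bracket matching."""
    start = mro_text.index("[", mro_text.index("sample_def"))
    depth = 0
    in_string = False
    escaped = False
    for pos in range(start, len(mro_text)):
        ch = mro_text[pos]
        if in_string:
            # Skip brackets inside string values, honoring backslash escapes.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return mro_text[start:pos + 1]
    raise ValueError("unbalanced brackets in sample_def")


def check_sample_def(mro_text):
    """Return None if sample_def parses as JSON, otherwise an error message."""
    try:
        json.loads(extract_sample_def(mro_text))
    except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
        return str(err)
    return None


good = """call SC_VDJ_ASSEMBLER_CS(
    sample_id = "sample345-multi",
    sample_def = [
        { "gem_group": 1, "read_path": "/home/jdoe/runs/HAWT7ADXX" },
        { "gem_group": 2, "read_path": "/home/jdoe/runs/HAWPUADXX" }
    ],
)"""
bad = good.replace("},", "}")  # drop the comma between the two maps

print(check_sample_def(good))
print(check_sample_def(bad))
```

To check a real file, pass its contents instead, for example check_sample_def(open("sample345-multi.mro").read()). The rest of the MRO file is not JSON, so only the extracted list is parsed.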
Then run the MRO file using cellranger's alternate MRO-mode syntax:
$ cellranger vdj sample345-multi sample345-multi.mro --uiport=3600
Martian Runtime - 2.3.2
Serving UI at http://localhost:3600
Running preflight checks (please wait)...
2016-11-14 17:26:40 [runtime] (ready) ID.patient123.SC_VDJ_ASSEMBLER_CS.SC_VDJ_ASSEMBLER.SETUP_CHUNKS
2016-11-14 17:26:43 [runtime] (split_complete) ID.patient123.SC_VDJ_ASSEMBLER_CS.SC_VDJ_ASSEMBLER.SETUP_CHUNKS
2016-11-14 17:26:43 [runtime] (run:local) ID.patient123.SC_VDJ_ASSEMBLER_CS.SC_VDJ_ASSEMBLER.SETUP_CHUNKS.fork0.chnk0.main
where