
# Running Multi-Flowcell Samples

The cellranger count command defines a sample from the FASTQs produced by a single cellranger mkfastq run. While this sample specification (i.e., one flowcell per sample) is the most common, high-depth sequencing often involves sequencing a single sample, or combining multiple samples, across multiple flowcells. To this end, the cellranger pipeline allows complex sample construction, such as combining sample indices from multiple flowcells into a single cellranger run.

To create complex sample specifications, you will have to write your own MRO file for the cellranger pipeline. MRO is the language used to define pipelines for the Martian pipeline framework, which is responsible for managing pipeline execution. The cellranger command is simply a shell script that converts command-line arguments into an MRO file and passes it to the Martian pipeline execution command, cellranger mrp. Writing MROs directly gives you access to the full range of options available for each pipeline.

## Understanding the Pipeline Invocation MRO

The easiest way to write your own MRO is to start with the MRO from a previous pipeline run. Assuming you have already run a single-flowcell sample (e.g., sample345) that will be part of your multi-flowcell sample, examine the _invocation file in its output directory.

Note: this example assumes that the input flowcells were processed with cellranger mkfastq.

    $ cat sample345/_invocation
    @include "sc_rna_counter_cs.mro"

    call SC_RNA_COUNTER_CS(
        sample_id = "sample345",
        sample_def = [
            {
                "fastq_mode": "ILMN_BCL2FASTQ",
                "gem_group": null,
                "lanes": null,
                "read_path": "/home/jdoe/runs/HBA2TADXX",
                "sample_indices": [ "any" ],
                "sample_names": [ "Sample1" ]
            }
        ],
        sample_desc = "",
        reference_path = "/opt/refdata-cellranger-GRCh38-1.2.0/GRCh38",
        recovered_cells = null,
    )

The sample_def argument controls the parameters used to define this sample and is a JSON-encoded list of maps that define:

- fastq_mode: set this to "ILMN_BCL2FASTQ"
- gem_group: indicates the GEM chip channel corresponding to a single sample across multiple flowcells. This field is described in more detail in the next section.
- lanes: a list of lanes from this flowcell to be included in this sample (e.g., [ 1, 2 ], [ 2 ], etc.) or null to use all lanes
- read_path: a directory containing FASTQs from a single flowcell
- sample_indices: set this to [ "any" ] when working with mkfastq output
- sample_names: a list of names associated with particular sample indices (as specified in the mkfastq sample sheet for this flowcell)

Make a copy of this _invocation file; this will be the MRO from which we will build our multi-flowcell invocation MRO.

## Combining Multiple Flowcells

Continuing with the example MRO above, we would make the following changes:

1. Give this sample a new sample_id.
2. Duplicate the map contained in sample_def as a second item in the sample_def list.
3. Change the read_path of each of these sample_def entries to point to the HAWT7ADXX and HAWPUADXX FASTQ output directories.
4. Change lanes and/or sample_names to reflect the flowcell configuration used in sequencing, if necessary.
5. Change gem_group to reflect the single- or multiple-sample configuration across flowcells.

### Analyzing a Single Sample or Library Across Multiple Flowcells

It may be useful to sequence the same sample, generated from a single GEM chip channel, multiple times.
To set up a single-sample, multi-flowcell cellranger run, set the gem_group field in each sample_def entry to null. cellranger will then set the gem group to 1 for all flowcells within the sample. For example, the two-flowcell MRO may appear as:

    $ cp sample345/_invocation sample345-multi.mro
    $ nano sample345-multi.mro
    ...

    $ cat sample345-multi.mro
    @include "sc_rna_counter_cs.mro"

    call SC_RNA_COUNTER_CS(
        sample_id = "sample345-multi",
        sample_def = [
            {
                "fastq_mode": "ILMN_BCL2FASTQ",
                "gem_group": null,
                "lanes": null,
                "read_path": "/home/jdoe/runs/HAWT7ADXX",
                "sample_indices": [ "any" ],
                "sample_names": [ "Sample1" ]
            },
            {
                "fastq_mode": "ILMN_BCL2FASTQ",
                "gem_group": null,
                "lanes": null,
                "read_path": "/home/jdoe/runs/HAWPUADXX",
                "sample_indices": [ "any" ],
                "sample_names": [ "Sample1" ]
            }
        ],
        sample_desc = "",
        reference_path = "/opt/refdata-cellranger-GRCh38-1.2.0/GRCh38",
        recovered_cells = null,
    )

Compared with the original MRO, the sample_id is new, sample_def has a second entry, and each read_path points to a different flowcell.
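The edits to sample_def can also be generated rather than typed by hand. The following is a minimal Python sketch, not part of Cell Ranger: the helper name build_sample_def is hypothetical, and it assumes the flowcell directories are the HAWT7ADXX and HAWPUADXX paths used in this guide. It copies one sample_def entry per flowcell and leaves gem_group as null, matching the single-sample case.

```python
import copy
import json


def build_sample_def(template, read_paths):
    """Return one sample_def entry per flowcell directory (hypothetical helper).

    Every entry keeps gem_group as None (null in the MRO), which is the
    single-sample configuration described in this section.
    """
    entries = []
    for read_path in read_paths:
        entry = copy.deepcopy(template)  # leave the template untouched
        entry["read_path"] = read_path
        entry["gem_group"] = None
        entries.append(entry)
    return entries


# The single sample_def entry from the original sample345 invocation.
template = {
    "fastq_mode": "ILMN_BCL2FASTQ",
    "gem_group": None,
    "lanes": None,
    "read_path": "/home/jdoe/runs/HBA2TADXX",
    "sample_indices": ["any"],
    "sample_names": ["Sample1"],
}

sample_def = build_sample_def(
    template,
    ["/home/jdoe/runs/HAWT7ADXX", "/home/jdoe/runs/HAWPUADXX"],
)

# json.dumps renders None as null, so the output can be pasted into the MRO.
print(json.dumps(sample_def, indent=4))
```

The printed list is meant to be pasted after sample_def = in the copied MRO; the remaining arguments (sample_id, reference_path, and so on) still need to be edited by hand.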

All cellular barcode sequences in the BAM produced by cellranger will carry the suffix -1, for example:

    AGAATGGTCTGCAT-1
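When post-processing barcodes, it can help to separate the sequence from the gem group suffix. The sketch below is only an illustration: split_barcode is a hypothetical helper, and it assumes barcodes always have the form shown above, a sequence followed by a hyphen and an integer.

```python
def split_barcode(barcode):
    """Split 'SEQUENCE-N' into (sequence, gem_group) (hypothetical helper)."""
    sequence, _, suffix = barcode.rpartition("-")
    if not sequence or not suffix.isdigit():
        raise ValueError(f"unexpected barcode format: {barcode!r}")
    return sequence, int(suffix)


sequence, gem_group = split_barcode("AGAATGGTCTGCAT-1")
print(sequence, gem_group)  # AGAATGGTCTGCAT 1
```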

### Analyzing Multiple Samples or Libraries Across Multiple Flowcells

The officially supported method is to run cellranger count on each library as per the single-library instructions above, then combine the results using the more efficient cellranger aggr pipeline (see the cellranger aggr documentation). The method described below is included only for backwards compatibility.

To set up a multiple-sample, multi-flowcell cellranger run, set the gem_group field in each sample_def entry to an incrementally increasing integer, starting with 1.

    $ cp sample345/_invocation sample345-multi.mro
    $ nano sample345-multi.mro
    ...

    $ cat sample345-multi.mro
    @include "sc_rna_counter_cs.mro"

    call SC_RNA_COUNTER_CS(
        sample_id = "sample345-multi",
        sample_def = [
            {
                "fastq_mode": "ILMN_BCL2FASTQ",
                "gem_group": 1,
                "lanes": null,
                "read_path": "/home/jdoe/runs/HAWT7ADXX",
                "sample_indices": [ "any" ],
                "sample_names": [ "Sample1" ]
            },
            {
                "fastq_mode": "ILMN_BCL2FASTQ",
                "gem_group": 2,
                "lanes": null,
                "read_path": "/home/jdoe/runs/HAWPUADXX",
                "sample_indices": [ "any" ],
                "sample_names": [ "Sample1" ]
            }
        ],
        sample_desc = "",
        reference_path = "/opt/refdata-cellranger-GRCh38-1.2.0/GRCh38",
        recovered_cells = null,
    )

The cellular barcode sequences will include suffixes from the different gem groups.

    AGAATGGTCTGCAT-1
    GTAGCAACGTCGTA-2

### Running Cell Ranger

Once you have this multi-flowcell MRO, confirm that its syntax is valid with cellranger mrc, the MRO compiler included with Cell Ranger:

    $ cellranger mrc sample345-multi.mro
    Successfully compiled 1 mro files.
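cellranger mrc checks that the MRO compiles; this page does not say whether it also checks the gem_group values. As a complementary pre-flight check, the Python sketch below reads the sample_def list out of MRO text and reports which configuration it describes. It is not part of Cell Ranger. The function names are hypothetical, it assumes sample_def is plain JSON as in the examples on this page, and it assumes that entries from the same library may share a gem_group as long as the values in use run from 1 upward without gaps.

```python
import json
import re


def extract_sample_def(mro_text):
    """Decode the JSON list that follows 'sample_def =' (hypothetical helper)."""
    match = re.search(r"sample_def\s*=\s*", mro_text)
    if match is None:
        raise ValueError("no sample_def argument found")
    # raw_decode parses one JSON value and ignores the text after it.
    entries, _ = json.JSONDecoder().raw_decode(mro_text, match.end())
    return entries


def classify_gem_groups(entries):
    """Return 'single' if every gem_group is null, 'multiple' if the values
    in use are exactly 1..k, and raise ValueError otherwise."""
    groups = [entry.get("gem_group") for entry in entries]
    if all(group is None for group in groups):
        return "single"
    if all(isinstance(group, int) for group in groups):
        if set(groups) == set(range(1, max(groups) + 1)):
            return "multiple"
    raise ValueError(f"inconsistent gem_group values: {groups}")


# Abbreviated MRO text; a real file would be read with open(path).read().
mro_text = """
call SC_RNA_COUNTER_CS(
    sample_id = "sample345-multi",
    sample_def = [
        { "gem_group": 1, "read_path": "/home/jdoe/runs/HAWT7ADXX" },
        { "gem_group": 2, "read_path": "/home/jdoe/runs/HAWPUADXX" }
    ],
    sample_desc = "",
)
"""

print(classify_gem_groups(extract_sample_def(mro_text)))  # multiple
```

A mix of null and integer values, or numbering that skips a value, raises ValueError, which is usually a sign that one entry was copied without updating gem_group.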


Then run the MRO file using cellranger's alternate MRO-mode syntax:

    $ cellranger count sample345-multi sample345-multi.mro --uiport=3600
    Martian Runtime - 2.1.2
    Serving UI at http://localhost:3600
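If the compile and run steps are scripted, the two commands shown in this section can be chained so that the run starts only when compilation succeeds. The following Python sketch is one way to do that, under the assumption that cellranger is on the PATH; build_commands and run_pipeline are hypothetical names, and the only command-line arguments used are those that appear above.

```python
import shutil
import subprocess


def build_commands(run_id, mro_path, uiport=3600):
    """Return the mrc and count command lines used in this section."""
    return [
        ["cellranger", "mrc", mro_path],
        ["cellranger", "count", run_id, mro_path, f"--uiport={uiport}"],
    ]


def run_pipeline(run_id, mro_path, uiport=3600):
    """Compile the MRO, then run it; stop at the first failing command."""
    if shutil.which("cellranger") is None:
        raise RuntimeError("cellranger is not on PATH")
    for command in build_commands(run_id, mro_path, uiport):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)


# Print the commands without executing them.
for command in build_commands("sample345-multi", "sample345-multi.mro"):
    print(" ".join(command))
```

Calling run_pipeline("sample345-multi", "sample345-multi.mro") would execute both steps in order.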