10x Genomics
Chromium Single Cell ATAC

Cell Ranger ATAC 2.1, printed on 04/05/2025

Generating FASTQs with cellranger-atac mkfastq

Overview
Example workflows
Arguments and options
Example data
Running mkfastq with a simple CSV sample sheet
Running mkfastq with an Illumina Experiment Manager sample sheet
Checking FASTQ output
Troubleshooting

Overview

The cellranger-atac workflow starts by demultiplexing the Illumina sequencer's base call files (BCLs) for each flow cell directory into FASTQ files. 10x Genomics recommends using cellranger-atac mkfastq, a pipeline that wraps bcl2fastq from Illumina and adds several convenience features on top of those of bcl2fastq:

Translates 10x Genomics sample index set names into the corresponding list of four sample index oligonucleotides. For example, well A1 can be specified in the sample sheet as SI-NA-A1, and cellranger-atac mkfastq will recognize the four oligos (AAACGGCG, CCTACCAT, GGCGTTTC, and TTGTAAGA) and merge the resulting FASTQ files.
Supports a simplified CSV sample sheet format to handle 10x Genomics use cases.
Supports most bcl2fastq arguments, such as --use-bases-mask.

Example workflows

In this example, we have two 10x Genomics libraries (each processed through a separate Chromium chip channel) that are multiplexed on a single flow cell. Note that after running cellranger-atac mkfastq, we run a separate instance of the cellranger-atac pipeline on each library:

[Figure: two libraries, one flow cell]

In this example, we have one 10x Genomics library sequenced on two flow cells. Note that after running cellranger-atac mkfastq, we run a single instance of the cellranger-atac pipeline on all the FASTQ files generated:

[Figure: one library, two flow cells]
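The two layouts above differ only in how FASTQ folders map to libraries. The bookkeeping can be sketched in Python as follows; the library names and folder paths are hypothetical labels chosen for illustration, not output of a real run.

```python
from collections import defaultdict


def group_fastq_dirs(runs):
    """Group mkfastq output folders by library.

    `runs` is a list of (library, fastq_dir) pairs. Both values are
    hypothetical labels supplied by the user, not produced by mkfastq.
    """
    by_library = defaultdict(list)
    for library, fastq_dir in runs:
        if fastq_dir not in by_library[library]:
            by_library[library].append(fastq_dir)
    return dict(by_library)


# Two libraries on one flow cell: two downstream runs, same FASTQ folder.
two_libs = group_fastq_dirs([
    ("libA", "flowcell1/outs/fastq_path"),
    ("libB", "flowcell1/outs/fastq_path"),
])

# One library on two flow cells: one downstream run over both folders.
one_lib = group_fastq_dirs([
    ("libA", "flowcell1/outs/fastq_path"),
    ("libA", "flowcell2/outs/fastq_path"),
])

print(len(two_libs), "runs;", len(one_lib), "run")
```

Each key in the result corresponds to one run of the downstream cellranger-atac pipeline, and its list holds every FASTQ folder that run should read.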

Arguments and options

The cellranger-atac mkfastq pipeline will accept additional options beyond those shown in the table below because it is a wrapper around bcl2fastq from Illumina®. Please consult Illumina's bcl2fastq User Guide for more information.

Parameter	Function
`--run`	Required. The path of Illumina BCL run folder.
`--id`	Optional; defaults to the name of the flow cell referred to by `--run`. Name of the folder created by `mkfastq`.
`--samplesheet`	Optional. Path to an Illumina Experiment Manager-compatible sample sheet which contains 10x Genomics sample index names (e.g., SI-NA-A12) in the sample index column. All other information, such as sample names and lanes, should be in the sample sheet.
`--sample-sheet`	Optional. Equivalent to `--samplesheet` above.
`--csv`	Optional. Path to a simple CSV with lane, sample, and index columns, which describe the way to demultiplex the flow cell. The index column should contain a 10x Genomics sample index set name (e.g., SI-NA-A12). This is an alternative to the Illumina IEM sample sheet, and will be ignored if `--samplesheet` is specified.
`--simple-csv`	Optional. Equivalent to `--csv` above.
`--lanes`	bcl2fastq option. Comma-delimited series of lanes to demultiplex (e.g. 1,3). Use this if you have a sample sheet for an entire flow cell but only want to generate a few lanes for further 10x Genomics analysis.
`--use-bases-mask`	bcl2fastq option. Same meaning as for `bcl2fastq`. Use to clip extra bases off a read if you ran extra cycles for QC.
`--delete-undetermined`	bcl2fastq option. Delete the `Undetermined` FASTQs generated by `bcl2fastq`. Useful if you are demultiplexing a small number of samples from a large flow cell.
`--barcode-mismatches`	bcl2fastq option. Same meaning as for `bcl2fastq`. Use this option to change the number of allowed mismatches per index adapter (0, 1, 2). Default: 1.
`--output-dir`	bcl2fastq option. Generate FASTQ output in a path of your own choosing, instead of `flow_cell_id/outs/fastq_path`.
`--project`	bcl2fastq option. Custom project name, to override the sample sheet or to use in conjunction with the `--csv` argument.
`--jobmode`	Martian option. Job manager to use. Valid options: `local` (default), `sge`, `lsf`, `slurm` or a `.template` file.
`--localcores`	Martian option. Set max cores the pipeline may request at one time. Only applies when `--jobmode=local`.
`--localmem`	Martian option. Set max GB the pipeline may request at one time. Only applies when `--jobmode=local`.
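As an illustration of how the options in the table combine, here is a minimal Python sketch that assembles an argument list without running anything. The helper name build_mkfastq_command and its parameter names are assumptions made for this example; only the flags themselves come from the table, and the --flag=value spelling follows the examples in this guide.

```python
def build_mkfastq_command(run, run_id=None, csv=None, samplesheet=None,
                          lanes=None, delete_undetermined=False):
    """Return an argv list for cellranger-atac mkfastq (nothing is executed)."""
    argv = ["cellranger-atac", "mkfastq", f"--run={run}"]
    if run_id is not None:
        argv.append(f"--id={run_id}")
    # --csv is ignored when --samplesheet is given, so pass only one of them.
    if samplesheet is not None:
        argv.append(f"--samplesheet={samplesheet}")
    elif csv is not None:
        argv.append(f"--csv={csv}")
    if lanes:
        # Comma-delimited series of lanes, e.g. 1,3
        argv.append("--lanes=" + ",".join(str(lane) for lane in lanes))
    if delete_undetermined:
        argv.append("--delete-undetermined")
    return argv


print(" ".join(build_mkfastq_command(
    "/path/to/tiny_bcl",
    run_id="tiny-bcl",
    csv="cellranger-atac-tiny-bcl-simple-1.0.0.csv",
    lanes=[1, 3],
)))
```

The returned list could be handed to subprocess.run on a machine where cellranger-atac is installed.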

Example data

The cellranger-atac mkfastq pipeline recognizes two file formats for describing samples: a simple, three-column CSV format, or the Illumina Experiment Manager (IEM) sample sheet format used by bcl2fastq. There is an example below for running mkfastq with each format.

The example (tiny-bcl) dataset is solely designed to demo the cellranger-atac mkfastq pipeline. It cannot be used to run downstream pipelines (e.g. cellranger-atac count).

To follow along, please do the following:

Download the tiny-bcl tar file.
Untar the tiny-bcl tar file in a convenient location. This will create a new tiny-bcl/ subdirectory.
Download the simple CSV layout file: cellranger-atac-tiny-bcl-simple-1.0.0.csv.
Download the Illumina Experiment Manager sample sheet: cellranger-atac-tiny-bcl-samplesheet-1.0.0.csv.

Running mkfastq with a simple CSV sample sheet

We recommend the simple CSV sample sheet for most sequencing experiments. The simple CSV format has only three columns (Lane, Sample, Index), and is thus less prone to formatting errors. You can see an example of this in cellranger-atac-tiny-bcl-simple-1.0.0.csv:

Lane,Sample,Index
1,test_sample,SI-NA-C1
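If the layout is produced by a script rather than typed by hand, a short Python sketch along these lines could generate the same file. The function name simple_csv_text is an assumption for this example; the header and row values are the ones shown above.

```python
import csv
import io


def simple_csv_text(rows):
    """Render (lane, sample, index) tuples as a simple mkfastq layout CSV."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Lane", "Sample", "Index"])
    writer.writerows(rows)
    return buffer.getvalue()


print(simple_csv_text([(1, "test_sample", "SI-NA-C1")]), end="")
```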

Here are the options for each column:

Lane	Which lane(s) of the flow cell to process. Can be either a single lane, a range (e.g., 2-4) or '*' for all lanes in the flow cell.
Sample	The name of the sample. This name will be the prefix to all the generated FASTQs, and will correspond to the `--sample` argument in all downstream 10x Genomics pipelines. Sample names must conform to the Illumina `bcl2fastq` naming requirements. Only letters, numbers, underscores, and hyphens are allowed; no other symbols, including dots ("."), are allowed.
Index	The 10x Genomics sample index set that was used in library construction, e.g., SI-NA-A12.
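The column rules above can be checked before launching a run. The following Python sketch is one way to do so under stated assumptions: the function names are invented for this example, and the lane count is passed in by the caller rather than read from the run folder.

```python
import re

# Only letters, numbers, underscores, and hyphens are allowed in sample names.
SAMPLE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_sample_name(name):
    """Return True if the name uses only the permitted characters."""
    return SAMPLE_NAME.fullmatch(name) is not None


def parse_lane_spec(spec, lanes_on_flow_cell):
    """Expand a Lane value (single lane, range such as 2-4, or *) into lane numbers.

    `lanes_on_flow_cell` is supplied by the caller; it is an assumption of
    this sketch and is only used to expand '*'.
    """
    spec = spec.strip()
    if spec == "*":
        return list(range(1, lanes_on_flow_cell + 1))
    if "-" in spec:
        first, last = (int(part) for part in spec.split("-", 1))
        if first > last:
            raise ValueError(f"lane range is reversed: {spec}")
        return list(range(first, last + 1))
    return [int(spec)]


print(is_valid_sample_name("test_sample"), is_valid_sample_name("sample.1"))
print(parse_lane_spec("2-4", 8))
```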

To run mkfastq with a simple layout CSV, use the --csv argument. Here's how to run mkfastq on the tiny-bcl sequencing run with the simple layout (replace /path/to/tiny_bcl with the path to the untarred tiny-bcl directory on your system):

$ cellranger-atac mkfastq --id=tiny-bcl \
                     --run=/path/to/tiny_bcl \
                     --csv=cellranger-atac-tiny-bcl-simple-1.0.0.csv
 
cellranger-atac mkfastq
Copyright (c) 2018 10x Genomics, Inc.  All rights reserved.
-------------------------------------------------------------------------------

Martian Runtime - 2.1.0-4.0.7
Running preflight checks (please wait)...

Running mkfastq with an Illumina Experiment Manager sample sheet

The cellranger-atac mkfastq pipeline can also be run with a sample sheet in the Illumina Experiment Manager (IEM) format (example: cellranger-atac-tiny-bcl-samplesheet-1.0.0.csv). An IEM sample sheet has several fields specific to running on Illumina platforms, including a [Data] section where sample and index information is specified. cellranger-atac mkfastq supports listing either index set names or the oligo sequences.

Do not trim adapters during demultiplexing. Leave these settings blank. Trimming adapters from reads can potentially damage the 10x barcodes and the UMIs, resulting in pipeline failure or data loss.

If you are using an Illumina sample sheet for demultiplexing with bcl2fastq, BCL Convert or our mkfastq pipeline, please remove these lines under the [Settings] section: Adapter, AdapterRead1, or AdapterRead2.
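Removing those lines by hand is easy to get wrong on a large sheet. A minimal Python sketch of the cleanup is shown here; the function name is an assumption, and the sample sheet in the usage lines is a made-up fragment with placeholder values (NNNNNNNN, OtherSetting) rather than a real IEM export.

```python
ADAPTER_KEYS = {"adapter", "adapterread1", "adapterread2"}


def strip_adapter_settings(sample_sheet_text):
    """Drop Adapter, AdapterRead1, and AdapterRead2 lines from [Settings] only."""
    kept = []
    section = None
    for line in sample_sheet_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Section header such as [Settings] or [Data]
            section = stripped[1:].split("]", 1)[0].strip().lower()
        elif section == "settings":
            key = stripped.split(",", 1)[0].strip().lower()
            if key in ADAPTER_KEYS:
                continue  # skip the adapter line
        kept.append(line)
    return "\n".join(kept) + "\n"


sheet = (
    "[Settings]\n"
    "Adapter,NNNNNNNN\n"        # placeholder sequence, not a real adapter
    "OtherSetting,1\n"          # hypothetical setting that must survive
    "[Data]\n"
    "Lane,Sample_ID,index\n"
    "1,test_sample,SI-NA-C1\n"
)
print(strip_adapter_settings(sheet), end="")
```

Lines outside the [Settings] section are left untouched, so the [Data] rows pass through unchanged.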

Here's an example:

Version 1: "SI-NA-C1" refers to a 10x Genomics single-indexed sample index consisting of a set of four oligo sequences. In this example, only reads from lane 1 will be used. To demultiplex the given sample index across all lanes, omit the Lane column entirely.

[Data]
Lane,Sample_ID,index
1,test_sample,SI-NA-C1

Version 2: The four index sequences for "SI-NA-C1" are specified in separate rows under the index column.

[Data]
Lane,Sample_ID,index
1,sample1,ATCTGATC
1,sample1,CGTGCTAA
1,sample1,GAGAAGGG
1,sample1,TCACTCCT
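The relationship between the two versions can be sketched in Python as a lookup from index set name to its four oligos. The table below contains only the two sets quoted in this guide (SI-NA-A1 and SI-NA-C1); it is not the full 10x Genomics index list, and the function name is an assumption for this example.

```python
# Only the two index sets quoted in this guide; not a complete list.
INDEX_SETS = {
    "SI-NA-A1": ["AAACGGCG", "CCTACCAT", "GGCGTTTC", "TTGTAAGA"],
    "SI-NA-C1": ["ATCTGATC", "CGTGCTAA", "GAGAAGGG", "TCACTCCT"],
}


def expand_data_row(lane, sample_id, index_set):
    """Turn one Version 1 row into the four equivalent Version 2 rows."""
    try:
        oligos = INDEX_SETS[index_set]
    except KeyError:
        raise KeyError(f"index set not in this sketch's table: {index_set}")
    return [f"{lane},{sample_id},{oligo}" for oligo in oligos]


for row in expand_data_row(1, "sample1", "SI-NA-C1"):
    print(row)
```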

Sample names must conform to the Illumina bcl2fastq naming requirements. Specifically only letters, numbers, underscores, and hyphens are allowed. No other symbols, including dots ("."), are allowed.

Also note that while an authentic IEM sample sheet will contain other sections above the [Data] section, these are optional for demultiplexing. To avoid data loss from trimming, we do not recommend including adapter sequences in the [Settings] section of the sample sheet (see this article for details). For demultiplexing an existing run with cellranger-atac mkfastq, only the [Data] section is required.

Next, run the cellranger-atac mkfastq pipeline, using the --samplesheet argument (replace /path/to/tiny_bcl with the path to the untarred tiny-bcl directory on your system):

$ cellranger-atac mkfastq --id=tiny-bcl \
                     --run=/path/to/tiny_bcl \
                     --samplesheet=cellranger-atac-tiny-bcl-samplesheet-1.0.0.csv
 
cellranger-atac mkfastq
Copyright (c) 2018 10x Genomics, Inc.  All rights reserved.
-------------------------------------------------------------------------------
 
Martian Runtime - 4.0.7
Running preflight checks (please wait)...

If you encounter any preflight errors, please refer to the Troubleshooting page.

Checking FASTQ output

Once the cellranger-atac mkfastq pipeline has successfully completed, the output can be found in a new folder named with the value you provided to cellranger-atac mkfastq in the --id option (if not specified, defaults to the name of the flow cell):

$ ls -l
drwxr-xr-x 4 jdoe  jdoe     4096 Sep 13 12:05 tiny-bcl

The key output files can be found in outs/fastq_path, and are organized in the same manner as a conventional bcl2fastq run:

$ ls -l tiny-bcl/outs/fastq_path/
drwxr-xr-x 3 jdoe jdoe         3 Aug  9 12:26 Reports
drwxr-xr-x 2 jdoe jdoe         8 Aug  9 12:26 Stats
drwxr-xr-x 3 jdoe jdoe         3 Aug  9 12:26 tiny-bcl
-rw-r--r-- 1 jdoe jdoe  20615106 Aug  9 12:26 Undetermined_S0_L001_I1_001.fastq.gz
-rw-r--r-- 1 jdoe jdoe 151499694 Aug  9 12:26 Undetermined_S0_L001_R1_001.fastq.gz
-rw-r--r-- 1 jdoe jdoe  52692701 Aug  9 12:26 Undetermined_S0_L001_R2_001.fastq.gz
-rw-r--r-- 1 jdoe jdoe 151499694 Aug  9 12:26 Undetermined_S0_L001_R3_001.fastq.gz
 
$ tree tiny-bcl/outs/fastq_path/tiny-bcl/
tiny-bcl/outs/fastq_path/tiny-bcl/
  Sample1
    Sample1_S1_L001_I1_001.fastq.gz
    Sample1_S1_L001_R1_001.fastq.gz
    Sample1_S1_L001_R2_001.fastq.gz
    Sample1_S1_L001_R3_001.fastq.gz

This example was produced with a sample sheet that included tiny-bcl as the Sample_Project, so the directory containing the sample folders is called tiny-bcl/. If a Sample_Project was not specified, or if a simple layout CSV file was used (which does not have a Sample_Project column), the directory containing the sample folders would be named according to the flow cell ID instead.
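The naming rule in the previous paragraph can be written down as a small Python sketch that mirrors the tree shown above. The function name and the FLOWCELLID placeholder are assumptions for illustration, and the sketch does not account for a custom --output-dir.

```python
from pathlib import PurePosixPath


def sample_fastq_dir(run_id, sample, flow_cell_id, sample_project=None):
    """Predict the folder holding one sample's FASTQs under the default layout."""
    # Without a Sample_Project (or with a simple CSV), the flow cell ID is used.
    project_dir = sample_project if sample_project else flow_cell_id
    return PurePosixPath(run_id) / "outs" / "fastq_path" / project_dir / sample


# FLOWCELLID is a placeholder, not the real ID of the tiny-bcl flow cell.
print(sample_fastq_dir("tiny-bcl", "Sample1", "FLOWCELLID", "tiny-bcl"))
print(sample_fastq_dir("tiny-bcl", "Sample1", "FLOWCELLID"))
```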

If you want to remove the Undetermined FASTQs from the output to save space, you can run mkfastq with the --delete-undetermined flag. To see all cellranger-atac mkfastq options, run cellranger-atac mkfastq --help.

For single cell ATAC chemistry, the cell barcode that labels cells, not to be confused with the sample index that multiplexes libraries on the flow cell, is sequenced as part of the i5 index read (named R2 in the FASTQs). Both mkfastq and bcl2fastq conventionally associate R2 with the i5 index read, and R3 with read2. Thus read1, barcode, read2, and sample index are associated with R1, R2, R3, I1, respectively. This is reflected in the output files shown in the output examples in this guide.
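To make the read naming concrete, the following Python sketch parses a FASTQ file name of the form shown in the listings above and reports what each read holds. The regular expression is inferred from those example names only, and the function name is an assumption.

```python
import re

# Mapping described in the paragraph above.
READ_ROLES = {
    "R1": "read1",
    "R2": "cell barcode (i5 index read)",
    "R3": "read2",
    "I1": "sample index",
}

# Pattern inferred from names such as Sample1_S1_L001_R1_001.fastq.gz
FASTQ_NAME = re.compile(
    r"(?P<sample>.+)_S(?P<sample_number>\d+)_L(?P<lane>\d+)"
    r"_(?P<read>[RI]\d)_001\.fastq\.gz"
)


def describe_fastq(filename):
    """Split a FASTQ file name into its parts and attach the read's role."""
    match = FASTQ_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"unrecognized FASTQ name: {filename}")
    info = match.groupdict()
    info["sample_number"] = int(info["sample_number"])
    info["lane"] = int(info["lane"])
    info["role"] = READ_ROLES.get(info["read"], "unknown")
    return info


print(describe_fastq("Sample1_S1_L001_R2_001.fastq.gz"))
```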

Troubleshooting

If you encounter a crash while running cellranger-atac mkfastq, upload the tarball (with the extension .mri.tgz) from your output directory to 10x Genomics. Replace your@email.edu with your email address:

$ cellranger-atac upload your@email.edu jobid.mri.tgz

where jobid is what you input into the --id option of mkfastq (if not specified, defaults to the ID of the flow cell). This tarball contains numerous diagnostic logs that 10x Genomics support can use for debugging.

You will receive an automated email from 10x Genomics. If you do not, email support@10xgenomics.com. For the fastest service, respond with the following:

The exact cellranger-atac command you used.
The sample sheet that you used.
The RunInfo.xml and runParameters.xml files from your BCL directory.
The kind of libraries you are demultiplexing (including chemistry).
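Before writing to support, it can help to confirm that the files listed above are actually present. A minimal Python sketch, assuming the tarball sits directly in the output directory and the two XML files sit at the top of the BCL run folder; the function name is invented for this example.

```python
from pathlib import Path


def support_files(run_dir, output_dir):
    """Report which files for a support request can be found on disk."""
    out = Path(output_dir)
    # Any diagnostic tarball with the .mri.tgz extension in the output directory.
    tarballs = sorted(str(p) for p in out.glob("*.mri.tgz")) if out.is_dir() else []
    run_files = {
        name: (Path(run_dir) / name).is_file()
        for name in ("RunInfo.xml", "runParameters.xml")
    }
    return {"tarballs": tarballs, "run_files": run_files}


print(support_files("/path/to/tiny_bcl", "tiny-bcl"))
```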
