10x Genomics, printed on 04/22/2025

Converting 10x Genomics BAM Files to FASTQ

bamtofastq is a tool for converting 10x Genomics BAM files back into FASTQ files that can be used as inputs to re-run analysis. The FASTQs are written to a directory structure identical to the one produced by the mkfastq or bcl2fastq tools, so they are ready to use as input to the next pipeline (e.g. cellranger count, spaceranger count).

Background
Download and Installation
Running the Tool
Options
Known Issues and Limitations

Background

10x Genomics pipelines require FASTQs (with embedded barcodes) as input, so we developed bamtofastq to help users who want to reanalyze 10x Genomics data but only have access to BAM files. For example, some users may want to store BAM files only. Others might have downloaded our BAM data from NCBI SRA. The location of the 10x Genomics barcode varies depending on product and reagent version. bamtofastq determines the appropriate way to construct the original read sequence from the sequences and tags in the BAM file.

bamtofastq works with BAM files output by the following 10x Genomics pipelines:

cellranger (see exceptions below)
cellranger-atac
cellranger-arc
cellranger-dna
spaceranger
longranger

Download and Installation

bamtofastq is available for Linux and is compatible with RedHat/CentOS 5.2 or later, and Ubuntu 8.04 or later.

bamtofastq comes pre-bundled with the latest versions of the Cell Ranger, Cell Ranger ARC, Cell Ranger ATAC, and Space Ranger pipelines. You can find the executable in the /lib/bin folder within the installation, e.g. cellranger-x.y.z/lib/bin/bamtofastq. Once your pipeline is installed and on your PATH, you can run [pipeline] bamtofastq (e.g. cellranger bamtofastq, spaceranger bamtofastq).
You can also download bamtofastq from GitHub.

bamtofastq is a single executable that can be run directly and requires no compilation or installation. Place the executable in a directory that is on your PATH, and run chmod 700 on the file to make it executable.
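The placement and chmod steps above can be sketched as a small shell function. This is only an illustration: the function name install_bamtofastq, the download path, and the default install directory ($HOME/bin) are assumptions, not part of the tool.

```shell
# Sketch: copy a downloaded bamtofastq binary into a directory on PATH
# and make it executable.
# ASSUMPTION: install_bamtofastq and the default $HOME/bin are hypothetical.
install_bamtofastq() {
    src="$1"                      # path to the downloaded executable
    dest_dir="${2:-$HOME/bin}"    # install directory (illustrative default)
    mkdir -p "$dest_dir"
    cp "$src" "$dest_dir/bamtofastq"
    chmod 700 "$dest_dir/bamtofastq"
    # Prepend the directory to PATH for this shell if it is not already listed.
    case ":$PATH:" in
        *":$dest_dir:"*) ;;
        *) PATH="$dest_dir:$PATH"; export PATH ;;
    esac
}

# Example call (paths are placeholders):
#   install_bamtofastq /path/to/downloaded/bamtofastq
```

To keep the PATH change across sessions, the same directory would need to be added in your shell startup file.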

Running the Tool

10x Genomics BAMs produced by Cell Ranger v1.2+, Cell Ranger ATAC v1.0+, Cell Ranger ARC v1.0+, Space Ranger v1.0+, Cell Ranger DNA v1.0+, and Long Ranger v2.1+ contain header fields that permit automatic conversion to the correct FASTQ sequences. BAMs produced by older 10x Genomics pipelines may require special arguments or come with caveats; see Known Issues and Limitations for details.

The FASTQ files output by bamtofastq contain the same set of sequences that were input to the original pipeline run, although the original order will not be preserved. 10x Genomics pipelines are generally insensitive to the order of the input data, so you can expect nearly identical results when re-running pipelines with bamtofastq outputs.

The usage of the bamtofastq command is as follows:

bamtofastq [options] [bam-path] [output-path]

Replace [bam-path] with the path to the input BAM, [output-path] with the path to the output directory, and include any relevant [options] (listed below). For example, if your BAM file is located at /path/to/mydata.bam, you want the output in /path/to/home/directory, and you want to use eight threads, the command would be:

bamtofastq --nthreads=8 /path/to/mydata.bam /path/to/home/directory

You can also print the options from the command line:

bamtofastq --help

Run times for full-coverage WGS BAMs may be several hours.
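Because a conversion can run for hours, it may be worth checking the inputs before launching it. The wrapper below is a minimal sketch around the documented invocation; the function name run_bamtofastq and the DRY_RUN variable are hypothetical and exist only for this example.

```shell
# Sketch: wrap the documented command line with a basic input check.
# ASSUMPTION: run_bamtofastq and DRY_RUN are hypothetical names.
run_bamtofastq() {
    bam="$1"                 # input BAM path
    out="$2"                 # output directory
    threads="${3:-4}"        # 4 is the documented default for --nthreads
    if [ ! -f "$bam" ]; then
        echo "error: BAM not found: $bam" >&2
        return 1
    fi
    # Build the command exactly as shown in the usage line above.
    set -- bamtofastq --nthreads="$threads" "$bam" "$out"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"            # print the command instead of running it
    else
        "$@"
    fi
}

# Example call (paths are placeholders):
#   DRY_RUN=1 run_bamtofastq /path/to/mydata.bam /path/to/home/directory 8
```

Setting DRY_RUN prints the command that would be executed, which is a cheap way to confirm the paths and thread count first.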

Options

Option	Description
--nthreads=`n`	Threads to use for reading BAM file. Default: 4
--locus=`locus`	Optional. Only include read pairs mapping to locus. Use `chrom:start-end` format.
--reads-per-fastq=`N`	Number of reads per FASTQ chunk. Default: 50000000
--gemcode	Convert a BAM produced from GemCode data (Long Ranger 1.0 - 1.3)
--lr20	Convert a BAM produced by Long Ranger 2.0
--cr11	Convert a BAM produced by Cell Ranger 1.0-1.1
--bx-list=`L`	Only include BX values listed in text file `L`. Requires BX-sorted and indexed BAM file (see Long Ranger support for details).
--help	Show the help screen
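As an illustration of combining the options in the table, the sketch below checks a locus string against the `chrom:start-end` shape and then prints a command that uses --locus, --nthreads, and --reads-per-fastq together. The helper is_valid_locus, its pattern, and the example paths and values are assumptions; the pattern is a guess at the documented format and would reject contig names that themselves contain a colon.

```shell
# Sketch: validate a locus string before passing it to --locus.
# ASSUMPTION: is_valid_locus is a hypothetical helper, and the pattern is
# only an approximation of the documented chrom:start-end format.
is_valid_locus() {
    printf '%s\n' "$1" | grep -Eq '^[^:[:space:]]+:[0-9]+-[0-9]+$'
}

locus="chr1:1000000-2000000"   # example region (placeholder)
if is_valid_locus "$locus"; then
    # Printed rather than executed; remove "echo" to run the conversion.
    echo bamtofastq --nthreads=8 --locus="$locus" \
        --reads-per-fastq=10000000 /path/to/mydata.bam /path/to/output
else
    echo "error: locus must look like chrom:start-end" >&2
fi
```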

Known Issues and Limitations

BAMs produced for TCR or BCR data, by aligning to a V(D)J reference with cellranger vdj or cellranger multi, are not supported by bamtofastq.
Special tags included by 10x Genomics pipelines are required to reconstruct the original FASTQ sequences correctly. If your BAM file lacks the appropriate headers, you will get an error message: WARNING: no @RG (read group) headers found in BAM file.
A common cause is using edited BAM files from SRA, which are missing the tags produced by Cell Ranger.
You need the BAM file from the section labelled Original Format on the SRA downloads page.
Here is an example of an SRA dataset with the Original Format BAM available.
Only the Original Format BAM files have the necessary BAM tags preserved to work correctly with bamtofastq.
The latest versions of cellranger, cellranger-atac, cellranger-arc, cellranger-dna and longranger generate BAM files from which bamtofastq can automatically reconstruct complete FASTQ files representing all input reads. BAMs produced by older versions of cellranger and longranger have some caveats, listed below:

Package	Version	Pipelines	Extra Arguments	Complete FASTQs
Cell Ranger	1.3+	count	none	Yes
Cell Ranger	1.2	count	none	Reads without a valid barcode will be absent from FASTQ. (These reads are ignored by Cell Ranger)
Cell Ranger	1.0-1.1	count	`--cr11`	Reads without a valid barcode will be absent from FASTQ. (These reads are ignored by Cell Ranger)
Cell Ranger ARC	1.0.0+	count	none	Any sequenced bases in the i5 index read that are not part of the 10x Genomics barcode are dropped from the FASTQ output.
Cell Ranger ATAC	1.0.0+	count	none	Any sequenced bases in the i5 index read that are not part of the 10x Genomics barcode are dropped from the FASTQ output.
Cell Ranger DNA	1.0.0+	cnv	none	Yes
Long Ranger	2.1.3+	wgs, targeted, align, basic	none	Yes
Long Ranger	2.1.0 - 2.1.2	wgs, targeted	none	Yes
Long Ranger	2.0	wgs, targeted	`--lr20`	Yes
Long Ranger	2.0.0 - 2.1.2	align, basic	Not Supported	N/A
Long Ranger	1.3 (GemCode)	wgs, targeted	`--gemcode`	Reads without a valid barcode will be absent from FASTQ. This will result in a ~5-10% loss of coverage.
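One way to catch the missing read group problem described above before starting a long conversion is to inspect the BAM header yourself. The sketch below counts @RG lines in header text read from standard input; the function name count_rg_lines is hypothetical, and the usage line assumes samtools is installed (samtools view -H prints only the header).

```shell
# Sketch: count @RG (read group) lines in SAM header text on stdin.
# ASSUMPTION: count_rg_lines is a hypothetical helper, not part of bamtofastq.
count_rg_lines() {
    tab=$(printf '\t')
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function from reporting failure.
    grep -c "^@RG${tab}" || true
}

# Example call (requires samtools; the path is a placeholder):
#   samtools view -H /path/to/mydata.bam | count_rg_lines
```

A count of 0 suggests the file is an edited BAM rather than the Original Format file, although a non-zero count does not by itself guarantee that every tag bamtofastq needs is present.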
