Chromium Single Cell Immune Profiling

Barcoded BAMs

File formats and descriptions

The cellranger vdj pipeline outputs several indexed BAM files. These files are provided primarily for use with a BAM visualization tool such as the Integrative Genomics Viewer (IGV).

| File | Records | Reference | Description |
|---|---|---|---|
| all_contig.bam | Reads | Assembled contigs | Please note that this file is not an archive of every single input read. The all_contig.bam serves as a starting point to demonstrate how reads and UMIs support the assembled contigs within a cell barcode. It contains 80,000 randomly subsampled reads per barcode, and due to subsampling, not all valid barcodes are retained. Among these subsampled reads, some may not map to assembled contigs. Reads are not aligned across cell barcode boundaries. Reads with barcodes that do not match the barcode whitelist are also excluded. |
| all_contig.bam.bai | Index | | Companion file to all_contig.bam that serves as an external index. |
| consensus.bam | Contigs | Clonotype consensus | Each "reference" sequence is a clonotype consensus sequence, and each record is an alignment of a single cell's contig against this consensus. This file shows, for each clonotype consensus sequence, how the constituent per-cell assemblies support the consensus. |
| consensus.bam.bai | Index | | Companion file to consensus.bam that serves as an external index. |
| concat_ref.bam | Contigs | Concatenated germline segments | For each clonotype consensus, the reference sequence is the annotated germline segments concatenated together. This file shows how both the per-cell contigs and the clonotype consensus contig relate to the germline reference. It is useful for revealing polymorphisms, somatic mutations, and recombination-induced differences such as non-templated nucleotide additions. |
| concat_ref.bam.bai | Index | | Companion file to concat_ref.bam that serves as an external index. |

The following sections require some familiarity with the SAM/BAM format.

BAM barcode tags

Chromium cell barcode and UMI information for each read is stored as TAG fields:

| Tag | Type | Description |
|---|---|---|
| CB | Z | Chromium cell barcode sequence that is error-corrected and confirmed against a list of known-good barcode sequences. |
| CR | Z | Chromium cell barcode sequence as reported by the sequencer. |
| CY | Z | Chromium cell barcode read quality. Phred scores as reported by sequencer. |
| UB | Z | Chromium UMI sequence that is error-corrected among other UMIs with the same cell barcode and gene alignment. |
| UR | Z | Chromium UMI sequence as reported by the sequencer. |
| UY | Z | Chromium UMI read quality. Phred scores as reported by sequencer. |
| BC | Z | Sample index read. |
| QT | Z | Sample index read quality. Phred scores as reported by sequencer. |
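
As a rough illustration of how these tags appear in practice, the sketch below pulls the optional TAG fields out of a single SAM text record (for example, one line of text produced by converting a BAM file to SAM). It uses only the Python standard library. The function name `parse_sam_tags` and the example record are hypothetical and made up for illustration; they are not part of Cell Ranger or of its output.

```python
def parse_sam_tags(sam_line):
    """Return the optional TAG fields of one SAM text record as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns are the mandatory SAM fields; optional
    # TAG:TYPE:VALUE fields follow them.
    for field in fields[11:]:
        # Split at most twice, because quality strings may contain ':'.
        tag, tag_type, value = field.split(":", 2)
        tags[tag] = int(value) if tag_type == "i" else value
    return tags


# Made-up record for illustration only (not real pipeline output).
example = "\t".join([
    "read1", "0", "contig_1", "1", "60", "10=", "*", "0", "0",
    "ACGTACGTAC", "FFFFFFFFFF",
    "CB:Z:AGAATGGTCTGCAT-1", "CR:Z:AGAATGGTCTGCAT",
    "UB:Z:ACGTACGTAC", "NM:i:0",
])

tags = parse_sam_tags(example)
print(tags["CB"], tags["UB"])
```

Reads whose barcode could not be corrected may lack some of these tags, so downstream code should probably look them up with `tags.get("CB")` rather than assume they are present.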

The cell barcode CB tag includes a suffix with a dash separator followed by a number:

AGAATGGTCTGCAT-1

This number denotes the GEM group and is used to virtualize barcodes in order to achieve a higher effective barcode diversity when combining samples generated from separate GEM well runs. For samples run through a single GEM chip, the hyphenated number should be "1" across all barcodes. It can either be ignored (treated as part of a unique barcode identifier) or explicitly parsed out for downstream analysis.
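
If the suffix needs to be parsed out, one possible approach is to split the CB value at its last dash. The helper below is a minimal sketch; the name `split_barcode` is hypothetical, and returning `None` for a barcode without a suffix is an assumption made here for convenience.

```python
def split_barcode(cb):
    """Split a CB value such as 'AGAATGGTCTGCAT-1' into (sequence, GEM group)."""
    sequence, dash, gem_group = cb.rpartition("-")
    if not dash:
        # No suffix present: return the barcode unchanged, with no GEM group.
        return cb, None
    return sequence, int(gem_group)


sequence, gem_group = split_barcode("AGAATGGTCTGCAT-1")
print(sequence, gem_group)
```

Keeping the full `AGAATGGTCTGCAT-1` string as the identifier is usually the simpler choice when no per-GEM-group analysis is needed.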

BAM CIGAR string

The cellranger vdj pipeline uses the = and X CIGAR string operations to indicate matches and mismatches, respectively. This contrasts with most aligners, which simply report M for both matches and mismatches. The SAM/BAM standard supports both CIGAR formats. For more details please refer to the SAM/BAM standard.
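
To make the difference concrete, the sketch below counts the aligned bases per CIGAR operation and collapses = and X into M for tools that expect the more common form. It relies only on the Python standard library. The function names and the example CIGAR string are hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by a single operation character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_op_lengths(cigar):
    """Sum the lengths of each CIGAR operation, e.g. total '=' and 'X' bases."""
    counts = Counter()
    for length, op in CIGAR_OP.findall(cigar):
        counts[op] += int(length)
    return counts


def to_m_cigar(cigar):
    """Rewrite '=' and 'X' as 'M', merging adjacent runs of the same operation."""
    merged = []
    for length, op in CIGAR_OP.findall(cigar):
        op = "M" if op in "=X" else op
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)


example = "5S20=1X30=2I10="  # made-up CIGAR string
counts = cigar_op_lengths(example)
print(counts["="], counts["X"])  # 60 1
print(to_m_cigar(example))       # 5S51M2I10M
```

Note that the conversion to M discards information: the mismatch count can no longer be read from the CIGAR string alone.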

Next steps