Cell Ranger 2.1, printed on 08/09/2022
The cellranger vdj pipeline outputs several indexed BAM files. These files are provided primarily for use with a BAM visualization tool such as the Integrative Genomics Viewer (IGV).
| Records | Reference | Description |
|---|---|---|
| Reads | Assembled contigs | Reads aligned to assembled contigs, per cell-barcode. This file demonstrates how the reads and UMIs support the assembled contigs within a cell-barcode. Reads are not aligned across cell-barcode boundaries. Please note that this BAM excludes reads whose barcodes don't match the whitelist, so it is not suitable as an archive of every single input read. |
| Contigs | Clonotype consensus | Each "reference" sequence is a clonotype consensus sequence, and each record is an alignment of a single cell's contig against this consensus. This file shows, for each clonotype consensus sequence, how the constituent per-cell assemblies support the consensus. |
| Contigs | Concatenated germline segments | Each reference sequence is, for each clonotype consensus, the annotated germline segments concatenated together. This file shows how both the per-cell contigs and the clonotype consensus contig relate to the germline reference. This file is expected to reveal polymorphisms, somatic mutations, and recombination-induced differences such as non-templated nucleotide additions. |
The following assumes basic familiarity with the BAM format. More details on the SAM/BAM standard are available online.
Chromium cellular and molecular barcode information for each read is stored as TAG fields:
| Type | Description |
|---|---|
| Z | Chromium cellular barcode sequence that is error-corrected and confirmed against a list of known-good barcode sequences. |
| Z | Chromium cellular barcode sequence as reported by the sequencer. |
| Z | Chromium cellular barcode read quality. Phred scores as reported by sequencer. |
| Z | Chromium molecular barcode sequence that is error-corrected among other molecular barcodes with the same cellular barcode and gene alignment. |
| Z | Chromium molecular barcode sequence as reported by the sequencer. |
| Z | Chromium molecular barcode read quality. Phred scores as reported by sequencer. |
| Z | Sample index read. |
| Z | Sample index read quality. Phred scores as reported by sequencer. |
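As a rough illustration of how such TAG fields appear in the text (SAM) form of a record, the sketch below parses the optional fields that follow the eleven mandatory SAM columns. The record is made up for illustration. The CB tag name is the one used in this document for the cell barcode; UB for the corrected molecular barcode is an assumption based on the usual 10x convention, and `parse_tags` is a hypothetical helper, not part of Cell Ranger.

```python
def parse_tags(sam_line):
    """Return a dict of optional TAG fields from one tab-separated SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns are mandatory; optional fields follow as TAG:TYPE:VALUE.
    for field in fields[11:]:
        # Split at most twice, because quality strings may themselves contain ':'.
        tag, type_code, value = field.split(":", 2)
        # Type "i" is an integer; type "Z" (used for the barcode tags) is a string.
        tags[tag] = int(value) if type_code == "i" else value
    return tags


# Made-up record for illustration only; UB is an assumed tag name.
example = "\t".join([
    "read1", "0", "contig_1", "1", "60", "10=", "*", "0", "0",
    "ACGTACGTAC", "FFFFFFFFFF",
    "CB:Z:AAACCTGAGAAACCAT-1", "UB:Z:ACGTACGTAC",
])
tags = parse_tags(example)
print(tags["CB"])
```

For real BAM files, a BAM-aware library or `samtools view` would normally be used to produce or read these records; the parsing above only shows the field layout.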
The cell barcode CB tag includes a suffix consisting of a dash separator followed by a number, for example -1.
This number denotes what we call a GEM group, and is used to virtualize barcodes in order to achieve a higher effective barcode diversity when combining samples generated from separate GEM chip channel runs. Normally, this number will be "1" across all barcodes when analyzing a sample generated from a single GEM chip channel. It can either be left in place and treated as part of a unique barcode identifier, or explicitly parsed out to leave only the barcode sequence itself.
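If the suffix is to be parsed out, one possible approach is sketched below. The function name `split_barcode` and the example barcode are hypothetical; the only assumption about the format is the one stated above, namely a barcode sequence, a dash, and a numeric GEM group.

```python
def split_barcode(cb):
    """Split a CB value such as 'AAACCTGAGAAACCAT-1' into (sequence, gem_group)."""
    # Split on the last dash so that only the trailing suffix is removed.
    sequence, sep, suffix = cb.rpartition("-")
    if not sep or not suffix.isdigit():
        raise ValueError(f"no GEM group suffix in {cb!r}")
    return sequence, int(suffix)


# Made-up barcode for illustration.
sequence, gem_group = split_barcode("AAACCTGAGAAACCAT-1")
print(sequence, gem_group)
```

Keeping the full string as the identifier is the simpler choice when samples from several GEM chip channels are combined, since the same barcode sequence can then occur in more than one GEM group.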
The cellranger vdj pipeline uses the = and X CIGAR string operations to indicate matches and mismatches, respectively. This contrasts with most aligners, which simply report M for both matches and mismatches. The SAM/BAM standard supports both CIGAR formats. For more details please refer to the SAM/BAM standard.
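To make the difference between the two CIGAR styles concrete, here is a minimal sketch that counts matched and mismatched bases in a CIGAR string using = and X, and collapses such a string into the M style for tools that expect it. The helper names and the example CIGAR string are hypothetical; the operation letters are those defined by the SAM standard.

```python
import re

# One CIGAR element: a length followed by an operation letter from the SAM standard.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_matches(cigar):
    """Return (matches, mismatches) summed over the = and X operations."""
    counts = {"=": 0, "X": 0}
    for length, op in CIGAR_OP.findall(cigar):
        if op in counts:
            counts[op] += int(length)
    return counts["="], counts["X"]


def to_m_cigar(cigar):
    """Rewrite = and X as M, merging adjacent runs into a single element."""
    merged = []
    for length, op in CIGAR_OP.findall(cigar):
        op = "M" if op in ("=", "X") else op
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)


# Made-up CIGAR string for illustration.
print(count_matches("5S20=1X30=2I10="))
print(to_m_cigar("5S20=1X30=2I10="))
```

Note that the conversion to M is lossy: the positions of mismatches can no longer be read from the CIGAR string afterwards.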