Cell Ranger 2.0, printed on 11/21/2024
The cellranger vdj pipeline outputs several indexed BAM files. These files are primarily provided for use with a BAM visualization tool such as the Integrative Genomics Viewer (IGV).
File | Records | Reference | Description |
---|---|---|---|
all_contig.bam | Reads | Assembled contigs | Reads aligned to assembled contigs, per cell-barcode. This file demonstrates how the reads and UMIs support the assembled contigs within a cell-barcode. Reads are not aligned across cell-barcode boundaries. Please note that this BAM excludes reads whose barcodes don't match the whitelist, so it is not suitable as an archive of every single input read. |
consensus.bam | Contigs | Clonotype consensus | Each "reference" sequence is a clonotype consensus sequence, and each record is an alignment of a single cell's contig against this consensus. This file shows, for a clonotype consensus sequence, how the constituent per-cell assemblies support the consensus. |
concat_ref.bam | Contigs | Concatenated germline segments | Each reference sequence is, for each clonotype consensus, the annotated germline segments concatenated together. This file shows how both the per-cell contigs and the clonotype consensus contig relate to the germline reference. This file is expected to reveal polymorphisms, somatic mutations, and recombination-induced differences such as non-templated nucleotide additions. |
The following assumes basic familiarity with the BAM format. More details on the SAM/BAM standard are available online.
Chromium cellular and molecular barcode information for each read is stored as TAG fields:
Tag | Type | Description |
---|---|---|
CB | Z | Chromium cellular barcode sequence that is error-corrected and confirmed against a list of known-good barcode sequences. |
CR | Z | Chromium cellular barcode sequence as reported by the sequencer. |
CY | Z | Chromium cellular barcode read quality. Phred scores as reported by sequencer. |
UB | Z | Chromium molecular barcode sequence that is error-corrected among other molecular barcodes with the same cellular barcode and gene alignment. |
UR | Z | Chromium molecular barcode sequence as reported by the sequencer. |
UY | Z | Chromium molecular barcode read quality. Phred scores as reported by sequencer. |
BC | Z | Sample index read. |
QT | Z | Sample index read quality. Phred scores as reported by sequencer. |
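As a rough illustration of how these tags appear in practice, the sketch below pulls the optional TAG fields out of a single SAM-format text record (for example, one line of `samtools view` output). The record shown is hypothetical and made up for this example; the helper name `parse_sam_tags` is likewise an assumption, not part of Cell Ranger.

```python
def parse_sam_tags(sam_line):
    """Return a dict of optional TAG fields from one tab-separated SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns are mandatory; optional fields follow as TAG:TYPE:VALUE.
    for field in fields[11:]:
        # Split at most twice so values that contain ':' (e.g. quality strings) stay intact.
        tag, value_type, value = field.split(":", 2)
        tags[tag] = int(value) if value_type == "i" else value
    return tags


# Hypothetical record, for illustration only (not real pipeline output).
example_record = "\t".join([
    "read1", "0", "contig_1", "1", "60", "10=", "*", "0", "0",
    "ACGTACGTAC", "IIIIIIIIII",
    "CB:Z:AGAATGGTCTGCAT-1", "CR:Z:AGAATGGTCTGCAT",
    "UB:Z:ACGTACGTAC", "UR:Z:ACGTACGTAC",
])

tags = parse_sam_tags(example_record)
print(tags["CB"], tags["UB"])
```

For real BAM files, a dedicated library is usually a better choice than parsing text by hand; this sketch only shows the field layout.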
The cell barcode CB tag includes a suffix with a dash separator followed by a number:
AGAATGGTCTGCAT-1
This number denotes what we call a GEM group, and is used to virtualize barcodes in order to achieve a higher effective barcode diversity when combining samples generated from separate GEM chip channel runs. Normally, this number will be "1" across all barcodes when analyzing a sample generated from a single GEM chip channel. It can either be left in place and treated as part of a unique barcode identifier, or explicitly parsed out to leave only the barcode sequence itself.
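If the GEM group needs to be parsed out, a minimal sketch could look like the following. The function name `split_barcode` is hypothetical, and the code assumes the suffix is always a dash followed by an integer, as in the example above.

```python
def split_barcode(cb_tag):
    """Split a CB value such as 'AGAATGGTCTGCAT-1' into (sequence, gem_group)."""
    # rpartition splits on the last dash, so only the trailing suffix is removed.
    sequence, separator, gem_group = cb_tag.rpartition("-")
    if not separator:
        # No dash present: treat the whole value as the barcode sequence.
        return cb_tag, None
    return sequence, int(gem_group)


sequence, gem_group = split_barcode("AGAATGGTCTGCAT-1")
print(sequence, gem_group)
```

Keeping the suffix attached is the simpler option when barcodes from several GEM chip channels are combined, since the full string is then already a unique identifier.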
The cellranger vdj pipeline uses the = and X CIGAR string operations to indicate matches and mismatches, respectively. This contrasts with most aligners, which simply report M for match/mismatch. The SAM/BAM standard supports both CIGAR formats. For more details please refer to the SAM/BAM standard.
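Because matches and mismatches are reported separately, they can be tallied directly from the CIGAR string. The sketch below is one way to do so; the function name `count_matches_mismatches` and the example CIGAR string are hypothetical and chosen only for illustration.

```python
import re

# One CIGAR element is a length followed by an operation code from the SAM specification.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def count_matches_mismatches(cigar):
    """Return (matches, mismatches) summed over the '=' and 'X' operations."""
    matches = 0
    mismatches = 0
    for length, operation in CIGAR_ELEMENT.findall(cigar):
        if operation == "=":
            matches += int(length)
        elif operation == "X":
            mismatches += int(length)
        # Other operations (soft clips, insertions, deletions, ...) are ignored here.
    return matches, mismatches


# Hypothetical CIGAR: 5 soft-clipped bases, 20 matches, 1 mismatch, 30 matches,
# a 2-base insertion, then 10 matches.
matches, mismatches = count_matches_mismatches("5S20=1X30=2I10=")
print(matches, mismatches)
```

A CIGAR string that uses only M would yield zero for both counts with this sketch, since M does not distinguish matches from mismatches.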