
# Barcoded BAM

The spaceranger count pipeline outputs an indexed BAM file containing position-sorted reads aligned to the genome and transcriptome. Reads aligned to the transcriptome across exon junctions in the genome have a large gap in their CIGAR string, such as 35M225N64M. Each read in this BAM file has Visium spot and molecular barcode information attached. Space Ranger modifies MAPQ values; see the MM tag below. The following assumes basic familiarity with the BAM format. More details on the SAM/BAM standard are available online.
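As an illustration of the CIGAR gap mentioned above, the following minimal sketch sums the `N` (skipped region) operations of a CIGAR string to estimate how much of the reference a spliced read jumps over. The function names `parse_cigar`, `spliced_gap_length`, and `reference_span` are hypothetical helpers written for this example and are not part of Space Ranger.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    # Reject strings that contain anything other than valid operations.
    if "".join(f"{length}{op}" for length, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def spliced_gap_length(cigar):
    """Total number of reference bases skipped by N operations."""
    return sum(length for length, op in parse_cigar(cigar) if op == "N")


def reference_span(cigar):
    """Number of reference bases covered, including skipped regions."""
    return sum(length for length, op in parse_cigar(cigar) if op in "MDN=X")


if __name__ == "__main__":
    example = "35M225N64M"
    print(spliced_gap_length(example), reference_span(example))
```

For the example string 35M225N64M, the read aligns 35 bases, skips 225 reference bases (typically an intron), and aligns a further 64 bases.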

If the BAM files are not required, you can instruct spaceranger count to skip BAM file creation by providing the --no-bam option. See the count documentation for more information.

## Visium Spatial Gene Expression for FFPE

Some aspects of the BAM file are particular to formalin-fixed paraffin-embedded (FFPE) samples.

A read that multimaps to the reference transcriptome, but maps uniquely to a single probe, is confidently mapped to that probe and its gene. The mapping quality has the following meaning:

| MAPQ | Description |
|------|-------------|
| 255 | Both read halves map to the same probe. |
| 3 | Each read half maps to a different probe. |
| 1 | One read half maps to a probe and the other half does not. |
| 0 | Neither read half maps to a probe. |

The BAM tag pr:Z reports a semicolon-separated list of probe IDs. See below for a detailed description.
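The FFPE MAPQ values and the pr tag can be interpreted together. The sketch below is one possible way to do so in plain Python. The helper names and the probe IDs used in the example are hypothetical, and the assumption that a non-aligning read half appears as the literal string NA in the semicolon-separated list is inferred from the tag description rather than confirmed.

```python
# Meaning of MAPQ values for Visium FFPE reads, as listed in the table above.
FFPE_MAPQ_MEANING = {
    255: "Both read halves map to the same probe.",
    3: "Each read half maps to a different probe.",
    1: "One read half maps to a probe and the other half does not.",
    0: "Neither read half maps to a probe.",
}


def describe_ffpe_mapq(mapq):
    """Return the documented meaning of an FFPE MAPQ value."""
    return FFPE_MAPQ_MEANING.get(mapq, "Not a documented FFPE MAPQ value.")


def parse_probe_ids(pr_value):
    """Split a pr:Z value into probe IDs.

    Assumption: a read half without a probe alignment is written as 'NA';
    it is returned here as None.
    """
    return [None if item == "NA" else item for item in pr_value.split(";")]


if __name__ == "__main__":
    # PROBE_A and PROBE_B are made-up IDs for illustration only.
    print(describe_ffpe_mapq(3), parse_probe_ids("PROBE_A;PROBE_B"))
    print(describe_ffpe_mapq(1), parse_probe_ids("PROBE_A;NA"))
```

With a BAM library such as pysam, the inputs would typically come from `read.mapping_quality` and `read.get_tag("pr")`.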

## BAM Barcode Tags

Visium spot and molecular barcode information for each read is stored as TAG fields:

| Tag | Type | Description |
|-----|------|-------------|
| CB | Z | Visium spot barcode sequence that is error-corrected and confirmed against a list of known-good barcode sequences. |
| CR | Z | Visium spot barcode sequence as reported by the sequencer. |
| CY | Z | Visium spot barcode read quality. Phred scores as reported by sequencer. |
| UB | Z | Visium molecular barcode sequence that is error-corrected among other molecular barcodes with the same spot barcode and gene alignment. |
| UR | Z | Visium molecular barcode sequence as reported by the sequencer. |
| UY | Z | Visium molecular barcode read quality. Phred scores as reported by sequencer. |
| BC | Z | Sample index read. |
| QT | Z | Sample index read quality. Phred scores as reported by sequencer. |
| TR | Z | Trimmed sequence. Trailing sequence (if any) following the spot and molecular barcodes on Read 1. |

The spot barcode CB tag includes a suffix with a dash separator followed by a number:

AGAATGGTCTGCAT-1

This number will always be one (1) in the current Space Ranger output.
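When comparing CB values against a plain list of barcode sequences, it can help to separate the sequence from the numeric suffix. The following is a minimal sketch of that step; `split_spot_barcode` is a hypothetical helper name, not a Space Ranger function.

```python
def split_spot_barcode(cb_value):
    """Split a CB tag value such as 'AGAATGGTCTGCAT-1' into (sequence, suffix)."""
    sequence, separator, suffix = cb_value.rpartition("-")
    # A valid value has a dash followed by an integer suffix.
    if not separator or not sequence or not suffix.isdigit():
        raise ValueError(f"unexpected CB tag value: {cb_value!r}")
    return sequence, int(suffix)


if __name__ == "__main__":
    sequence, suffix = split_spot_barcode("AGAATGGTCTGCAT-1")
    print(sequence, suffix)
```

Using `rpartition` splits on the last dash only, so the suffix is extracted even if the handling of the sequence portion changes.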

## BAM Alignment Tags

The following tags are also present on reads that mapped to the genome and overlapped an exon by at least one base pair. A read may align to multiple transcripts and genes, but it is only considered confidently mapped to the transcriptome if it mapped to a single gene.

| Tag | Type | Description |
|-----|------|-------------|
| TX | Z | Present in reads aligned to the same strand as the transcripts in this semicolon-separated list that are compatible with this alignment. Transcripts are specified with the transcript_id key in the reference GTF attribute column. The format of each entry is [transcript_id],[strand][pos],[cigar], where strand is either + or -, pos is the alignment offset in transcript coordinates, and cigar is the CIGAR string in transcript coordinates. |
| AN | Z | Same as the TX tag, but for reads that are aligned to the antisense strand of annotated transcripts. |
| GX | Z | Semicolon-separated list of gene IDs that are compatible with this alignment. Gene IDs are specified with the gene_id key in the reference GTF attribute column. |
| GN | Z | Semicolon-separated list of gene names that are compatible with this alignment. Gene names are specified with the gene_name key in the reference GTF attribute column. |
| MM | i | Set to 1 if the genome aligner (STAR) originally gave a MAPQ < 255 (the read multi-mapped to the genome) and Space Ranger changed it to 255 because the read overlapped exactly one gene. |
| RE | A | Single character indicating the region type of this alignment (E = exonic, N = intronic, I = intergenic). |
| pr | Z | For Visium FFPE, a semicolon-separated list of probe IDs: one probe ID if both read halves align to the same probe, and two probe IDs if each read half aligns to a different probe, or NA if a read half does not align to a probe. |
| pa | i | The number of poly-A nucleotides trimmed from the 3' end of read 2. Up to 10% mismatches are permitted. |
| ts | i | The number of template switch oligo (TSO) nucleotides trimmed from the 5' end of read 2. Up to 3 mismatches are permitted. The 30-bp TSO sequence is AAGCAGTGGTATCAACGCAGAGTACATGGG. |
| xf | i | Extra alignment flags. The bits of this tag are interpreted as listed after this table. |

Bits of the xf tag:

- 1: The read is confidently mapped to a feature.
- 2: The read maps to a feature that the majority of other reads with this UMI did not.
- 4: This read pair maps to a discordant pair of genes, and is not treated as a UMI count.
- 8: This read is representative for the molecule and can be treated as a UMI count.
- 16: This read maps to exactly one feature, and is identical to bit 8 for transcriptomic reads. Notably, this bit is set when a feature barcode read is treated as a UMI count, while bit 8 is not.
- 32: This read was not analyzed due to high sequencing depth and subsampling for Targeted Spatial Gene Expression.
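The xf bit field and the TX entry format both lend themselves to a small decoding helper. The sketch below is an illustration only: the flag names in `XfFlag` are labels chosen for this example (the source defines only the bit values and their meanings), the transcript IDs in the usage example are placeholders, and the parser assumes that the strand, position, and CIGAR fields never contain commas.

```python
from enum import IntFlag


class XfFlag(IntFlag):
    """Bits of the xf tag. Member names are illustrative, not official."""
    CONF_MAPPED = 1           # confidently mapped to a feature
    LOW_SUPPORT_UMI = 2       # feature differs from the majority for this UMI
    GENE_DISCORDANT = 4       # read pair maps to a discordant pair of genes
    UMI_COUNT = 8             # representative read, counted as a UMI
    CONF_FEATURE = 16         # maps to exactly one feature
    FILTERED_TARGET_UMI = 32  # not analyzed due to targeted subsampling


def decode_xf(value):
    """Return the names of all bits set in an xf value."""
    return [flag.name for flag in XfFlag if value & flag]


def is_umi_count(value):
    """True if bit 8 is set, i.e. the read can be treated as a UMI count."""
    return bool(value & XfFlag.UMI_COUNT)


def parse_tx_entries(tx_value):
    """Parse a TX or AN value into (transcript_id, strand, pos, cigar) tuples."""
    entries = []
    for entry in tx_value.split(";"):
        if not entry:
            continue
        # Split from the right so only the last two commas are field separators.
        transcript_id, strand_pos, cigar = entry.rsplit(",", 2)
        strand, pos = strand_pos[0], int(strand_pos[1:])
        if strand not in "+-":
            raise ValueError(f"unexpected strand in TX entry: {entry!r}")
        entries.append((transcript_id, strand, pos, cigar))
    return entries


if __name__ == "__main__":
    print(decode_xf(25), is_umi_count(25))
    # TRANSCRIPT_1 and TRANSCRIPT_2 are placeholder IDs.
    print(parse_tx_entries("TRANSCRIPT_1,+1021,91M;TRANSCRIPT_2,+530,91M"))
```

A common use is to keep only reads for which `is_umi_count` returns True when recomputing per-spot counts from the BAM file, although whether this reproduces the pipeline's own counts exactly is not stated in this section.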