Space Ranger 1.0 (latest), printed on 05/26/2020
The spaceranger pipeline outputs an indexed BAM file containing position-sorted reads aligned to the genome and transcriptome. Reads aligned to the transcriptome across exon junctions in the genome have a large gap in their CIGAR string, e.g. 35M225N64M. Each read in this BAM file has Visium cellular and molecular barcode information attached. Space Ranger modifies MAPQ values; see the MM tag below. The following assumes basic familiarity with the BAM format. More details on the SAM/BAM standard are available online.
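To make the junction gap concrete, the sketch below splits a CIGAR string into its operations and reports the skipped reference regions (the `N` operations). It uses only the Python standard library; the function names `parse_cigar`, `skipped_regions`, and `reference_span` are illustrative and not part of Space Ranger.

```python
import re

# One CIGAR operation is a length followed by an operation letter (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject input containing characters the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def skipped_regions(cigar):
    """Return the lengths of all N (skipped reference) operations."""
    return [n for n, op in parse_cigar(cigar) if op == "N"]


def reference_span(cigar):
    """Count reference bases covered by the alignment (M, D, N, =, X)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")


print(parse_cigar("35M225N64M"))      # [(35, 'M'), (225, 'N'), (64, 'M')]
print(skipped_regions("35M225N64M"))  # [225]
print(reference_span("35M225N64M"))   # 324
```

For the example above, the read has 35 and 64 aligned bases on either side of a 225-base skipped region, which is how a spliced alignment across an intron appears in genome coordinates.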
Visium spot and molecular barcode information for each read is stored as TAG fields:
|Tag|Type|Description|
|---|---|---|
|CB|Z|Visium spot barcode sequence that is error-corrected and confirmed against a list of known-good barcode sequences.|
|CR|Z|Visium spot barcode sequence as reported by the sequencer.|
|CY|Z|Visium spot barcode read quality. Phred scores as reported by sequencer.|
|UB|Z|Visium molecular barcode sequence that is error-corrected among other molecular barcodes with the same spot barcode and gene alignment.|
|UR|Z|Visium molecular barcode sequence as reported by the sequencer.|
|UY|Z|Visium molecular barcode read quality. Phred scores as reported by sequencer.|
|BC|Z|Sample index read.|
|QT|Z|Sample index read quality. Phred scores as reported by sequencer.|
|TR|Z|Trimmed sequence. Trailing sequence (if any) following the cell and molecular barcodes on Read 1.|
|xf|i|Extra alignment flags. The bit flags can be interpreted as follows: 1 - The read is confidently mapped to a feature; 2 - The read maps to a feature that the majority of other reads with this UMI did not; 8 - This read is representative for the molecule and can be treated as a UMI count. Bits 4, 16 and 32 are used internally by 10X.|
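
The extra alignment flags in the last row are a bit field, so individual meanings are recovered with bitwise AND. The following is a minimal sketch that assumes the integer value has already been read from the BAM record (in 10x BAM files this tag is assumed to be named xf); the constant and key names are illustrative, and the internal bits 4, 16 and 32 are deliberately ignored.

```python
# Bit values taken from the flag description above; the names are illustrative.
CONFIDENTLY_MAPPED = 1    # read is confidently mapped to a feature
MINORITY_FEATURE = 2      # read maps to a feature most reads with this UMI did not
UMI_REPRESENTATIVE = 8    # read is representative and counts as a UMI


def decode_extra_flags(value):
    """Decode the documented bits of the extra alignment flags integer."""
    return {
        "confidently_mapped": bool(value & CONFIDENTLY_MAPPED),
        "minority_feature_for_umi": bool(value & MINORITY_FEATURE),
        "umi_representative": bool(value & UMI_REPRESENTATIVE),
    }


# 25 = 16 + 8 + 1: an internal bit plus the two documented bits 8 and 1.
print(decode_extra_flags(25))
# {'confidently_mapped': True, 'minority_feature_for_umi': False, 'umi_representative': True}
```

Under this reading, counting reads with bit 8 set would give one count per molecule, although the source does not spell out a counting procedure beyond the flag description.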
The spot barcode CB tag includes a suffix with a dash separator followed by a number. This number will always be one (1) in the current Space Ranger output.
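As a small illustration of the suffix convention, the sketch below separates a CB value into its barcode sequence and numeric suffix. The helper name `split_spot_barcode` and the barcode string are made up for the example and do not come from a real dataset.

```python
def split_spot_barcode(cb):
    """Split a CB value of the form <sequence>-<number> into its two parts."""
    sequence, sep, suffix = cb.rpartition("-")
    # rpartition returns an empty separator when no dash is present.
    if not sep or not sequence or not suffix.isdigit():
        raise ValueError(f"unexpected CB value: {cb!r}")
    return sequence, int(suffix)


# Hypothetical barcode, for illustration only.
sequence, number = split_spot_barcode("ACGTACGTACGTACGT-1")
print(sequence)  # ACGTACGTACGTACGT
print(number)    # 1
```

Splitting on the last dash keeps the helper tolerant of any future suffix value, even though the current output always uses 1.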
The following tags will also be present on reads that mapped to the genome and overlapped an exon by at least one base pair. A read may align to multiple transcripts and genes, but it is only considered confidently mapped to the transcriptome if it mapped to a single gene.
|Tag|Type|Description|
|---|---|---|
|TX|Z|Present in reads aligned to the same strand as the transcripts in this semicolon-separated list that are compatible with this alignment. Transcripts are specified with the transcript_id key in the reference GTF attribute column.|
|AN|Z|Same as the TX tag, but for reads that are aligned to the antisense strand of annotated transcripts.|
|GX|Z|Semicolon-separated list of gene IDs that are compatible with this alignment. Gene IDs are specified with the gene_id key in the reference GTF attribute column.|
|GN|Z|Semicolon-separated list of gene names that are compatible with this alignment. Gene names are specified with the gene_name key in the reference GTF attribute column.|
|MM|i|Set to 1 if the genome aligner (STAR) originally gave a MAPQ < 255 (it multi-mapped to the genome) and Space Ranger changed it to 255 because the read overlapped exactly one gene.|
|RE|A|Single character indicating the region type of this alignment (E = exonic, N = intronic, I = intergenic).|
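
The annotation tags can be read from the optional fields of a SAM text record, which use the standard TAG:TYPE:VALUE layout. The sketch below is a simplified illustration using only the standard library. It assumes the gene ID list, gene name list, and region type are stored under the tags GX, GN, and RE, and that the MAPQ adjustment is recorded under MM. The field values are invented placeholders, and the single-gene check is a simplified reading of the rule described above, not Space Ranger's actual logic.

```python
REGION_NAMES = {"E": "exonic", "N": "intronic", "I": "intergenic"}


def parse_optional_fields(fields):
    """Convert SAM optional fields (TAG:TYPE:VALUE) into a dict."""
    tags = {}
    for field in fields:
        # Split at most twice so values containing ':' stay intact.
        tag, type_code, value = field.split(":", 2)
        tags[tag] = int(value) if type_code == "i" else value
    return tags


def split_list(value):
    """Split a semicolon-separated tag value, dropping empty entries."""
    return [item for item in value.split(";") if item]


# Invented optional fields for one read; not taken from real output.
fields = ["GX:Z:GENE_ID_A", "GN:Z:GENE_NAME_A", "RE:A:E", "MM:i:1"]
tags = parse_optional_fields(fields)

gene_ids = split_list(tags.get("GX", ""))
maps_to_single_gene = len(gene_ids) == 1      # simplified single-gene check
mapq_was_adjusted = tags.get("MM", 0) == 1    # STAR MAPQ < 255, raised to 255
region = REGION_NAMES.get(tags.get("RE"), "unknown")

print(gene_ids)             # ['GENE_ID_A']
print(maps_to_single_gene)  # True
print(mapq_was_adjusted)    # True
print(region)               # exonic
```

For BAM files on disk, a library such as pysam exposes the same tags per record; the plain-text parsing here just keeps the example free of external dependencies.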