This is documentation for the GemCode System.

10x Genomics
GemCode Genome & Exome

Barcoded BAM

The longranger run pipeline outputs an indexed BAM file containing position-sorted, aligned reads. Each read in this BAM file has GemCode barcode and phasing information attached. The following assumes basic familiarity with the BAM format. More details on the SAM/BAM standard are available online.

BAM Barcode Tags

GemCode barcode information for each read is stored as TAG fields:

Tag  Type  Description
BX   Z     GemCode barcode sequence that is error-corrected and confirmed against a list of known-good barcode sequences. Use this for analysis.
BC   Z     Sample index (I7) read.
QT   Z     Sample index (I7) read quality. Phred scores as reported by sequencer.
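
As a rough sketch of how these tags can be read, the Python below parses the optional fields of one SAM-format text line (such as a line printed by a BAM-to-SAM viewer) into a dictionary. The function name parse_tags and the example record, including its barcode and sample index values, are made up for illustration and are not real longranger output.

```python
def parse_tags(sam_line):
    """Return {tag: value} for the optional fields of one SAM text line."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns are mandatory; optional TAG:TYPE:VALUE fields follow.
    for field in fields[11:]:
        # Split at most twice so values containing ':' (e.g. qualities) survive.
        tag, typ, value = field.split(":", 2)
        # Type 'i' is an integer; other types are kept as strings here.
        tags[tag] = int(value) if typ == "i" else value
    return tags


# Hypothetical record: every value below is invented for illustration.
example = "\t".join([
    "read1", "0", "chr1", "100", "60", "4M", "*", "0", "0",
    "ACGT", "IIII",
    "BX:Z:ACGTACGTACGTAC-1", "BC:Z:ACGTACGT", "QT:Z:IIIIIIII",
])
tags = parse_tags(example)
print(tags["BX"], tags["BC"], tags["QT"])
```

Reads without a confirmed barcode may lack the BX tag, so looking tags up with tags.get("BX") is a safer pattern than direct indexing in real analyses.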

The BX tag includes a suffix consisting of a dash separator followed by a number.

This number denotes what we call a GEM group, and is used to virtualize barcodes in order to achieve a higher effective barcode diversity when combining samples generated from separate GEM chip channel runs. Normally, this number will be "1" across all barcodes when analyzing a sample generated from a single GEM chip channel. It can either be left in place and treated as part of a unique barcode identifier, or explicitly parsed out to leave only the barcode sequence itself.
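
The second option, parsing out the GEM group, might look like the following sketch. The helper name split_barcode and the barcode value are hypothetical; the only assumption taken from the text above is that the suffix is a dash followed by a number.

```python
def split_barcode(bx_value):
    """Split a BX value into (barcode_sequence, gem_group).

    Returns gem_group as None if the value has no dash suffix.
    """
    sequence, sep, suffix = bx_value.rpartition("-")
    if not sep:
        # No dash present: treat the whole value as the barcode sequence.
        return bx_value, None
    return sequence, int(suffix)


# Hypothetical BX value, invented for illustration.
sequence, gem_group = split_barcode("ACGTACGTACGTAC-1")
print(sequence, gem_group)
```

When samples from several GEM chip channels are combined, keeping the full BX value (sequence plus suffix) as the key avoids merging reads that share a barcode sequence but come from different GEM groups.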

BAM Phasing Tags

The following tags will also be present on reads that were confidently assigned to a haplotype.

Tag  Type  Description
PS   Z     Phase set containing this read
HP   i     Haplotype of the molecule that generated the read
MI   i     Global molecule identifier for molecule that generated this read

Phase sets, defined in the VCF standard, are regions within which identified haplotypes are mutually consistent. As a result, HP tags are only comparable between reads that share a common PS. By definition, adjacent phase sets lack sufficient Linked-Reads to determine the relationship between their haplotypes.
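
The rule that HP values are only comparable within a shared PS can be sketched as a small check. The function name same_haplotype is hypothetical, and the tag dictionaries and their values are invented; each dictionary stands for the tags of one read, however they were extracted.

```python
def same_haplotype(tags_a, tags_b):
    """Compare the haplotypes of two reads given their tag dictionaries.

    Returns True or False when both reads are phased in the same phase set,
    and None when the comparison is undefined (a read is unphased, or the
    two reads belong to different phase sets).
    """
    for tags in (tags_a, tags_b):
        if "PS" not in tags or "HP" not in tags:
            return None  # read was not confidently assigned to a haplotype
    if tags_a["PS"] != tags_b["PS"]:
        return None  # HP values from different phase sets are not comparable
    return tags_a["HP"] == tags_b["HP"]


# Hypothetical tag values, invented for illustration.
read_1 = {"PS": "1000", "HP": 1}
read_2 = {"PS": "1000", "HP": 2}
read_3 = {"PS": "52000", "HP": 1}
print(same_haplotype(read_1, read_2))
print(same_haplotype(read_1, read_3))
```

Returning None rather than False for the undefined cases keeps "different haplotype" distinct from "cannot be determined", which matters at phase set boundaries.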