Cell Ranger ARC 1.0, printed on 12/21/2024
Cell Ranger ARC analyzes data generated by the Chromium Single Cell Multiome ATAC + Gene Expression assay to compute sequencing quality and application result metrics for each supported library type. Supported library types are "Gene Expression" and "Chromatin Accessibility". The pipeline outputs key metrics in summary.csv; the definitions of the reported metrics are given below.
Metric | Description |
---|---|
Estimated number of cells | The number of barcodes associated with cell-containing partitions. |
Feature linkages detected | Total number of gene-to-peak and peak-to-peak linkages detected. |
Linked genes | Total number of genes that are linked to peaks. |
Linked peaks | Total number of peaks that are linked to genes or other peaks. |
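Because per-library metrics carry an ATAC or GEX prefix in the pipeline output, it can be convenient to split a summary row into joint, ATAC, and GEX groups. The following is a minimal sketch, not the pipeline's own code. It assumes that summary.csv holds one header row and one data row, and that the prefix is separated from the metric name by a single space; both are assumptions to check against a real file. The function names and the example values are hypothetical placeholders.

```python
import csv
import io

PREFIXES = ("ATAC", "GEX")  # library-type prefixes named in this document


def split_metrics_by_prefix(row):
    """Group one summary row into joint, ATAC, and GEX metrics."""
    groups = {"joint": {}, "ATAC": {}, "GEX": {}}
    for name, value in row.items():
        for prefix in PREFIXES:
            # Assumption: the prefix is followed by one space, then the metric name.
            if name.startswith(prefix + " "):
                groups[prefix][name[len(prefix) + 1:]] = value
                break
        else:
            # No library-type prefix: treat as a joint (sample-level) metric.
            groups["joint"][name] = value
    return groups


def load_summary(handle):
    """Read the first data row of a summary.csv-like file handle."""
    reader = csv.DictReader(handle)
    return split_metrics_by_prefix(next(reader))


# Placeholder values for illustration only; not real pipeline output.
example = io.StringIO(
    "Estimated number of cells,ATAC Number of peaks,GEX Median genes per cell\n"
    "1000,50000,1500\n"
)
metrics = load_summary(example)
```

Values are kept as strings here; converting them to numbers is left to the caller, since the formatting of each column in a real file is not specified in this document.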
Metrics in this table have the ATAC prefix in the pipeline output.
Metric | Description |
---|---|
Confidently mapped read pairs | Fraction of sequenced read pairs with mapping quality > 30. |
Fraction of genome in peaks | Fraction of bases in primary contigs (contigs containing genes) that are defined as peaks. |
Fraction of high-quality fragments in cells | Fraction of high-quality fragments with a valid barcode that are associated with cell-containing partitions. High-quality fragments are defined as read pairs with a valid barcode that map to the nuclear genome with mapping quality > 30, are not chimeric, and are not duplicates. |
Fraction of high-quality fragments overlapping peaks | Fraction of high-quality fragments in cell barcodes that overlap called peaks. |
Fraction of high-quality fragments overlapping TSS | Fraction of high-quality fragments in cell barcodes that overlap transcription start sites (TSS). |
Fraction of transposition events in peaks in cells | Fraction of transposition events that are associated with cell-containing partitions and fall within peaks. Transposition events are located at both ends of all high-quality fragments. This metric measures the percentage of such events that overlap with peaks. |
Mean raw read pairs per cell | Total number of read pairs divided by the number of cell barcodes. |
Median high-quality fragments per cell | The median number of high-quality fragments per cell barcode. |
Non-nuclear read pairs | Fraction of sequenced read pairs that have a valid barcode and map to non-nuclear genome contigs, including mitochondria, with mapping quality > 30. |
Number of peaks | Total number of peaks on primary contigs either detected by the pipeline or input by the user. |
Percent duplicates | Fraction of high-quality read pairs that are deemed to be PCR duplicates. This metric is a measure of sequencing saturation and is a function of library complexity and sequencing depth. More specifically, this is the fraction of high-quality fragments with a valid barcode that align to the same genomic position as another read pair in the library. |
Q30 bases in barcode | Fraction of barcode read (i2) bases with Q-score ≥ 30. |
Q30 bases in read 1 | Fraction of read 1 bases with Q-score ≥ 30. |
Q30 bases in read 2 | Fraction of read 2 bases with Q-score ≥ 30. |
Q30 bases in sample index i1 | Fraction of sample index read (i1) bases with Q-score ≥ 30. |
Sequenced read pairs | Total number of sequenced read pairs assigned to the sample. |
TSS enrichment score | Maximum value of the transcription-start-site (TSS) profile. The TSS profile is the summed accessibility signal (defined as number of cut sites per base) in a window of 2,000 bases around all the annotated TSSs, normalized by the minimum signal in the window. |
Unmapped read pairs | Fraction of sequenced read pairs that have a valid barcode but could not be mapped to the genome. |
Valid barcodes | Fraction of read pairs with barcodes that match the whitelist after error correction. |
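The TSS enrichment score defined above reduces to a simple ratio once the summed profile is available. The sketch below illustrates only that final step, assuming the profile has already been aggregated into one cut-site count per base position across the window. The function name and the toy profile are hypothetical, and any smoothing or zero-signal handling the pipeline may apply is not described here and is not reproduced.

```python
def tss_enrichment_score(profile):
    """Return the maximum of the profile after normalizing by its minimum.

    `profile` is a sequence of summed cut-site counts, one per base position
    in the window around annotated TSSs.
    """
    if not profile:
        raise ValueError("profile is empty")
    baseline = min(profile)
    if baseline <= 0:
        # Assumption: a non-positive minimum cannot be used for normalization.
        raise ValueError("minimum signal must be positive to normalize")
    return max(profile) / baseline


# Toy profile for illustration only (a real window spans 2,000 bases).
toy_profile = [2.0, 2.5, 4.0, 10.0, 4.0, 2.5, 2.0]
score = tss_enrichment_score(toy_profile)
```

A flat profile yields a score of 1, while a sharp peak at the TSS relative to the window's background yields a larger value.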
Metrics in this table have the GEX prefix in the pipeline output.
Metric | Description |
---|---|
Fraction of transcriptomic reads in cells | Fraction of transcriptomic reads with a valid barcode that are associated with cell-containing partitions. Transcriptomic reads are defined as reads with mapping quality = 255 that map to a unique gene, including intronic alignments (default mode). When excluding introns the transcriptome is restricted to alignments that are consistent with annotated splice junctions. Note that transcriptomic reads include UMI reads, duplicate reads, and reads marked as low-support molecules. |
Mean raw reads per cell | Total number of reads divided by the number of cell barcodes. |
Median genes per cell | The median number of genes detected per cell barcode. Detection is defined as the presence of at least one UMI count. |
Median UMI counts per cell | The median number of UMI counts per cell barcode. |
Percent duplicates | The fraction of reads originating from an already-observed UMI. This is a function of library complexity and sequencing depth. More specifically, this is the fraction of confidently mapped, valid barcode, valid UMI reads that have a non-unique (barcode, UMI, gene). |
Q30 bases in barcode | Fraction of barcode bases with Q-score ≥ 30, excluding very low quality/no-call (Q ≤ 2) bases from the denominator. |
Q30 bases in read 2 | Fraction of RNA read bases with Q-score ≥ 30, excluding very low quality/no-call (Q ≤ 2) bases from the denominator. |
Q30 bases in sample index i1 | Fraction of sample index bases (i1) with Q-score ≥ 30, excluding very low quality/no-call (Q ≤ 2) bases from the denominator. |
Q30 bases in sample index i2 | Fraction of sample index bases (i2) with Q-score ≥ 30, excluding very low quality/no-call (Q ≤ 2) bases from the denominator. |
Q30 bases in UMI | Fraction of UMI bases with Q-score ≥ 30, excluding very low quality/no-call (Q ≤ 2) bases from the denominator. |
Reads mapped antisense to gene | Fraction of reads that map to the transcriptome with MAPQ 255, but on the opposite strand of one or more overlapping annotated genes. |
Reads mapped confidently to exonic regions | Fraction of sequenced reads that map uniquely to an exonic region of the genome. |
Reads mapped confidently to genome | Fraction of sequenced reads that map uniquely to the genome. If a read maps to exonic loci from a single gene and also to non-exonic loci, it is considered uniquely mapped to one of the exonic loci. |
Reads mapped confidently to intergenic regions | Fraction of sequenced reads that map uniquely to an intergenic region of the genome. |
Reads mapped confidently to intronic regions | Fraction of sequenced reads that map uniquely to an intronic region of the genome. |
Reads mapped confidently to transcriptome | Fraction of sequenced reads that map to a unique gene in the transcriptome with mapping quality = 255. In the default mode the transcriptome includes intronic alignments. When excluding introns the transcriptome is restricted to alignments that are consistent with annotated splice junctions. Note that transcriptomic reads include UMI reads, duplicate reads, and reads marked as low-support UMIs. |
Reads mapped to genome | Fraction of sequenced reads that map to the genome. |
Reads with TSO | Fraction of reads with an alignment score of ≥ 20 for the template switch oligo (TSO) sequence. |
Sequenced read pairs | Total number of sequenced read pairs assigned to the sample. |
Total genes detected | The number of genes with at least one UMI count in any cell barcode. |
Valid barcodes | Fraction of read pairs with barcodes that match the whitelist after error correction. |
Valid UMIs | Fraction of read pairs with valid UMIs, i.e. UMIs that contain no Ns and are not homopolymers. |
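Two of the gene expression definitions above, Valid UMIs and Percent duplicates, can be illustrated with a short sketch. This is not the pipeline's implementation: the function names are hypothetical, the input is assumed to be already filtered to confidently mapped reads with valid barcodes and valid UMIs, and the duplicate count follows the "already-observed UMI" reading, in which every read after the first for a given (barcode, UMI, gene) counts as a duplicate.

```python
from collections import Counter


def is_valid_umi(umi):
    """A UMI is valid if it has no N bases and is not a homopolymer."""
    umi = umi.upper()
    return bool(umi) and "N" not in umi and len(set(umi)) > 1


def fraction_duplicates(reads):
    """Fraction of reads whose (barcode, UMI, gene) was already observed.

    `reads` is an iterable of (barcode, umi, gene) tuples, assumed to be
    pre-filtered to confidently mapped reads with valid barcodes and UMIs.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every read beyond the first for a given key counts as a duplicate.
    return (total - len(counts)) / total


# Toy reads for illustration only; barcodes, UMIs, and gene IDs are made up.
toy_reads = [
    ("BC1", "AACG", "G1"),
    ("BC1", "AACG", "G1"),  # repeats the first read's (barcode, UMI, gene)
    ("BC1", "TTGC", "G1"),
    ("BC2", "AACG", "G1"),
]
dup_fraction = fraction_duplicates(toy_reads)
```

In the toy input, one of four reads repeats an earlier (barcode, UMI, gene) combination, so the duplicate fraction is 0.25.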