Cell Ranger ATAC 2.1, printed on 11/17/2024
The cellranger-atac count pipeline outputs summary.csv, which contains a number of key metrics in comma-separated values (CSV) text format. The table below defines the reported metrics. In multi-species experiments, a species-specific metric is reported once for each species. When the pipeline cannot compute a metric, typically because of insufficient information or division by zero, that metric is omitted from summary.csv.
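
Because a metric may be absent from summary.csv, code that reads the file should not assume every column exists. The following is a minimal sketch, assuming the file holds one header row followed by one row of values and that the header uses the metric names listed in the table; the helper names `read_summary` and `get_metric` and the example values are hypothetical.

```python
import csv
import io


def read_summary(csv_text):
    """Parse the text of a summary.csv into a {metric name: value string} dict.

    Assumption: one header row of metric names, then exactly one row of values.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if len(rows) != 1:
        raise ValueError(f"expected exactly one data row, found {len(rows)}")
    return rows[0]


def get_metric(summary, name, cast=float):
    """Return a metric converted with `cast`, or None if it was not reported."""
    value = summary.get(name)
    if value is None or value == "":
        return None
    return cast(value)


# Made-up two-column example; a real file contains many more columns.
example = "Sample ID,Number of peaks\nsample123,81234\n"
summary = read_summary(example)

print(get_metric(summary, "Number of peaks", int))      # 81234
print(get_metric(summary, "Observed multiplet rate"))   # None (not reported)
```

Returning `None` for an absent column keeps "not reported" distinct from a genuine value of zero.
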
Metric | Description | Is Species Specific |
---|---|---|
Sample ID | Value of --id passed to the pipeline. | FALSE |
Genome | Reference genome used for the analysis. | FALSE |
Estimated number of cells | The total number of barcodes identified as cells. | TRUE |
Confidently mapped read pairs | Fraction of sequenced read pairs with mapping quality > 30 | FALSE |
Estimated bulk library complexity | Estimated complexity of the library given the observed unique read pairs when sequenced to current depth. | FALSE |
Fraction of genome in peaks | Fraction of bases in primary contigs that are defined as peaks. | FALSE |
Fraction of high-quality fragments in cells | Fraction of high-quality fragments with a valid barcode that are associated with cell-containing partitions. High-quality fragments are defined as read pairs with a valid barcode that map to the nuclear genome with mapping quality > 30, are not chimeric and not duplicate. | FALSE |
Fraction of high-quality fragments overlapping TSS | Fraction of high-quality fragments in cell barcodes that overlap transcription start sites (TSS). | TRUE |
Fraction of high-quality fragments overlapping peaks | Fraction of high-quality fragments in cell barcodes that overlap called peaks. | TRUE |
Fraction of transposition events in peaks in cells | Fraction of transposition events that are associated with cell-containing partitions and fall within peaks. Transposition events are located at both ends of all high-quality fragments. This metric measures the percentage of such events that overlap with peaks. | FALSE |
Fragments flanking a single nucleosome | Fraction of high-quality fragments between 147 and 294 base pairs. | FALSE |
Fragments in nucleosome-free regions | Fraction of high-quality fragments smaller than 147 base pairs. | FALSE |
Inferred multiplet rate | The estimated fraction of cell barcodes containing more than one cell. | FALSE |
Mean raw read pairs per cell | Total number of read pairs divided by the number of cell barcodes | TRUE |
Median barcode purity | The median, across all cell barcodes, of the fraction of fragments in the barcode that align uniquely to the species assigned to the barcode. | TRUE |
Median high-quality fragments per cell | The median number of high-quality fragments per cell barcode | TRUE |
Non-nuclear read pairs | Fraction of sequenced read pairs that have a valid barcode and map to non-nuclear genome contigs, including mitochondria, with mapping quality > 30. | FALSE |
Number of peaks | Total number of peaks on primary contigs either detected by the pipeline or input by the user. | FALSE |
Observed multiplet rate | The observed fraction of cell barcodes that appear to have cells from both species present. | FALSE |
Percent duplicates | Fraction of high-quality read pairs that are deemed to be PCR duplicates. A high-quality read pair is one with mapping quality > 30, that is not chimeric and maps to nuclear contigs. This metric is a measure of sequencing saturation and is a function of library complexity and sequencing depth. More specifically, this is the fraction of high-quality fragments with a valid barcode that align to the same genomic position as another read pair in the library. | FALSE |
Post-Normalization median unique fragments per cell in library | The median unique fragments per cell barcode in the library after normalization | TRUE |
Post-Normalization total mapped read pairs | The total fragments (mapped read pairs) after normalization | FALSE |
Post-Normalization total mapped read pairs in library | The total fragments (mapped read pairs) in the library after normalization | TRUE |
Post-Normalization unique fragments in library | The unique fragments (mapped read pairs) in the library after normalization | TRUE |
Pre-Normalization median unique fragments per cell in library | The median unique fragments per cell barcode in the library before normalization | TRUE |
Pre-Normalization total mapped read pairs | The total fragments (mapped read pairs) before normalization | FALSE |
Pre-Normalization total mapped read pairs in library | The total fragments (mapped read pairs) in the library before normalization | TRUE |
Pre-Normalization unique fragments in library | The unique fragments (mapped read pairs) in the library before normalization | TRUE |
Q30 bases in barcode | Fraction of barcode read (i5 index read) bases with Q-score >= 30. | FALSE |
Q30 bases in read 1 | Fraction of read 1 bases with Q-score >= 30. | FALSE |
Q30 bases in read 2 | Fraction of read 2 bases with Q-score >= 30. | FALSE |
Q30 bases in sample index i1 | Fraction of sample index read (i7 index read) bases with Q-score >= 30. | FALSE |
Sequenced read pairs | Total number of sequenced read pairs assigned to the sample. | FALSE |
Sequencing saturation | Estimated sequencing saturation of high-quality fragment pool. Computed as the ratio of observed unique read pairs to estimated library complexity. | FALSE |
TSS enrichment score | Maximum value of the transcription-start-site (TSS) profile. The TSS profile is the summed accessibility signal (defined as number of cut sites per base) in a window of 2,000 bases around all the annotated TSSs, normalized by the minimum signal in the window. | FALSE |
Unmapped read pairs | Fraction of sequenced read pairs that have a valid barcode but could not be mapped to the genome | FALSE |
Valid barcodes | Fraction of read pairs with barcodes that match the whitelist after error correction. | FALSE |
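
Several of the definitions above are simple ratios, and a small worked example may make them concrete. The sketch below is illustrative only and is not the pipeline's implementation. It assumes that "between 147 and 294" is inclusive at both ends, that the TSS profile is supplied as a plain list of summed cut-site counts per base, and that a zero denominator yields `None` in the same way the pipeline omits metrics it cannot compute. All function names and input values are hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def fragment_size_fractions(lengths):
    """Return (nucleosome-free fraction, single-nucleosome fraction).

    Nucleosome-free: fragments shorter than 147 bp.
    Single nucleosome: fragments of 147 to 294 bp (inclusive bounds assumed).
    """
    nucleosome_free = sum(1 for n in lengths if n < 147)
    single_nucleosome = sum(1 for n in lengths if 147 <= n <= 294)
    total = len(lengths)
    return safe_ratio(nucleosome_free, total), safe_ratio(single_nucleosome, total)


def tss_enrichment_score(signal):
    """Return the maximum of the TSS profile after dividing by its minimum.

    `signal` is the summed cut-site count per base across the window
    around all annotated TSSs.
    """
    if not signal:
        return None
    return safe_ratio(max(signal), min(signal))


# Made-up inputs for illustration.
lengths = [50, 100, 146, 147, 200, 294, 295, 400]
print(fragment_size_fractions(lengths))                   # (0.375, 0.375)
print(tss_enrichment_score([2.0, 4.0, 10.0, 4.0, 2.0]))   # 5.0
print(safe_ratio(0, 0))                                   # None
```
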
The cellranger-atac aggr pipeline outputs summary.json, which contains metrics relating to the aggregated datasets. Note: curly brackets denote a variable that depends on the pipeline input, e.g. post_norm_median_frags_per_cell_Library_{library} means that if your aggregation contains two libraries with IDs sample123 and sample456, there will be two output metrics: post_norm_median_frags_per_cell_Library_sample123 and post_norm_median_frags_per_cell_Library_sample456.
Metric | Description | Is Species Specific |
---|---|---|
annotated_cells | Estimated number of cells. | True |
cellranger-atac_version | Software version used to run the pipeline. | False |
frac_cut_fragments_in_peaks | Fraction of transposition events in peaks. | False |
frac_fragments_overlapping_peaks | Fraction of fragments overlapping called peaks. | True |
frac_fragments_overlapping_targets | Fraction of fragments overlapping any targeted region. | True |
median_fragments_per_cell | Median fragments per cell barcode. | True |
post_norm_median_frags_per_cell_Library_{} | Post-Normalization median unique fragments per cell in library | False |
pre_norm_median_frags_per_cell_Library_{} | Pre-Normalization median unique fragments per cell in library | False |
total_post_normalization_Library_{} | Post-Normalization total mapped read pairs in library | False |
total_pre_normalization_Library_{} | Pre-Normalization total mapped read pairs in library | False |
unique_post_normalization_Library_{} | Post-Normalization unique fragments in library | False |
unique_pre_normalization_Library_{} | Pre-Normalization unique fragments in library | False |
total_post_normalization | Post-Normalization total mapped read pairs | False |
total_pre_normalization | Pre-Normalization total mapped read pairs | False |
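
The per-library metric names in the table can be expanded programmatically once the library IDs are known. The sketch below assumes summary.json is a flat JSON object keyed by metric name, which this page does not state explicitly. The helper `per_library_metrics` and the example values are hypothetical.

```python
import json

# Templates taken from the table; "{}" is replaced by a library ID.
LIBRARY_METRIC_TEMPLATES = [
    "post_norm_median_frags_per_cell_Library_{}",
    "pre_norm_median_frags_per_cell_Library_{}",
    "total_post_normalization_Library_{}",
    "total_pre_normalization_Library_{}",
    "unique_post_normalization_Library_{}",
    "unique_pre_normalization_Library_{}",
]


def per_library_metrics(summary, library_ids):
    """Return {library ID: {expanded metric name: value or None}}."""
    result = {}
    for library_id in library_ids:
        expanded = {}
        for template in LIBRARY_METRIC_TEMPLATES:
            key = template.format(library_id)
            # A key missing from the JSON is recorded as None.
            expanded[key] = summary.get(key)
        result[library_id] = expanded
    return result


# Made-up JSON fragment for two libraries.
summary = json.loads(
    '{"total_pre_normalization_Library_sample123": 1000,'
    ' "total_pre_normalization_Library_sample456": 2000}'
)
metrics = per_library_metrics(summary, ["sample123", "sample456"])
print(metrics["sample123"]["total_pre_normalization_Library_sample123"])  # 1000
```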