10x Genomics
Chromium Single Cell ATAC

Cell Ranger ATAC 2.1, printed on 04/16/2025

Cell Ranger ATAC count: Per Barcode QC, ATAC Signal, and Cell Calling

Cell Ranger ATAC performs cell calling, in which it determines whether each barcode is a cell of any species included in the reference. Based on mapping information, Cell Ranger ATAC also provides per-barcode QC information about the fragments, including the ATAC signal per barcode as captured by targeting metrics such as the number of fragments overlapping transcription start sites (TSS) annotated in the reference package. All of this per-barcode information is collated into a single output table: singlecell.csv.

Structure

The structure and contents of singlecell.csv from a single species analysis are shown below:

$ cd /home/jdoe/runs/sample345/outs
$ head -5 singlecell.csv
barcode,total,duplicate,chimeric,unmapped,lowmapq,mitochondrial,nonprimary,passed_filters,is_cell_barcode,excluded_reason,TSS_fragments,DNase_sensitive_region_fragments,enhancer_region_fragments,promoter_region_fragments,on_target_fragments,blacklist_region_fragments,peak_region_fragments,peak_region_cutsites
NO_BARCODE,986507,102223,401,118334,63547,1882,324,699796,0,0,0,0,0,0,0,0,0,0
AAACGAAAGAAAGGGT-1,8,0,0,5,0,0,0,3,0,0,1,0,0,0,1,0,1,2
AAACGAAAGAAATACC-1,7,2,0,3,0,0,0,2,0,2,0,0,0,0,0,0,0,0
AAACGAAAGAAATGGG-1,10,4,0,1,1,0,0,4,0,0,0,0,0,0,0,0,1,2

The table contains many columns, including the primary barcode column. All the barcodes in the dataset are listed in this column. The NO_BARCODE row contains a summary of fragments that are not associated with any whitelisted barcodes. These fragments usually form a small fraction of all reads.
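
As a rough illustration of the NO_BARCODE row, the sketch below computes what share of all read-pairs it accounts for. The helper name no_barcode_fraction is hypothetical, and the demo uses only the four data rows shown above, so the printed value is not representative of a full file.

```python
import io
import pandas as pd


def no_barcode_fraction(scdf: pd.DataFrame) -> float:
    """Share of the `total` column that belongs to the NO_BARCODE row."""
    no_barcode_total = scdf.loc[scdf["barcode"] == "NO_BARCODE", "total"].sum()
    return no_barcode_total / scdf["total"].sum()


# Barcode and total columns copied from the head of singlecell.csv above.
# With only these rows the fraction is close to 1; run the helper on the
# full table to get a meaningful value.
sample_csv = io.StringIO(
    "barcode,total\n"
    "NO_BARCODE,986507\n"
    "AAACGAAAGAAAGGGT-1,8\n"
    "AAACGAAAGAAATACC-1,7\n"
    "AAACGAAAGAAATGGG-1,10\n"
)
demo = pd.read_csv(sample_csv)
fraction = no_barcode_fraction(demo)
print(f"NO_BARCODE share of read-pairs in these rows: {fraction:.6f}")
```

On a complete singlecell.csv, the same helper can be called on the dataframe loaded from the file.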

Column Definitions

Column	Type	Description	Pipeline specific changes	Reference specific changes
`barcode`	key	barcodes present in input data
`total`	sequencing	total read-pairs	absent in `aggr`, `reanalyze`
`duplicate`	mapping	number of duplicate read-pairs
`chimeric`	mapping	number of chimerically mapped read-pairs	absent in `aggr`, `reanalyze`
`unmapped`	mapping	number of read-pairs with at least one end not mapped	absent in `aggr`, `reanalyze`
`lowmapq`	mapping	number of read-pairs with <30 mapq on at least one end	absent in `aggr`, `reanalyze`
`mitochondrial`	mapping	number of read-pairs mapping to mitochondria and non-nuclear contigs	absent in `aggr`, `reanalyze`
`nonprimary`	mapping	number of read-pairs that map to non-primary contigs
`passed_filters`	mapping	number of non-duplicate, usable read-pairs, i.e. "fragments"	absent in `aggr`, `reanalyze`	for multi species, for example hg19 and mm10, expect additional columns: `passed_filters_hg19` and `passed_filters_mm10`
`is_cell_barcode`	cell calling	binary indicator of whether barcode is associated with a cell		for multi species, for example hg19 and mm10, expect columns `is_hg19_cell_barcode` and `is_mm10_cell_barcode` instead.
`excluded_reason`	cell calling	0: barcode was not excluded; 1: barcode was excluded because it is a gel bead doublet; 2: barcode was excluded because it is low-targeting; 3: barcode was excluded because it is a barcode multiplet
`TSS_fragments`	targeting	number of fragments overlapping with TSS regions
`DNase_sensitive_region_fragments`	targeting	number of fragments overlapping with DNase sensitive regions		For custom references or references missing the `dnase.bed` file, this count is 0
`enhancer_region_fragments`	targeting	number of fragments overlapping enhancer regions		For custom references or references missing the `enhancer.bed` file, this count is 0
`promoter_region_fragments`	targeting	number of fragments overlapping promoter regions		For custom references or references missing the `promoter.bed` file, this count is 0
`on_target_fragments`	targeting	number of fragments overlapping any of TSS, enhancer, promoter and DNase hypersensitivity sites (counted with multiplicity)		For custom references or references having only the `tss.bed` file, this count is simply equal to the TSS_fragments
`blacklist_region_fragments`	targeting	number of fragments overlapping blacklisted regions
`peak_region_fragments`	denovo targeting	number of fragments overlapping peaks		for multi species, for example hg19 and mm10, expect additional columns: `peak_region_fragments_hg19` and `peak_region_fragments_mm10`
`peak_region_cutsites`	denovo targeting	number of ends of fragments in peak regions
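
The numeric excluded_reason codes in the table can be turned into readable labels when summarizing a run. The following is a minimal sketch: the label strings and the name EXCLUDED_REASON_LABELS are illustrative choices, and the rows are the ones shown in the Structure section, reduced to the columns needed here.

```python
import io
import pandas as pd

# Rows copied from the head of singlecell.csv shown earlier, keeping only
# the columns used in this example.
sample_csv = io.StringIO(
    "barcode,is_cell_barcode,excluded_reason\n"
    "NO_BARCODE,0,0\n"
    "AAACGAAAGAAAGGGT-1,0,0\n"
    "AAACGAAAGAAATACC-1,0,2\n"
    "AAACGAAAGAAATGGG-1,0,0\n"
)
scdf = pd.read_csv(sample_csv)

# Code-to-label mapping taken from the excluded_reason row of the table.
EXCLUDED_REASON_LABELS = {
    0: "not excluded",
    1: "gel bead doublet",
    2: "low-targeting",
    3: "barcode multiplet",
}

# Drop the NO_BARCODE summary row so that only real barcodes are counted.
barcodes = scdf[scdf["barcode"] != "NO_BARCODE"]
reason_counts = (
    barcodes["excluded_reason"].map(EXCLUDED_REASON_LABELS).value_counts()
)
print(reason_counts)
```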

Note that the number of columns and the column names depend on which pipeline and which reference were used to generate the output file. Briefly, as described in the last two columns of the table:

The Cell Ranger ATAC aggr and reanalyze pipelines take only fragments as input, not FASTQs. Consequently, only the barcodes present in the input fragments file, i.e. barcodes with at least one fragment detected, are listed in this output file.
Because the Cell Ranger ATAC aggr and reanalyze pipelines do not require the BAM as input, the columns associated with mapping information are not produced in the output file. However, the duplicate and passed-filters information can be deduced from the fragments file.
When present, the sum of all the mapping type columns (whatever subset is present) is guaranteed to equal the total.
For custom references, if files such as enhancer.bed are missing, then the counts for the corresponding columns will be zero.
For barnyard references, there will be additional species-specific columns such as is_hg19_cell_barcode, passed_filters_hg19, and peak_region_fragments_hg19.
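
The relationship between the mapping columns and the total can be checked directly. The sketch below is one way to do so, assuming the "mapping" columns are exactly those typed as mapping in the table above; the name MAPPING_COLUMNS is illustrative. It uses the rows from the Structure section, restricted to the relevant columns.

```python
import io
import pandas as pd

# Rows copied from the head of singlecell.csv shown earlier, keeping only
# the barcode, total, and mapping-type columns.
sample_csv = io.StringIO(
    "barcode,total,duplicate,chimeric,unmapped,lowmapq,mitochondrial,"
    "nonprimary,passed_filters\n"
    "NO_BARCODE,986507,102223,401,118334,63547,1882,324,699796\n"
    "AAACGAAAGAAAGGGT-1,8,0,0,5,0,0,0,3\n"
    "AAACGAAAGAAATACC-1,7,2,0,3,0,0,0,2\n"
    "AAACGAAAGAAATGGG-1,10,4,0,1,1,0,0,4\n"
)
scdf = pd.read_csv(sample_csv)

# Columns typed as "mapping" in the column definitions table.
MAPPING_COLUMNS = [
    "duplicate", "chimeric", "unmapped", "lowmapq",
    "mitochondrial", "nonprimary", "passed_filters",
]

# Some pipelines omit columns, so use only those actually present.
present = [col for col in MAPPING_COLUMNS if col in scdf.columns]

# The check only applies when the total column itself exists.
if "total" in scdf.columns:
    mapping_sum = scdf[present].sum(axis=1)
    mismatched = scdf[mapping_sum != scdf["total"]]
    print(f"{len(mismatched)} of {len(scdf)} rows fail the sum check")
```

In files where the total column is absent, the check is skipped.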

Loading and using singlecell.csv in Python

singlecell.csv can be loaded easily in Python as a pandas dataframe:

import pandas as pd

singlecell_file = "/home/jdoe/runs/sample345/outs/singlecell.csv"
# load without index
scdf = pd.read_csv(singlecell_file, sep=",")

# load with barcode as index
scdf2 = pd.read_csv(singlecell_file, sep=",", index_col="barcode")

You can use this file in many ways. Below are some examples:

Regenerate the targeting plot in web summary

Assume you are analyzing data from a single species library, such as human (hg19). To reproduce the targeting plot on the right side of the Targeting section of the web summary, you can do the following:

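import matplotlib.pyplot as plt

# scdf is the dataframe loaded without an index, so 'barcode' is a column
cell_mask = (scdf['is_cell_barcode'] == 1)
# non-cell barcodes, excluding the NO_BARCODE summary row
noncell_mask = (scdf['is_cell_barcode'] != 1) & (scdf['barcode'] != 'NO_BARCODE')
# x: fragments per barcode, y: fraction of fragments overlapping peaks
# drawn as unconnected points, one per barcode (cells blue, non-cells red)
plt.scatter(scdf[cell_mask]['passed_filters'],
     scdf[cell_mask]['peak_region_fragments'] / scdf[cell_mask]['passed_filters'],
     c='b')
plt.scatter(scdf[noncell_mask]['passed_filters'],
     scdf[noncell_mask]['peak_region_fragments'] / scdf[noncell_mask]['passed_filters'],
     c='r')
plt.show()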

Edit cell calling for use in aggr and reanalyze

The singlecell.csv file captures the cell calling information in the is_{species}_cell_barcode field. The aggr pipeline requires you to specify the singlecell.csv as part of the aggr_CSV argument.
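
One possible way to edit the cell calls with pandas is sketched below for a single species file, where the field is named is_cell_barcode. The threshold MIN_PASSED_FILTERS and the selection rule are hypothetical and chosen only so that the four sample rows from the Structure section give a visible result. Any further constraints the aggr and reanalyze pipelines place on an edited file are not covered here.

```python
import io
import pandas as pd

# Header and rows copied from the head of singlecell.csv shown earlier.
SAMPLE = """barcode,total,duplicate,chimeric,unmapped,lowmapq,mitochondrial,nonprimary,passed_filters,is_cell_barcode,excluded_reason,TSS_fragments,DNase_sensitive_region_fragments,enhancer_region_fragments,promoter_region_fragments,on_target_fragments,blacklist_region_fragments,peak_region_fragments,peak_region_cutsites
NO_BARCODE,986507,102223,401,118334,63547,1882,324,699796,0,0,0,0,0,0,0,0,0,0
AAACGAAAGAAAGGGT-1,8,0,0,5,0,0,0,3,0,0,1,0,0,0,1,0,1,2
AAACGAAAGAAATACC-1,7,2,0,3,0,0,0,2,0,2,0,0,0,0,0,0,0,0
AAACGAAAGAAATGGG-1,10,4,0,1,1,0,0,4,0,0,0,0,0,0,0,0,1,2
"""
scdf = pd.read_csv(io.StringIO(SAMPLE))

# Hypothetical rule for illustration only: call a barcode a cell when it has
# at least this many fragments. A realistic cutoff would be far higher.
MIN_PASSED_FILTERS = 3

edited = scdf.copy()
is_real_barcode = edited["barcode"] != "NO_BARCODE"
enough_fragments = edited["passed_filters"] >= MIN_PASSED_FILTERS
# Overwrite the cell calls; the NO_BARCODE row is never marked as a cell.
edited["is_cell_barcode"] = (is_real_barcode & enough_fragments).astype(int)

# Write the edited table with the original column order and no index column.
# A file path can be passed to to_csv in place of the in-memory buffer.
out = io.StringIO()
edited.to_csv(out, index=False)
print(out.getvalue())
```

Working on a copy keeps the original calls available for comparison before the edited file is passed on.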
