Cell Ranger ATAC 2.1, printed on 12/03/2024
Entries are ordered alphabetically.
Adapters: customized strands of base pairs created to bind with specific sequences of DNA.
ATAC: Assay for Transposase-Accessible Chromatin.
Barcode: each GEM (Gelbead-in-Emulsion) contains a Gel Bead which carries many DNA oligos with the same barcode. Different GEMs have different barcodes. See also GEM and GEM Well.
Cell Barcode: any barcode that has been determined by the 'cell-calling' step of the pipeline to be associated with a cell.
Chromatin: a macromolecular complex formed by DNA, nucleosomes and other proteins that bind DNA (for example transcription factors).
Cut-site: a genome location where transposase cuts the DNA and inserts adapters.
Duplicates: two read pairs that originate from the same template molecule are called duplicates. Duplicates arise during the library preparation workflow when template molecules are amplified via PCR or linear amplification. Duplicate reads can also arise during the sequencing process; these are generally referred to as optical duplicates. Duplicate reads provide redundant information, so they are identified computationally and collapsed into a single fragment record for downstream analysis.
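The collapsing step described in the Duplicates entry can be illustrated with a minimal sketch. It assumes, purely for illustration, that two read pairs are duplicates when they share the same barcode, chromosome, start and end; the function name `collapse_duplicates` and the record layout are hypothetical and are not taken from the pipeline, whose actual duplicate-marking rules are described in Algorithms.

```python
from collections import Counter

# Hypothetical record layout: (barcode, chrom, start, end).
# Assumption: read pairs with an identical tuple are duplicates of one template.

def collapse_duplicates(read_pairs):
    """Collapse duplicate read pairs into one fragment record each.

    Returns a sorted list of (barcode, chrom, start, end, n_read_pairs),
    where n_read_pairs counts how many read pairs supported the fragment.
    """
    counts = Counter(read_pairs)
    return [(*key, n) for key, n in sorted(counts.items())]


pairs = [
    ("AAAC", "chr1", 100, 300),
    ("AAAC", "chr1", 100, 300),  # same barcode and interval: a duplicate
    ("TTTG", "chr1", 100, 300),  # different barcode: a separate fragment
]
fragments = collapse_duplicates(pairs)
```

Keeping the supporting read-pair count alongside each collapsed record preserves the information needed to estimate duplication rates later.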
Enhancer: a short (50–1500 bp) region of DNA that can be bound by transcription factors to increase the likelihood that transcription of a particular gene will occur.
Fragment: a piece of genomic DNA, bounded by two adjacent cut sites, that has been converted into a sequencer-compatible molecule with an attached cell barcode. The alignment interval of the fragment is obtained by correcting the alignment interval of the sequenced fragment by +4 bp on the left end of the fragment and -5 bp on the right end (where left and right are relative to genomic coordinates). This accounts for the 9 bp of DNA that the transposase occupies when it cuts the DNA (accessibility is recorded around the center of this 9 bp stretch; see figure in Algorithms). Most fragment-based metrics computed by the pipeline are based on fragments that passed various quality filters.
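The coordinate correction in the Fragment entry amounts to simple arithmetic on the two ends of the aligned interval. The sketch below applies the stated +4 bp and -5 bp shifts; the `Interval` type and the function name `correct_fragment_interval` are hypothetical, and the sketch makes no claim about the coordinate convention (0-based or 1-based, open or closed) that the pipeline uses.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical container for an alignment interval."""
    chrom: str
    start: int  # left end, in genomic coordinates
    end: int    # right end, in genomic coordinates


LEFT_SHIFT = 4    # +4 bp applied to the left end
RIGHT_SHIFT = -5  # -5 bp applied to the right end


def correct_fragment_interval(aligned: Interval) -> Interval:
    """Shift both ends of the sequenced interval toward the cut-site centers."""
    return Interval(
        aligned.chrom,
        aligned.start + LEFT_SHIFT,
        aligned.end + RIGHT_SHIFT,
    )


corrected = correct_fragment_interval(Interval("chr1", 100, 300))
# corrected is Interval(chrom="chr1", start=104, end=295)
```

Both shifts move the ends inward, so the corrected interval is 9 bp shorter than the aligned one.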
GEM: a Gelbead-in-Emulsion; a droplet containing some sample volume and barcoded Gel Bead, forming an isolated reaction volume. When referring to the subset of the sample contained in the droplet, the term 'partition' may also be used. See also GEM Well.
GEM Well (or GEM group): a set of partitioned cells (Gelbeads-in-Emulsion) from a single 10x Genomics Chromium chip channel. One or more sequencing libraries can be derived from a single GEM well. See also GEM.
Histone: a protein found in eukaryotic cell nuclei that forms nucleosomes.
Library (or Sequencing library): a 10x-barcoded sequencing library prepared from a single sample, corresponding to a single GEM well of a 10x Genomics Chromium run.
Nucleosome: a structural unit formed by histones that helps package eukaryotic DNA into well-organized chromosomes.
Peak: a compact region of the genome identified as having 'open chromatin' due to an enrichment of cut-sites inside the region.
Promoter: a region of DNA that initiates transcription of a particular gene. Promoters are located near the transcription start sites of genes, on the same strand and upstream on the DNA.
Read Data: raw genomic data from sequenced DNA.
Read-pair: read data sequenced from one molecule. This includes read1, read2, and the barcode sequence read.
Sample: a cell or nuclei suspension extracted from a single biological source (blood, tissue, etc.).
Sequencing Run: a flow cell containing data from one sequencing instrument run. The sequencing data can be further addressed by lane and by one or more sample indices.
Targeted region: any known, annotated, epigenetically relevant region in the genome, such as a transcription start site (TSS), enhancer, promoter or DNase hypersensitive site. The pipeline metrics often refer to these targeted regions.
Transcription Factor (TF): a protein that controls the rate of transcription of genetic information from DNA to messenger RNA by binding to specific DNA sequences (such as promoters or enhancers) that are commonly located in the vicinity of the gene they control.
Transposase: an enzyme that cuts open chromatin and ligates adapters to the 3' end of each strand.
Transposition: a reaction carried out by the transposase enzyme.
TSS: transcription start site; the position at the 5' end of a gene sequence where transcription begins.
Wavelet transform: a method to transform a one-dimensional signal (which can be thought of as a time series) into a sum of linearly independent basis functions (wavelets) that are localized in time. It can be thought of as a generalization of the familiar Fourier transform, which decomposes a time-varying signal into single-frequency sinusoidal waveforms.
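To make the idea of basis functions localized in time concrete, here is a minimal sketch of a single-level Haar wavelet transform. The Haar wavelet is chosen only because it is the simplest case; this glossary does not say which wavelet the pipeline uses, and the function names below are hypothetical.

```python
import math

_SCALE = 1.0 / math.sqrt(2.0)  # normalization that keeps the transform orthonormal


def haar_step(signal):
    """One level of the Haar transform on an even-length sequence.

    Returns (approx, detail): scaled pairwise sums and pairwise differences.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    evens, odds = signal[0::2], signal[1::2]
    approx = [(a + b) * _SCALE for a, b in zip(evens, odds)]
    detail = [(a - b) * _SCALE for a, b in zip(evens, odds)]
    return approx, detail


def inverse_haar_step(approx, detail):
    """Rebuild the original signal from one level of Haar coefficients."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * _SCALE)
        signal.append((a - d) * _SCALE)
    return signal


# A flat signal with one jump between positions 2 and 3.
signal = [4.0, 4.0, 4.0, 10.0, 2.0, 2.0, 2.0, 2.0]
approx, detail = haar_step(signal)
rebuilt = inverse_haar_step(approx, detail)
```

Only the detail coefficient covering the jump is non-zero, which is the sense in which a wavelet coefficient is localized: it reports a change at a particular position, whereas a Fourier coefficient describes the whole signal at once.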