Antigen Algorithm Overview

This page describes the unique aspects of the Cell Ranger multi algorithm designed to process Antigen Capture libraries.

Overview

The cellranger multi algorithm for processing Chromium Single Cell 5’ Barcode Enabled Antigen Mapping (BEAM)/Antigen Capture, Gene Expression, V(D)J, and Antibody Capture libraries involves the following steps, some of which occur independently and iteratively:

Cell calling takes place for both Gene Expression and V(D)J barcodes. However, after V(D)J cell calling is complete, any barcodes that were not also assigned as "cells" in the corresponding 5' Gene Expression dataset are discarded. This step mitigates overcalling that may arise in V(D)J data and makes the V(D)J cells a subset of the Gene Expression cells. You can learn more in the Why use multi? section of the cellranger multi pipeline page.

Antigen and antibody aggregate detection happens prior to cell calling. Barcodes identified as aggregates are filtered out. The remaining barcodes are assigned as "cells" and then the feature-barcode matrix is computed. Antigen specificity scores, .cloupe, and .vloupe files are also generated from filtered data.
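
The subset rule for V(D)J barcodes described above can be pictured with a minimal Python sketch. The function name and the example barcodes are hypothetical and are not part of Cell Ranger. The sketch shows only the set logic, not how cell calling itself works.

```python
def keep_vdj_cells_in_gex(vdj_cells, gex_cells):
    """Return the V(D)J cell barcodes that were also called as Gene Expression cells.

    Both arguments are collections of barcode strings (hypothetical inputs).
    """
    # Any V(D)J barcode missing from the Gene Expression cell set is discarded,
    # so the result is always a subset of gex_cells.
    return set(vdj_cells) & set(gex_cells)


# Illustrative barcodes only.
vdj = {"AAAC-1", "AAAG-1", "AACT-1"}
gex = {"AAAG-1", "AACT-1", "AAGG-1"}
print(sorted(keep_vdj_cells_in_gex(vdj, gex)))
```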

Antigen aggregate detection and filtering

Antigen aggregates in BEAM experiments cause a few GEMs to have extremely high UMI counts. During aggregate detection, the algorithm selects the 100 cell barcodes with the highest antigen UMI counts, based on the pre-filtered antigen barcode rank plot. Outliers within this set of 100 barcodes are then identified and filtered out using a threshold based on the interquartile range (IQR):

$$Q3 + 3 \times (Q3 - Q1)$$

where Q1 and Q3 are the first and third quartiles, so that Q3 - Q1 is the IQR.
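
As a rough illustration of the thresholding step, the following Python sketch applies the formula above to a table of per-barcode antigen UMI counts. Several details are assumptions rather than documented Cell Ranger behavior: the function and parameter names are hypothetical, the quartiles are computed with the "inclusive" interpolation method of the standard library, and a barcode is treated as an outlier when its count is strictly greater than the threshold.

```python
import statistics


def find_aggregate_barcodes(antigen_umis, top_n=100, iqr_multiplier=3.0):
    """Flag candidate aggregate barcodes among the top_n by antigen UMI count.

    antigen_umis: dict mapping barcode -> antigen UMI count (hypothetical input).
    """
    # Keep the top_n barcodes with the highest antigen UMI counts.
    top = sorted(antigen_umis.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    counts = [count for _, count in top]
    if len(counts) < 2:
        return []  # quartiles are undefined for fewer than two values

    # Quartile method is an assumption; other methods shift the threshold slightly.
    q1, _, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    threshold = q3 + (q3 - q1) * iqr_multiplier

    # Assumption: outliers are counts strictly above the threshold.
    return [barcode for barcode, count in top if count > threshold]


# Illustrative data: 98 ordinary barcodes and two with extreme counts.
example = {f"BC{i:03d}": 100 + i for i in range(98)}
example.update({"AGG1": 50000, "AGG2": 80000})
print(find_aggregate_barcodes(example))
```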

Although both antigen and antibody aggregates are reported in the same CSV file (aggregate_barcodes.csv), the process of detecting and filtering out antigen aggregates is different from that used for antibody aggregates.

Antigen specificity score

Antigen specificity is a likelihood that indicates how strongly a cell barcode is associated with a target antigen compared to the negative control antigen. An antigen specificity score is calculated per barcode for each pairing of a target antigen with its negative control antigen, as specified in the Feature Reference CSV and multi config CSV. Antigen specificity scores are not calculated if a negative control antigen is not specified.

The antigen specificity score is computed as 100 times the probability that the “true value” of $$S / (S + N) \geq 0.925$$

where S is the UMI count for the target antigen, and N is the UMI count for the negative control antigen.

This number should be close to 100 for antigens that associate strongly and much lower for antigens with poor or non-specific associations.

The precise definition of the antigen specificity score is:

$$(1 - beta.cdf(0.925, S + SignalPRIOR, N + NoisePRIOR)) * 100$$

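where SignalPRIOR = 1 and NoisePRIOR = 3, and beta.cdf(x, a, b) is the cumulative distribution function of a Beta distribution with shape parameters a and b, evaluated at x.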
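
To make the definition concrete, here is a minimal Python sketch of the score using only the standard library. It is not the Cell Ranger implementation. The function names are hypothetical, and the Beta CDF is evaluated through the binomial tail identity, which is valid here because UMI counts and both priors are integers. If SciPy is available, `scipy.stats.beta.cdf(0.925, S + 1, N + 3)` should give the same CDF value.

```python
import math

SIGNAL_PRIOR = 1   # SignalPRIOR from the definition above
NOISE_PRIOR = 3    # NoisePRIOR from the definition above
THRESHOLD = 0.925


def beta_cdf_int(x, a, b):
    """Beta(a, b) CDF at x for positive integers a and b, with 0 < x < 1.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    log_x, log_1mx = math.log(x), math.log1p(-x)
    terms = []
    for j in range(a, n + 1):
        # Binomial coefficient in log space to avoid overflow for large counts.
        log_comb = math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
        terms.append(math.exp(log_comb + j * log_x + (n - j) * log_1mx))
    return min(1.0, math.fsum(terms))


def antigen_specificity_score(signal_umis, noise_umis):
    """Score in [0, 100] from target (S) and negative control (N) UMI counts."""
    cdf = beta_cdf_int(THRESHOLD, signal_umis + SIGNAL_PRIOR, noise_umis + NOISE_PRIOR)
    return (1.0 - cdf) * 100.0


# Illustrative counts: strong signal, mixed, and control-dominated barcodes.
for s, n in [(200, 2), (50, 10), (3, 40)]:
    print(s, n, round(antigen_specificity_score(s, n), 3))
```

With the priors above, a barcode with no UMIs for either antigen scores close to 0, because the prior alone puts very little probability on a ratio of 0.925 or higher.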

Antigen specificity scores are written to the /outs/per_sample_outs/sample_name/antigen_analysis/antigen_specificity_scores.csv file. Visit the Filtered Outputs page of the Understanding Outputs section to learn more about the contents of antigen_specificity_scores.csv.