
10x Genomics
Chromium Single Cell Immune Profiling

Annotation Algorithm

Each assembled contig in each cell is aligned against all of the germline segment reference sequences via Smith-Waterman.
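As a rough illustration of this alignment step, the Python sketch below scores a contig against a single reference sequence with the basic Smith-Waterman recurrence. The function name `smith_waterman` and the scoring values (match 2, mismatch -1, linear gap -2) are assumptions chosen for the example; the scoring parameters actually used by the pipeline are not stated here.

```python
def smith_waterman(query, ref, match=2, mismatch=-1, gap=-2):
    """Return (best_score, query_start, query_end) for the best local alignment.

    query_start and query_end are 0-based, end-exclusive offsets into `query`.
    Scoring values are illustrative assumptions, not the pipeline's settings.
    """
    rows, cols = len(query) + 1, len(ref) + 1
    score = [[0] * cols for _ in range(rows)]
    # start[i][j] is the query offset where the alignment ending at (i, j) began.
    # A zero-score cell in row i can only seed an alignment that starts at offset i.
    start = [[i] * cols for i in range(rows)]
    best = (0, 0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            step = match if query[i - 1] == ref[j - 1] else mismatch
            diag = score[i - 1][j - 1] + step
            up = score[i - 1][j] + gap      # gap in the reference
            left = score[i][j - 1] + gap    # gap in the query
            s = max(0, diag, up, left)      # local alignment: never below zero
            score[i][j] = s
            if s == 0:
                start[i][j] = i
            elif s == diag:
                start[i][j] = start[i - 1][j - 1]
            elif s == up:
                start[i][j] = start[i - 1][j]
            else:
                start[i][j] = start[i][j - 1]
            if s > best[0]:
                best = (s, start[i][j], i)
    return best
```

With these assumed scores, `smith_waterman("TTTACGTACGTTT", "ACGTACG")` returns `(14, 3, 10)`: seven matching bases covering contig offsets 3 through 9.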

First, the contig is aligned to all V reference sequences; the best match is selected and its matching bases are masked from the contig. The same procedure is then repeated, one segment type at a time, for the D, J, C, and 5′ UTR reference sequences.
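The align-then-mask loop described above might be sketched as follows. This is only an illustration under stated assumptions: the aligner is passed in as a function returning `(score, start, end)`, the demo aligner `exact_match_aligner` is a toy exact-substring matcher standing in for Smith-Waterman, masked bases are replaced with `N`, and the reference names in the usage note are hypothetical.

```python
def exact_match_aligner(contig, ref):
    """Toy stand-in for a local aligner (assumption, not Smith-Waterman).

    Scores len(ref) if `ref` occurs verbatim in `contig`, otherwise 0.
    """
    pos = contig.find(ref)
    if pos < 0:
        return (0, 0, 0)
    return (len(ref), pos, pos + len(ref))


def annotate(contig, references, aligner=exact_match_aligner,
             order=("V", "D", "J", "C", "5'UTR"), mask_char="N"):
    """Annotate one contig, masking each best hit before the next segment type.

    references: dict mapping segment type -> {reference name: sequence}.
    Returns (annotations, masked_contig), where annotations maps
    segment type -> (reference name, start, end) on the contig.
    """
    working = contig
    annotations = {}
    for segment in order:
        best_name, best_hit = None, (0, 0, 0)
        for name, ref_seq in references.get(segment, {}).items():
            hit = aligner(working, ref_seq)
            if hit[0] > best_hit[0]:
                best_name, best_hit = name, hit
        if best_name is None:
            continue  # no reference of this type matched
        _, start, end = best_hit
        annotations[segment] = (best_name, start, end)
        # Mask the matched bases so later segment types cannot reuse them.
        working = working[:start] + mask_char * (end - start) + working[end:]
    return annotations, working
```

For instance, calling `annotate` on a contig built by concatenating a 5′ UTR, V, D, J, and C sequence, with a `references` dictionary holding those same sequences under hypothetical names, yields one annotation per segment type and a fully masked contig.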

Next, the CDR3 region is located in one of two ways. If the contig fully spans the L+V region, which contains the start codon, the CDR3 motif (Cys-FGXG/WGXG) is searched for in the reading frame of that region. Otherwise, the CDR3 motif is searched for in all frames. A contig is labelled productive if it
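The two search modes for the CDR3 motif can be illustrated with a short Python sketch. Several details here are assumptions made for the example: the function name `find_cdr3`, the regular expression used for the Cys-FGXG/WGXG motif, the choice to report the span from the Cys through the F/W residue, and taking the leftmost cysteine that can be extended to the motif. The caller is assumed to pass `known_frame` only when the contig spans the L+V region.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Cys, then any residues without a stop codon, then F/W followed by G-X-G.
# Group 1 (Cys through F/W) is reported as the CDR3; this boundary is an assumption.
CDR3_MOTIF = re.compile(r"(C[^*]*?[FW])G[^*]G")


def translate(seq, frame):
    """Translate `seq` starting at offset `frame`; unknown codons become 'X'."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def find_cdr3(contig, known_frame=None):
    """Search for the CDR3 motif in one known frame, or in all three frames.

    Returns a dict with the frame, amino-acid CDR3, and nucleotide span,
    or None if no frame contains the motif.
    """
    frames = [known_frame] if known_frame is not None else [0, 1, 2]
    for frame in frames:
        hit = CDR3_MOTIF.search(translate(contig, frame))
        if hit:
            aa_start, aa_end = hit.span(1)
            return {
                "frame": frame,
                "cdr3_aa": hit.group(1),
                "nt_start": frame + 3 * aa_start,
                "nt_end": frame + 3 * aa_end,
            }
    return None
```

Calling `find_cdr3(contig, known_frame=0)` restricts the search to the frame set by the start codon, while `find_cdr3(contig)` tries each of the three forward frames in turn and returns the first hit.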

Contig Filtering

Each cell-barcode is typically expected to contain one productive TRA contig and one productive TRB contig. Extra productive contigs produced by the assembler are less likely to be legitimate. For each chain, an extra productive contig with a distinct CDR3 must have at least 2 UMIs to be considered confident; extra productive contigs with only 1 UMI are considered low-confidence.

Additionally, extra productive contigs with the same CDR3 as an existing contig for that chain are considered low-confidence; these are likely induced by assembly artifacts.
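The filtering rules above could be expressed roughly as in the following Python sketch for the productive contigs of a single cell-barcode. Two points are assumptions rather than statements from this page: the contig with the most UMIs for a chain is treated as the primary (non-extra) contig, and that primary contig is always marked confident. The record fields (`id`, `chain`, `cdr3`, `umis`) and the function name `flag_confidence` are hypothetical.

```python
from collections import defaultdict


def flag_confidence(contigs, min_extra_umis=2):
    """Mark each productive contig of one cell-barcode as confident or not.

    contigs: list of dicts with keys 'id', 'chain', 'cdr3', 'umis'.
    Returns a dict mapping contig id -> True (confident) / False (low-confidence).
    """
    by_chain = defaultdict(list)
    for contig in contigs:
        by_chain[contig["chain"]].append(contig)

    confident = {}
    for group in by_chain.values():
        # Assumption: the best-supported contig per chain is the primary one.
        ranked = sorted(group, key=lambda c: c["umis"], reverse=True)
        seen_cdr3s = set()
        for rank, contig in enumerate(ranked):
            if rank == 0:
                ok = True  # primary contig (assumed confident)
            elif contig["cdr3"] in seen_cdr3s:
                ok = False  # same CDR3 as an existing contig: likely an artifact
            else:
                ok = contig["umis"] >= min_extra_umis  # distinct CDR3 needs 2+ UMIs
            confident[contig["id"]] = ok
            seen_cdr3s.add(contig["cdr3"])
    return confident
```

Under these assumptions, a barcode with four productive TRA contigs (5 UMIs, then a duplicate CDR3 with 3 UMIs, a distinct CDR3 with 2 UMIs, and a distinct CDR3 with 1 UMI) would keep the first and third as confident and flag the other two as low-confidence.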

Clonotype Grouping and Consensus Building

Cell-barcodes are grouped together into clonotypes if they share a set of productive CDR3 nucleotide sequences by exact match.
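A minimal Python sketch of this grouping step is shown below. It interprets "share a set" as having identical sets of productive CDR3 nucleotide sequences, skips barcodes with no productive CDR3, and numbers clonotypes by decreasing size; all three choices, along with the function name `group_clonotypes` and the `clonotypeN` labels, are assumptions made for illustration.

```python
from collections import defaultdict


def group_clonotypes(cells):
    """Group cell-barcodes whose productive CDR3 nucleotide sets match exactly.

    cells: dict mapping barcode -> iterable of productive CDR3 nucleotide sequences.
    Returns a dict mapping a clonotype label -> {'cdr3s': [...], 'barcodes': [...]}.
    """
    groups = defaultdict(list)
    for barcode, cdr3s in cells.items():
        key = frozenset(cdr3s)  # order-independent, exact-match key
        if key:                 # assumption: barcodes with no productive CDR3 are skipped
            groups[key].append(barcode)

    # Largest clonotypes first; ties broken by CDR3 content for determinism.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), sorted(kv[0])))
    return {
        f"clonotype{i}": {"cdr3s": sorted(key), "barcodes": sorted(barcodes)}
        for i, (key, barcodes) in enumerate(ordered, start=1)
    }
```

Because the key is an exact set of nucleotide sequences, two barcodes that differ by a single CDR3 base, or where one barcode carries an additional productive chain, fall into different clonotypes in this sketch.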

For each clonotype and each CDR3, the contigs in all cells are assembled together to produce a clonotype consensus sequence.

Because this sequence is constructed using multiple cells, its accuracy is expected to be even higher than sequences constructed from a single cell.
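To illustrate why pooling cells can improve accuracy, the Python sketch below builds a consensus by per-position majority vote. This is a deliberately simplified stand-in: the text above describes assembling the contigs together, whereas this example assumes the contigs are already aligned to the same length and simply takes the most common base in each column. The function name `majority_consensus` is hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_contigs):
    """Per-column majority vote over equal-length, pre-aligned contigs.

    A simplification for illustration only, not the assembly-based
    consensus described in the text.
    """
    if not aligned_contigs:
        raise ValueError("need at least one contig")
    length = len(aligned_contigs[0])
    if any(len(seq) != length for seq in aligned_contigs):
        raise ValueError("contigs must be pre-aligned to the same length")

    consensus = []
    for column in zip(*aligned_contigs):
        # most_common(1) returns the single most frequent base in this column.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)
```

With three cells contributing `ACGTAC`, `ACGTAC`, and `ACCTAC`, the single-cell error at the third position is outvoted and the consensus is `ACGTAC`.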