10x Genomics
Chromium Single Cell Immune Profiling

Cell Ranger 6.1, printed on 03/13/2025

V(D)J Annotations

A V(D)J transcript has the following structure:

[Figure: V(D)J transcript structure. UTR: Untranslated region; FWR: Framework region; CDR: Complementarity determining region]

The cellranger vdj pipeline provides amino acid and nucleotide sequences for framework and complementarity determining regions (CDRs). The V(D)J annotations on the assembled contigs and on the clonotype consensus sequences are produced in multiple formats.

Learn more about productive contigs on the Annotation Algorithm page.

File Type Overview

File type	Description
CSV	High-level annotations with one contig, consensus, or clonotype per row.
JSON	Detailed annotations, including alignment coordinates and amino acid translations.
BED	Germline V(D)J segments as features, for use with a tool like IGV.
TSV	Used for the AIRR rearrangement format of VDJ contigs and consensus sequences.

Annotation Files

File	Description
clonotypes.csv	High-level descriptions of each clonotype.
consensus_annotations.csv	High-level and detailed annotations of each clonotype consensus sequence.
filtered_contig_annotations.csv	High-level annotations of each high-confidence, cellular contig. This is a subset of `all_contig_annotations.csv`.
all_contig_annotations.{csv,bed,json}	High-level and detailed annotations of each contig.
airr_rearrangement.tsv	Annotated contigs and consensus sequences of VDJ rearrangements in the AIRR format.
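
Because `filtered_contig_annotations.csv` is described as a subset of `all_contig_annotations.csv`, one quick sanity check is to confirm that every filtered `contig_id` also appears in the full file. The following is a minimal sketch under stated assumptions: the two CSV strings are made-up stand-ins for the real files (with most columns omitted), and the helper name `contig_ids` is hypothetical.

```python
import csv
import io

def contig_ids(handle):
    """Collect the contig_id column from an open CSV file handle."""
    return {row["contig_id"] for row in csv.DictReader(handle)}

# Made-up stand-ins for all_contig_annotations.csv and
# filtered_contig_annotations.csv; real files have many more columns.
ALL_CSV = (
    "barcode,contig_id\n"
    "AAAC-1,AAAC-1_contig_1\n"
    "AAAC-1,AAAC-1_contig_2\n"
    "AAAG-1,AAAG-1_contig_1\n"
)
FILTERED_CSV = (
    "barcode,contig_id\n"
    "AAAC-1,AAAC-1_contig_1\n"
)

all_ids = contig_ids(io.StringIO(ALL_CSV))
filtered_ids = contig_ids(io.StringIO(FILTERED_CSV))

# Any ID listed here would contradict the documented subset relationship.
missing = filtered_ids - all_ids
print(sorted(missing))
```

With real output, the `io.StringIO(...)` objects would be replaced by `open(path, newline="")` handles on the two files.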

Clonotype CSV File (clonotypes.csv)

Column	Description
clonotype_id	The ID of the clonotype to which this consensus sequence was assigned.
frequency	The observed number of cell barcodes with this clonotype.
proportion	The observed fraction of cell barcodes with this clonotype.
cdr3s_aa	A semicolon-delimited list of chain:sequence pairs, where chain is for example TRA, TRB, IGK, IGL, or IGH and sequence is the CDR3 amino acid sequence for that chain.
cdr3s_nt	A semicolon-delimited list of chain:sequence pairs, where chain is for example TRA, TRB, IGK, IGL, or IGH and sequence is the CDR3 nucleotide sequence for that chain.
inkt_evidence	For T cells, this column contains the evidence, if any, that this clonotype is a group of iNKT cells. The evidence is a semicolon-delimited list of `chain:matches` pairs, where chain is one of TRA or TRB and matches is one of `genes`, `junction`, or `genes+junction`. See iNKT/MAIT for more information.
mait_evidence	For T cells, this column contains the evidence, if any, that this clonotype is a group of MAIT cells. The evidence is a semicolon-delimited list of `chain:matches` pairs, where chain is one of TRA or TRB and matches is one of `genes`, `junction`, or `genes+junction`. See iNKT/MAIT for more information.
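
The `cdr3s_aa`, `cdr3s_nt`, `inkt_evidence`, and `mait_evidence` columns all share the same semicolon-delimited `chain:value` layout, so a single helper can unpack them. The sketch below is illustrative only: the function name `parse_chain_pairs` is hypothetical and the example values are made up rather than taken from a real run.

```python
def parse_chain_pairs(field):
    """Split a 'chain:value;chain:value' field into (chain, value) tuples."""
    if not field:
        return []
    pairs = []
    for item in field.split(";"):
        # partition splits on the first ':' only, leaving the value intact.
        chain, _, value = item.partition(":")
        pairs.append((chain, value))
    return pairs

# Made-up example values in the documented layout.
cdr3s_aa = "TRA:CAVSDGGSQGNLIF;TRB:CASSLGQAYEQYF"
inkt_evidence = "TRA:genes;TRB:genes+junction"

print(parse_chain_pairs(cdr3s_aa))
print(parse_chain_pairs(inkt_evidence))
```

A list of tuples is used instead of a dictionary so that a clonotype carrying two chains of the same type keeps both entries.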

Contig Annotation CSV Files (*contig_annotations.csv)

Column	Description
barcode	Cell-barcode for this contig.
is_cell	True or False value indicating whether the barcode was called as a cell.
contig_id	Unique identifier for this contig.
high_confidence	True or False value indicating whether the contig was called as high-confidence (unlikely to be a chimeric sequence or some other artifact).
length	The contig sequence length in nucleotides.
chain	The chain associated with this contig; for example, TRA, TRB, IGK, IGL, or IGH. A value of "Multi" indicates that segments from multiple chains were present.
v_gene	The highest-scoring V segment, for example, TRAV1-1.
d_gene	The highest-scoring D segment, for example, TRBD1.
j_gene	The highest-scoring J segment, for example, TRAJ1-1.
c_gene	The highest-scoring C segment, for example, TRAC.
full_length	Whether the contig was declared full-length.
productive	Whether the contig was declared productive.
fwr1	The predicted FWR1 amino acid sequence.
fwr1_nt	The predicted FWR1 nucleotide sequence.
cdr1	The predicted CDR1 amino acid sequence.
cdr1_nt	The predicted CDR1 nucleotide sequence.
fwr2	The predicted FWR2 amino acid sequence.
fwr2_nt	The predicted FWR2 nucleotide sequence.
cdr2	The predicted CDR2 amino acid sequence.
cdr2_nt	The predicted CDR2 nucleotide sequence.
fwr3	The predicted FWR3 amino acid sequence.
fwr3_nt	The predicted FWR3 nucleotide sequence.
cdr3	The predicted CDR3 amino acid sequence.
cdr3_nt	The predicted CDR3 nucleotide sequence.
fwr4	The predicted FWR4 amino acid sequence.
fwr4_nt	The predicted FWR4 nucleotide sequence.
reads	The number of reads aligned to this contig.
umis	The number of distinct UMIs aligned to this contig.
raw_clonotype_id	The ID of the clonotype to which this cell barcode was assigned.
raw_consensus_id	The ID of the consensus sequence to which this contig was assigned.
exact_subclonotype_id	The ID of the exact subclonotype to which this cell barcode was assigned.

Details on how the Cell Ranger algorithm delimits CDRs (Complementarity Determining Regions) and FWRs (Framework Regions) are provided on the enclone features page.
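
The flag columns in the contig annotation table (`is_cell`, `high_confidence`, `productive`) can be combined to select contigs for downstream analysis. The following is a minimal sketch, assuming the flags are stored as text and that capitalization may differ between files, so the comparison is case-insensitive and any value other than "true" is treated as false. The sample rows are made up and carry only a few of the documented columns; the helper names `is_true` and `select_contigs` are hypothetical.

```python
import csv
import io

def is_true(value):
    """Interpret a True/False text flag, ignoring case and whitespace."""
    return value.strip().lower() == "true"

def select_contigs(rows):
    """Keep contigs that are cell-associated, high-confidence, and productive."""
    return [
        row for row in rows
        if is_true(row["is_cell"])
        and is_true(row["high_confidence"])
        and is_true(row["productive"])
    ]

# Made-up rows with a reduced set of columns.
SAMPLE = """barcode,is_cell,contig_id,high_confidence,chain,productive,umis
AAAC-1,True,AAAC-1_contig_1,True,TRA,True,5
AAAC-1,True,AAAC-1_contig_2,True,TRB,False,2
AAAG-1,False,AAAG-1_contig_1,True,TRB,True,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = select_contigs(rows)
print([row["contig_id"] for row in kept])  # ['AAAC-1_contig_1']
```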

Consensus Annotation CSV Files (consensus_annotations.csv)

Column	Description
clonotype_id	The ID of the clonotype to which this consensus sequence was assigned.
consensus_id	The ID of this consensus sequence.
v_start	0-based index of the V region start position on the consensus sequence.
v_end	0-based index of the V region end position on the consensus sequence.
v_end_ref	0-based index of the V gene end position on the reference.
j_start	0-based index of the J region start position on the consensus sequence.
j_start_ref	0-based index of the J gene start position on the reference.
j_end	0-based index of the J region end position on the consensus sequence.
cdr3_start	0-based index of the CDR3 region start position on the consensus sequence.
cdr3_end	0-based index of the CDR3 region end position on the consensus sequence.

The remaining columns are shared with those under the Contig Annotation CSV Files section.

AIRR Rearrangements TSV File (airr_rearrangement.tsv)

Column	Description
cell_id	Cell barcode defining the cell for the query sequence.
clone_id	Clonotype ID/clonotype assignment.
rev_comp	Set to `false` by default (10x Genomics VDJ sequences are not reverse complemented).
sequence_id	The name of the contig associated with the rearrangement.
sequence	The nucleotide sequence of the rearrangement.
sequence_aa	The amino acid sequence of the rearrangement.
productive	Whether or not the rearrangement is productive.
v_call	The name of the aligned V gene for the rearrangement.
v_cigar	The CIGAR string of the V gene alignment.
v_sequence_start	1-based index on the contig of the V region start position.
v_sequence_end	1-based index on the contig of the V region end position.
d_call	The name of the aligned D gene for the rearrangement.
d_cigar	The CIGAR string of the D gene alignment.
d_sequence_start	1-based index on the contig of the D region start position.
d_sequence_end	1-based index on the contig of the D region end position.
j_call	The name of the aligned J gene for the rearrangement.
j_cigar	The CIGAR string of the J gene alignment.
j_sequence_start	1-based index on the contig of the J region start position.
j_sequence_end	1-based index on the contig of the J region end position.
c_call	The name of the aligned C gene for the rearrangement.
c_cigar	The CIGAR string of the C gene alignment.
c_sequence_start	1-based index on the contig of the C region start position.
c_sequence_end	1-based index on the contig of the C region end position.
sequence_alignment	The aligned sequence of the VDJ rearrangement.
germline_alignment	The assembled, aligned, full-length inferred germline sequence of the aligned sequence.
junction	The nucleotide sequence of the rearrangement's junction (CDR3).
junction_aa	The amino acid sequence of the rearrangement's junction (CDR3).
duplicate_count	The number of unique molecular identifiers associated with this rearrangement.
consensus_count	The number of reads associated with this rearrangement.
junction_length	The length of the rearrangement's junction nucleotide sequence.
junction_aa_length	The length of the rearrangement's junction amino acid sequence.
is_cell	Whether the rearrangement is cell-associated.

The AIRR rearrangement file includes all mandatory AIRR fields and several optional variables to enhance reproducibility and guide analyses.
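
The `*_sequence_start` and `*_sequence_end` columns are 1-based, so they need adjusting before they can be used as Python slices. The sketch below assumes the end position is inclusive, which is how the AIRR Rearrangement schema defines its coordinates; it also checks that the junction length columns agree with the junction sequences. The row is made up, and the helper names `airr_slice` and `junction_lengths_match` are hypothetical.

```python
def airr_slice(sequence, start, end):
    """Extract a region given 1-based, inclusive start and end positions."""
    # Shift the start to 0-based; an inclusive 1-based end equals the
    # exclusive 0-based end, so it is used unchanged.
    return sequence[start - 1:end]

def junction_lengths_match(row):
    """Check junction_length and junction_aa_length against the sequences."""
    return (
        len(row["junction"]) == int(row["junction_length"])
        and len(row["junction_aa"]) == int(row["junction_aa_length"])
    )

# Made-up row; values in a TSV file arrive as strings.
row = {
    "sequence": "GGGATGCATTTCCC",
    "v_sequence_start": "4",
    "v_sequence_end": "9",
    "junction": "TGTGCCTTC",
    "junction_aa": "CAF",
    "junction_length": "9",
    "junction_aa_length": "3",
}

v_region = airr_slice(
    row["sequence"],
    int(row["v_sequence_start"]),
    int(row["v_sequence_end"]),
)
print(v_region)  # ATGCAT
print(junction_lengths_match(row))  # True
```

Rows with no call for a segment may leave the coordinate fields empty, so real code would need to skip those before calling `int`.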
