
10x Genomics
Chromium Single Cell Immune Profiling

V(D)J Annotations

The cellranger vdj pipeline produces V(D)J annotations, in multiple formats, on both the assembled contigs and the clonotype consensus sequences.

File type overview

File type | Description
CSV | High-level annotations with one contig, consensus, or clonotype per row.
JSON | Detailed annotations, including alignment coordinates and amino acid translations.
BED | Germline V(D)J segments as features, for use with a tool like IGV.


Annotation files

File | Description
clonotypes.csv | High-level descriptions of each clonotype.
consensus_annotations.{csv,json} | High-level and detailed annotations of each clonotype consensus sequence.
filtered_contig_annotations.csv | High-level annotations of each high-confidence, cellular contig. This is a subset of all_contig_annotations.csv.
all_contig_annotations.{csv,bed,json} | High-level and detailed annotations of each contig.


Clonotype CSV file (clonotypes.csv)

Column Description
clonotype_id The ID of this clonotype.
frequency The observed number of cell-barcodes with this clonotype.
proportion The observed fraction of cell-barcodes with this clonotype.
cdr3s_aa A semicolon-delimited list of chain:sequence pairs, where "chain" is e.g., TRA or TRB and "sequence" is the CDR3 amino acid sequence for that chain.
cdr3s_nt A semicolon-delimited list of chain:sequence pairs, where "chain" is e.g., TRA or TRB and "sequence" is the CDR3 nucleotide sequence for that chain.
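The cdr3s_aa and cdr3s_nt columns pack several chain:sequence pairs into one field, so they usually need to be split before use. The following is a minimal sketch of one way to do that in Python; the function name parse_cdr3s and the example value are illustrative assumptions, not part of Cell Ranger.

```python
def parse_cdr3s(field):
    """Split a cdr3s_aa or cdr3s_nt value into (chain, sequence) pairs.

    The field is a semicolon-delimited list of chain:sequence pairs.
    A list of tuples is returned (rather than a dict) so that two
    entries with the same chain name are both preserved.
    """
    pairs = []
    for item in field.split(";"):
        item = item.strip()
        if not item:
            continue  # skip empty fragments, e.g. from an empty field
        chain, _, sequence = item.partition(":")
        pairs.append((chain, sequence))
    return pairs


# Made-up example value, for illustration only.
example = "TRA:CAVSDLEPNSSASKIIF;TRB:CASSLGQAYEQYF"
for chain, sequence in parse_cdr3s(example):
    print(chain, sequence)
```

Returning a list keeps the original order of the pairs and does not assume that each chain appears only once per clonotype.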

Contig annotation CSV files (*contig_annotations.csv)

Column Description
barcode Cell-barcode for this contig.
is_cell True/False value indicating whether the barcode was called as a cell.
contig_id Unique identifier for this contig.
high_confidence True/False value indicating whether the contig was called as high-confidence (unlikely to be a chimeric sequence or some other artifact).
length The contig sequence length in nucleotides.
chain The chain associated with this contig; e.g., "TRA" or "TRB". A value of "Multi" indicates that segments from multiple chains were present.
v_gene The highest-scoring V segment, e.g., TRAV1-1.
d_gene The highest-scoring D segment, e.g., TRBD1.
j_gene The highest-scoring J segment, e.g., TRAJ1-1.
c_gene The highest-scoring C segment, e.g., TRAC.
full_length True/False value indicating whether the contig sequence spans the 5′ end of V to the 3′ end of J.
productive True/False/None value indicating whether the transcript is predicted to translate to a protein with a CDR3 region. To be productive, the sequence must span the 5′ end of a V region to the 3′ end of a J region, have a detectable CDR3 region, have a start codon in the expected part of the V sequence, and have no stop codons in the V-J region. "None" indicates that the contig does not span the 5′ end of a V region to the 3′ end of a J region, and so the productivity of the transcript could not be determined.
cdr3 The predicted CDR3 amino acid sequence.
cdr3_nt The predicted CDR3 nucleotide sequence.
reads The number of reads aligned to this contig.
umis The number of distinct UMIs aligned to this contig.
raw_clonotype_id The ID of the clonotype to which this cell-barcode was assigned.
raw_consensus_id The ID of the consensus sequence to which this contig was assigned.
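As an illustration of how the True/False/None columns above can be used, the sketch below reads contig annotation rows with Python's csv module and keeps only contigs from called cells that are high-confidence and productive. The sample rows, the reduced set of columns, and the helper names is_true and select_contigs are assumptions made for this example; this is not a description of how cellranger itself produces filtered_contig_annotations.csv.

```python
import csv
import io


def is_true(value):
    """Interpret a CSV cell as a boolean; 'None' and 'False' are both falsy."""
    return value.strip().lower() == "true"


def select_contigs(rows):
    """Keep contigs that are cellular, high-confidence, and productive."""
    return [
        row
        for row in rows
        if is_true(row["is_cell"])
        and is_true(row["high_confidence"])
        and is_true(row["productive"])
    ]


# Made-up rows with only a subset of the documented columns.
SAMPLE = """barcode,is_cell,contig_id,high_confidence,chain,productive,umis
AAAC-1,True,AAAC-1_contig_1,True,TRA,True,5
AAAC-1,True,AAAC-1_contig_2,True,TRB,None,2
AAAG-1,False,AAAG-1_contig_1,True,TRB,True,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = select_contigs(rows)
print([row["contig_id"] for row in kept])
```

With a real file, the io.StringIO(SAMPLE) argument would be replaced by an open file handle for the contig annotations CSV. Note that a productive value of "None" is treated here the same as "False"; whether that is appropriate depends on the analysis.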

Consensus annotation CSV files (consensus_annotations.csv)

Column Description
clonotype_id The ID of the clonotype to which this consensus sequence was assigned.
consensus_id The ID of this consensus sequence.

The remaining columns are shared with those under the "Contig annotation CSV files" section.
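Because each contig row carries a raw_consensus_id, contigs can be grouped under the consensus sequence they were assigned to. The sketch below shows one possible way to build that grouping; the sample IDs, the function name contigs_by_consensus, and the treatment of empty or "None" values as "unassigned" are assumptions for illustration.

```python
import csv
import io
from collections import defaultdict


def contigs_by_consensus(contig_rows):
    """Map each raw_consensus_id to the list of contig_ids assigned to it."""
    groups = defaultdict(list)
    for row in contig_rows:
        consensus_id = row["raw_consensus_id"].strip()
        # Assumption: unassigned contigs have an empty or "None" value.
        if consensus_id and consensus_id != "None":
            groups[consensus_id].append(row["contig_id"])
    return dict(groups)


# Made-up contig rows with only the columns needed for the grouping.
CONTIGS = """contig_id,raw_clonotype_id,raw_consensus_id
AAAC-1_contig_1,clonotype1,clonotype1_consensus_1
AAAG-1_contig_1,clonotype1,clonotype1_consensus_1
AAAT-1_contig_1,clonotype2,clonotype2_consensus_1
AAAT-1_contig_2,None,None
"""

contig_rows = list(csv.DictReader(io.StringIO(CONTIGS)))
groups = contigs_by_consensus(contig_rows)
for consensus_id, contig_ids in groups.items():
    print(consensus_id, contig_ids)
```

The keys of the resulting dictionary can then be matched against the consensus_id column of consensus_annotations.csv to look up the annotations of each consensus sequence.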