10x Genomics
Chromium Single Cell Immune Profiling

V(D)J Annotations

The cellranger vdj pipeline produces V(D)J annotations for the assembled contigs and for the clonotype consensus sequences, in several file formats.

File type overview

File type | Description
CSV | High-level annotations with one contig, consensus, or clonotype per row.
JSON | Detailed annotations, including alignment coordinates and amino acid translations.
BED | Germline V(D)J segments as features, for use with a tool like IGV.


Annotation files

File | Description
clonotypes.csv | High-level descriptions of each clonotype.
consensus_annotations.{csv,json} | High-level and detailed annotations of each clonotype consensus sequence.
filtered_contig_annotations.csv | High-level annotations of each high-confidence, cellular contig. This is a subset of all_contig_annotations.csv.
all_contig_annotations.{csv,bed,json} | High-level and detailed annotations of each contig.


Clonotype CSV file (clonotypes.csv)

Column Description
clonotype_id The ID of this clonotype.
frequency The observed number of cell-barcodes with this clonotype.
proportion The observed fraction of cell-barcodes with this clonotype.
cdr3s_aa A semicolon-delimited list of chain:sequence pairs, where "chain" is e.g., TRA, TRB, IGK, IGL, or IGH and "sequence" is the CDR3 amino acid sequence for that chain.
cdr3s_nt A semicolon-delimited list of chain:sequence pairs, where "chain" is e.g., TRA, TRB, IGK, IGL, or IGH and "sequence" is the CDR3 nucleotide sequence for that chain.
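
The cdr3s_aa and cdr3s_nt columns pack several chains into a single string, so a small parser can make them easier to work with. The following is a minimal sketch, assuming the field follows the semicolon-delimited chain:sequence layout described above and that a chain may appear more than once. The function name parse_cdr3s and the sample value are hypothetical, made up for illustration.

```python
from collections import defaultdict


def parse_cdr3s(field):
    """Split a 'chain:sequence;chain:sequence' string into {chain: [sequences]}."""
    chains = defaultdict(list)
    for pair in field.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate an empty field or a trailing semicolon
        # Split on the first colon only: left side is the chain, right side the CDR3.
        chain, _, sequence = pair.partition(":")
        chains[chain].append(sequence)
    return dict(chains)


# Hypothetical value, not taken from a real run.
example = "TRA:CAVSDYKLSF;TRB:CASSLGQAYEQYF"
print(parse_cdr3s(example))
```

Storing a list per chain keeps clonotypes with two chains of the same type (for example, two TRA sequences) from silently losing one of them.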

Contig annotation CSV files (*contig_annotations.csv)

Column Description
barcode Cell-barcode for this contig.
is_cell True or False value indicating whether the barcode was called as a cell.
contig_id Unique identifier for this contig.
high_confidence True or False value indicating whether the contig was called as high-confidence (unlikely to be a chimeric sequence or some other artifact).
length The contig sequence length in nucleotides.
chain The chain associated with this contig; e.g., "TRA", "TRB", "IGK", "IGL", or "IGH". A value of "Multi" indicates that segments from multiple chains were present.
v_gene The highest-scoring V segment, e.g., TRAV1-1.
d_gene The highest-scoring D segment, e.g., TRBD1.
j_gene The highest-scoring J segment, e.g., TRAJ1-1.
c_gene The highest-scoring C segment, e.g., TRAC.
full_length A contig annotation is termed full-length if it has a valid V annotation (the contig aligns to at least 50% of the length of any V gene in the reference) and a J gene annotation that extends to within one codon of the 3′ end of the J region.
productive True, False, or None value predicting whether the transcript translates to a protein with a CDR3 region. "None" indicates that a prediction could not be made.
cdr3 The predicted CDR3 amino acid sequence.
cdr3_nt The predicted CDR3 nucleotide sequence.
reads The number of reads aligned to this contig.
umis The number of distinct UMIs aligned to this contig.
raw_clonotype_id The ID of the clonotype to which this cell-barcode was assigned.
raw_consensus_id The ID of the consensus sequence to which this contig was assigned.
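
A common first step with the contig annotation CSV is to keep only contigs that are cellular, high-confidence, and predicted productive. The sketch below uses Python's standard csv module on a few hypothetical rows. It assumes the flag columns are stored as the text values True, False, or None, and it compares case-insensitively because the exact capitalization in a given file is an assumption. The helper names is_true and select_contigs are made up for illustration.

```python
import csv
import io

# Hypothetical rows with a subset of the columns; real files contain all columns listed above.
SAMPLE = """barcode,is_cell,contig_id,high_confidence,chain,productive,umis
AAAC-1,True,AAAC-1_contig_1,True,TRA,True,5
AAAC-1,True,AAAC-1_contig_2,True,TRB,None,3
TTTG-1,False,TTTG-1_contig_1,False,Multi,False,1
"""


def is_true(value):
    """Interpret a True/False/None text field; only 'true' (any case) counts as true."""
    return value.strip().lower() == "true"


def select_contigs(handle):
    """Return rows that are cellular, high-confidence, and predicted productive."""
    reader = csv.DictReader(handle)
    return [
        row
        for row in reader
        if is_true(row["is_cell"])
        and is_true(row["high_confidence"])
        and is_true(row["productive"])
    ]


kept = select_contigs(io.StringIO(SAMPLE))
print([row["contig_id"] for row in kept])
```

To run this against a real file, replace io.StringIO(SAMPLE) with an open file handle, for example open("filtered_contig_annotations.csv", newline=""). Note that a productive value of None is treated as not productive here, which is a design choice rather than a rule from the pipeline.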

Consensus annotation CSV files (consensus_annotations.csv)

Column Description
clonotype_id The ID of the clonotype to which this consensus sequence was assigned.
consensus_id The ID of this consensus sequence.

The remaining columns are shared with those under the "Contig annotation CSV files" section.
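
Because each consensus row carries both a clonotype_id and a consensus_id, the consensus sequences can be grouped by clonotype with a single pass over the file. The following is a minimal sketch using only the standard library. The IDs in the sample are hypothetical placeholders and do not reflect the naming scheme of real output, and the function name consensus_by_clonotype is made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows; real files also include the shared contig annotation columns.
SAMPLE = """clonotype_id,consensus_id,chain
c1,c1_cons_1,TRA
c1,c1_cons_2,TRB
c2,c2_cons_1,TRB
"""


def consensus_by_clonotype(handle):
    """Map each clonotype_id to the list of consensus_id values assigned to it."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        groups[row["clonotype_id"]].append(row["consensus_id"])
    return dict(groups)


groups = consensus_by_clonotype(io.StringIO(SAMPLE))
for clonotype_id, consensus_ids in groups.items():
    print(clonotype_id, consensus_ids)
```

The same clonotype_id key can then be used to look up the matching row in clonotypes.csv.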