Cell Ranger 2.2, printed on 12/21/2024
The cellranger vdj pipeline produces V(D)J annotations, in multiple formats, for both the assembled contigs and the clonotype consensus sequences.
File type | Description
---|---
CSV | High-level annotations with one contig, consensus, or clonotype per row.
JSON | Detailed annotations, including alignment coordinates and amino acid translations.
BED | Germline V(D)J segments as features, for use with a tool like IGV.
File | Description
---|---
clonotypes.csv | High-level descriptions of each clonotype.
consensus_annotations.{csv,json} | High-level and detailed annotations of each clonotype consensus sequence.
filtered_contig_annotations.csv | High-level annotations of each high-confidence, cellular contig. This is a subset of all_contig_annotations.csv.
all_contig_annotations.{csv,bed,json} | High-level and detailed annotations of each contig.
Column | Description |
---|---|
clonotype_id | The ID of the clonotype to which this consensus sequence was assigned. |
frequency | The observed number of cell-barcodes with this clonotype. |
proportion | The observed fraction of cell-barcodes with this clonotype. |
cdr3s_aa | A semicolon-delimited list of chain:sequence pairs, where "chain" is e.g., TRA, TRB, IGK, IGL, or IGH and "sequence" is the CDR3 amino acid sequence for that chain. |
cdr3s_nt | A semicolon-delimited list of chain:sequence pairs, where "chain" is e.g., TRA, TRB, IGK, IGL, or IGH and "sequence" is the CDR3 nucleotide sequence for that chain. |
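
The `cdr3s_aa` and `cdr3s_nt` fields pack several chains into one string, so they usually need to be split before use. The sketch below shows one way to do that in Python. The helper name `parse_cdr3s` and the example sequences are made up for illustration and are not part of Cell Ranger; the only assumption taken from the table above is the semicolon-delimited `chain:sequence` layout.

```python
from collections import defaultdict


def parse_cdr3s(field):
    """Split a 'chain:sequence;chain:sequence' string into {chain: [sequences]}.

    A clonotype may carry more than one sequence for the same chain,
    so each chain maps to a list rather than a single string.
    """
    chains = defaultdict(list)
    for pair in field.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty fields and trailing semicolons
        chain, _, sequence = pair.partition(":")
        chains[chain].append(sequence)
    return dict(chains)


# Hypothetical value, shaped like a cdr3s_aa entry.
example = "TRA:CAVSDLEPNSSASKIIF;TRB:CASSLGQGNTEAFF"
parsed = parse_cdr3s(example)
print(parsed)
```

Keeping a list per chain avoids silently dropping a second sequence when the same chain appears twice in one clonotype.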
Name | Description |
---|---|
barcode | Cell-barcode for this contig. |
is_cell | True or False value indicating whether the barcode was called as a cell. |
contig_id | Unique identifier for this contig. |
high_confidence | True or False value indicating whether the contig was called as high-confidence (unlikely to be a chimeric sequence or some other artifact). |
length | The contig sequence length in nucleotides. |
chain | The chain associated with this contig; e.g., "TRA", "TRB", "IGK", "IGL", or "IGH". A value of "Multi" indicates that segments from multiple chains were present. |
v_gene | The highest-scoring V segment, e.g., TRAV1-1. |
d_gene | The highest-scoring D segment, e.g., TRBD1. |
j_gene | The highest-scoring J segment, e.g., TRAJ1-1. |
c_gene | The highest-scoring C segment, e.g., TRAC. |
full_length | A contig annotation is termed full-length if it has a valid V annotation (the contig aligns with at least 50% of the length of any V gene in the reference) and has a J gene annotation that spans until the 3′ end of the J region within one codon. |
productive | True, False, or None value predicting whether the transcript translates to a protein with a CDR3 region. "None" indicates that a prediction could not be made. |
cdr3 | The predicted CDR3 amino acid sequence. |
cdr3_nt | The predicted CDR3 nucleotide sequence. |
reads | The number of reads aligned to this contig. |
umis | The number of distinct UMIs aligned to this contig. |
raw_clonotype_id | The ID of the clonotype to which this cell-barcode was assigned. |
raw_consensus_id | The ID of the consensus sequence to which this contig was assigned. |
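
A common first step with the contig annotation CSV is to keep only contigs that come from called cells, are high-confidence, and are predicted productive. The sketch below does this with Python's standard `csv` module on a small in-memory sample. The sample rows, the reduced column set, and the helper names `is_true` and `select_contigs` are illustrative assumptions. The exact casing of the True/False strings in a real file is also an assumption, so the comparison is case-insensitive.

```python
import csv
import io


def is_true(value):
    """Interpret a CSV cell as a boolean; 'None' and 'False' both count as not true."""
    return value.strip().lower() == "true"


def select_contigs(rows):
    """Keep rows that are cell-associated, high-confidence, and productive."""
    return [
        row
        for row in rows
        if is_true(row["is_cell"])
        and is_true(row["high_confidence"])
        and is_true(row["productive"])
    ]


# Made-up rows using a subset of the columns described above.
SAMPLE = """barcode,is_cell,contig_id,high_confidence,chain,productive,umis
AAAC-1,True,AAAC-1_contig_1,True,TRA,True,5
AAAC-1,True,AAAC-1_contig_2,True,TRB,None,3
AAAG-1,False,AAAG-1_contig_1,True,TRB,True,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = select_contigs(rows)
for row in kept:
    print(row["contig_id"], row["chain"], row["umis"])
```

For a real run, replace the `io.StringIO(SAMPLE)` source with an open handle on all_contig_annotations.csv. Treating "None" as not productive is a deliberate choice here; an analysis that wants to keep undetermined contigs would need a separate check.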
Column | Description |
---|---|
clonotype_id | The ID of the clonotype to which this consensus sequence was assigned. |
consensus_id | The ID of this consensus sequence. |
The remaining columns are shared with those under the "Contig annotation CSV files" section.