Cell Ranger 6.1, printed on 10/13/2024
The per_sample_outs/ directory is produced after a successful run of the cellranger multi pipeline and contains filtered data, i.e., data from the cell-associated barcodes in a given sample. These are the main outputs of interest.
The contents of the following folders within the per_sample_outs/ directory are described here: count/, vdj_t/, and vdj_b/.
Refer to the count and vdj pages for detailed explanations.
The count/ folder contains the results of 5' single cell gene expression analysis. The count directory looks like this:
count
├── analysis
├── cloupe.cloupe
├── sample_alignments.bam
├── sample_alignments.bam.bai
├── sample_barcodes.csv
├── sample_feature_bc_matrix
├── sample_feature_bc_matrix.h5
└── sample_molecule_info.h5
File/Folder | Description |
---|---|
analysis | Folder containing the results of graph-based clustering and K-means clustering (K = 2 to 10); differential gene expression analysis between clusters; and PCA, t-SNE, and UMAP dimensionality reduction. |
cloupe.cloupe | A Loupe Browser readable file. |
sample_alignments.bam | Indexed BAM file containing position-sorted reads aligned to the genome and transcriptome, as well as unaligned reads. |
sample_alignments.bam.bai | Companion file to sample_alignments.bam that serves as an external index. |
sample_barcodes.csv | File containing a list of barcodes associated with aligned reads. Each barcode sequence ends in a suffix consisting of a dash separator followed by a number. The number denotes a GEM well and is used to virtualize barcodes in order to achieve a higher effective barcode diversity when combining samples generated from separate GEM chip channel runs. The number should be "1" across all barcodes when analyzing a sample from a single GEM well. The suffix-based preservation of GEM well information is especially useful when running cellranger aggr on multiple libraries generated from different GEM chip channels. |
sample_feature_bc_matrix | Folder containing the feature-barcode matrix, restricted to detected cell-associated barcodes. Each element of the matrix is the number of UMIs associated with a feature (row) and a barcode (column). The matrix can be loaded into third-party packages, where users can wrangle it further (e.g., filter outlier cells, run dimensionality reduction, normalize gene expression). It is similar to the filtered_feature_bc_matrix output described on the count page. |
sample_feature_bc_matrix.h5 | Same information as sample_feature_bc_matrix, in H5 format. |
sample_molecule_info.h5 | Contains per-molecule information for all molecules that contain a valid barcode and a valid UMI and were assigned with high confidence to a gene or Feature Barcode. This file is a required input for cellranger aggr. |
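The GEM well suffix described for sample_barcodes.csv can be illustrated with a short Python sketch. It works on barcode strings only; how the strings are read from the file is left out, because the column layout of sample_barcodes.csv is not described here. The function names (`split_barcode`, `gem_well_counts`) and the example barcodes are hypothetical and are not part of Cell Ranger.

```python
from __future__ import annotations

from collections import Counter


def split_barcode(barcode: str) -> tuple[str, int]:
    """Split a barcode such as 'AAACCTGAGAAACCAT-1' into (sequence, GEM well)."""
    sequence, dash, suffix = barcode.rpartition("-")
    if not dash or not sequence or not suffix.isdigit():
        raise ValueError(f"barcode has no numeric GEM well suffix: {barcode!r}")
    return sequence, int(suffix)


def gem_well_counts(barcodes) -> Counter:
    """Count how many barcodes belong to each GEM well."""
    return Counter(split_barcode(b)[1] for b in barcodes)


# Hypothetical barcodes, made up for illustration only.
example_barcodes = [
    "AAACCTGAGAAACCAT-1",
    "AAACCTGAGAAACCGC-1",
    "AAACCTGAGAAACCTA-2",
]
example_counts = gem_well_counts(example_barcodes)
```

For a sample from a single GEM well, `gem_well_counts` should report only well 1. More than one key suggests the barcodes come from aggregated or multi-well data.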
The vdj_t/ and vdj_b/ folders contain the results of V(D)J immune profiling analysis for T cells and B cells, respectively. The output file names and file structure in the two folders are identical, so they are described only once. The vdj_t and vdj_b directories have this structure:
vdj_t (or vdj_b)
├── airr_rearrangement.tsv
├── cell_barcodes.json
├── clonotypes.csv
├── concat_ref.bam
├── concat_ref.bam.bai
├── concat_ref.fasta
├── concat_ref.fasta.fai
├── consensus_annotations.csv
├── consensus.bam
├── consensus.bam.bai
├── consensus.fasta
├── consensus.fasta.fai
├── filtered_contig_annotations.csv
├── filtered_contig.fasta
├── filtered_contig.fastq
├── vdj_contig_info.pb
└── vloupe.vloupe
File/Folder | Description |
---|---|
airr_rearrangement.tsv | Annotated contigs and consensus sequences of V(D)J rearrangements in the AIRR format. |
cell_barcodes.json | List of barcodes identified as T/B cells. |
clonotypes.csv | High-level descriptions of each clonotype. |
concat_ref.bam | BAM file in which each reference sequence is the concatenation of the annotated germline segments for one clonotype consensus. This file shows how both the per-cell contigs and the clonotype consensus contig relate to the germline reference. concat_ref.bam is expected to reveal polymorphisms, somatic mutations, and recombination-induced differences such as non-templated nucleotide additions. |
concat_ref.bam.bai | Companion file to concat_ref.bam that serves as an external index. |
concat_ref.fasta | Concatenated V(D)J reference segments for the segments detected on each consensus sequence. These serve as an approximate reference for each consensus sequence. |
concat_ref.fasta.fai | Companion file to concat_ref.fasta that serves as an external index. |
consensus_annotations.csv | High-level and detailed annotations of each clonotype consensus sequence. |
consensus.bam | BAM file in which each reference sequence is a clonotype consensus sequence and each record is an alignment of a single cell's contig against that consensus. For a given clonotype consensus sequence, this file shows how the constituent per-cell assemblies support the consensus. |
consensus.bam.bai | Companion file to consensus.bam that serves as an external index. |
consensus.fasta | Clonotype consensus sequences. |
consensus.fasta.fai | Companion file to consensus.fasta that serves as an external index. |
filtered_contig_annotations.csv | High-level annotations of each high-confidence, cellular contig. This is a subset of all_contig_annotations.csv. |
filtered_contig.fasta | High-confidence contig sequences in cell barcodes, in FASTA format. |
filtered_contig.fastq | High-confidence contig sequences in cell barcodes, in FASTQ format. |
vdj_contig_info.pb | Stores the contig annotations, the V(D)J reference, and additional metadata in a protobuf binary file format. This file is required to run the cellranger aggr pipeline. |
vloupe.vloupe | Loupe V(D)J Browser readable file. |
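To show what the .fasta.fai companion files make possible, here is a minimal Python sketch that uses an index to fetch one sequence from a FASTA file without scanning the whole file. It assumes the .fai files follow the standard samtools faidx layout (tab-separated name, length, byte offset, bases per line, bytes per line), which is not stated on this page. The helper names (`read_fai`, `fetch_sequence`) and the demo records are hypothetical, and the demo writes its own tiny files rather than reading real Cell Ranger output.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def read_fai(fai_path) -> dict[str, tuple[int, int, int, int]]:
    """Parse a faidx-style index into {name: (length, offset, linebases, linewidth)}."""
    index = {}
    with open(fai_path) as handle:
        for line in handle:
            if not line.strip():
                continue
            name, length, offset, linebases, linewidth = line.rstrip("\n").split("\t")[:5]
            index[name] = (int(length), int(offset), int(linebases), int(linewidth))
    return index


def fetch_sequence(fasta_path, index, name: str) -> str:
    """Return the full sequence of one record by seeking to its byte offset."""
    length, offset, linebases, linewidth = index[name]
    if length == 0:
        return ""
    # Bytes on disk: full lines include their line terminator, the last partial line does not.
    full_lines, remainder = divmod(length, linebases)
    n_bytes = full_lines * linewidth + remainder
    with open(fasta_path, "rb") as handle:
        handle.seek(offset)
        raw = handle.read(n_bytes)
    return raw.decode("ascii").replace("\n", "").replace("\r", "")[:length]


# Demo on made-up records (hypothetical names and sequences).
fasta_text = ">demo_consensus_1\nACGTACGT\nAC\n>demo_consensus_2\nGGGG\n"
fai_text = (
    f"demo_consensus_1\t10\t{fasta_text.index('ACGTACGT')}\t8\t9\n"
    f"demo_consensus_2\t4\t{fasta_text.index('GGGG')}\t4\t5\n"
)
with tempfile.TemporaryDirectory() as tmp:
    fasta_path = Path(tmp) / "demo.fasta"
    fai_path = Path(tmp) / "demo.fasta.fai"
    fasta_path.write_bytes(fasta_text.encode("ascii"))
    fai_path.write_bytes(fai_text.encode("ascii"))
    demo_index = read_fai(fai_path)
    demo_first = fetch_sequence(fasta_path, demo_index, "demo_consensus_1")
    demo_second = fetch_sequence(fasta_path, demo_index, "demo_consensus_2")
```

Under the same assumption, the two helpers could be pointed at consensus.fasta and consensus.fasta.fai, or at concat_ref.fasta and its index. For routine work, an established tool such as samtools faidx is likely a better choice than hand-rolled parsing.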