Space Ranger 2.0 (latest), printed on 05/28/2023
ALIGN_AND_COUNT aligns the reads to the reference transcriptome and counts the number of reads and molecules per gene and barcode.
ALIGN_FIDUCIALS determines the position and orientation of the fiducial alignment grid in the tissue image.
ANALYZER_PREFLIGHT performs a series of preflight checks to validate inputs specific to the secondary analysis.
BARCODE_CORRECTION corrects sequencing errors in barcodes, allowing up to one mismatch.
CALCULATE_TARGETED_METRICS computes summary metrics and estimates gene enrichment for targeted and probe-based libraries.
CHOOSE_DIMENSION_REDUCTION determines which PCA stage to execute. If Chemistry Batch Correction is enabled, the pipeline runs the RUN_FBPCA stage; otherwise, it runs the RUN_PCA stage.
CHOOSE_DIMENSION_REDUCTION_OUTPUT retrieves the dimension-reduced matrix from the RUN_FBPCA stage (with Chemistry Batch Correction) or the RUN_PCA stage.
CLOUPE_PREPROCESS produces the input file for Loupe Browser.
CLOUPE_TILE_IMAGES prepares a high-resolution tiled version of the tissue image for use in Loupe Browser.
COLLATE_METRICS outputs the per-barcode metrics per_barcode_metrics.csv.
COMBINE_CLUSTERING combines the different clustering results into a single file.
DETECT_TISSUE determines the regions of the slide covered by tissue.
DISABLE_BAMS disables the creation of BAM file output containing aligned reads, saving time and disk space if not needed.
FILTER_BARCODES removes barcodes not associated with spots under tissue.
GPR_READER reads the input file containing spot position data.
MAKE_SHARD reads the FASTQ files and extracts the barcode and UMI sequences.
MERGE_CLUSTERS merges graph-based clusters which have insufficient differential expression between them.
MERGE_METRICS outputs the metrics_summary.csv file.
MULTI_SETUP_CHUNKS determines the set of input FASTQ files and their associated metadata.
PARSE_TARGET_FEATURES parses the target panel or probe set reference CSV file to extract metadata and writes out the IDs of targeted genes and their indices within the transcriptome reference.
PREPROCESS_MATRIX prepares the feature-barcode matrix for secondary analysis.
RUN_DIFFERENTIAL_EXPRESSION identifies the most differentially expressed genes in each cluster relative to other clusters.
RUN_GRAPH_CLUSTERING runs graph-based clustering and modularity optimization to partition the data into subpopulations based on PCA results.
RUN_KMEANS runs k-means clustering to partition the data into subpopulations based on PCA results.
RUN_PCA runs Principal Component Analysis (PCA) to reduce gene expression to its most highly variable components.
RUN_TSNE runs t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize PCA results in two dimensions.
RUN_UMAP runs the Uniform Manifold Approximation and Projection (UMAP) algorithm to visualize PCA results in two dimensions.
RUN_SPATIAL_ENRICHMENT computes the Moran's I spatial autocorrelation metric for detected genes.
SET_ALIGNER_SUBSAMPLE_RATE sets the rate of subsampling for targeted spatial gene expression.
SPACERANGER_PREFLIGHT performs a series of preflight checks to confirm input arguments are valid before proceeding with analysis stages.
SPACERANGER_PREFLIGHT_LOCAL performs a series of preflight checks to confirm input arguments are valid before proceeding with analysis stages.
SPATIAL_REPORTER summarizes the image analysis results.
SPATIAL_RNA_COUNTER_PREP prepares various inputs to the main pipeline.
STANDARDIZE_IMAGES converts the tissue image into a standard format and size for downstream image processing.
SUBSAMPLE_OFF_TARGET_READS recomputes metrics for targeted and probe-based libraries on lower-depth versions of data from non-targeted genes.
SUBSAMPLE_ON_TARGET_READS recomputes metrics for targeted and probe-based libraries on lower-depth versions of data from targeted genes.
SUBSAMPLE_READS recomputes metrics on lower-depth versions of the data.
SUMMARIZE_ANALYSIS combines results of secondary steps into a single directory structure.
SUMMARIZE_BASIC_REPORTS summarizes the barcoding and alignment results.
SUMMARIZE_TARGETED_ANALYSIS merges the JSON files from the targeted analyzer into one and adds target set information for building the web_summary.html.
WRITE_BARCODE_INDEX assigns a distinct integer to each barcode sequence.
WRITE_BARCODE_SUMMARY outputs the barcode summary HDF5 file barcode_summary.h5.
WRITE_GENE_INDEX emits a JSON file of the parsed gene index for use by the Python code.
WRITE_H5_MATRIX outputs the feature-barcode matrix in the HDF5 file raw_feature_bc_matrix.h5.
WRITE_MATRIX_MARKET outputs the feature-barcode matrix in the Matrix Market file raw_feature_bc_matrix/matrix.mtx.gz.
WRITE_MOLECULE_INFO writes the molecule_info.h5 file, which contains per-molecule information for all molecules that have a valid barcode and a valid UMI and were assigned with high confidence to a gene.
WRITE_POS_BAM outputs the indexed BAM file containing position-sorted reads to the possorted_genome_bam.bam and possorted_genome_bam.bam.bai files.
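
As an illustration of the BARCODE_CORRECTION entry above, the following is a minimal sketch of what "allowing up to one mismatch" can mean: an observed barcode that is not on the whitelist is replaced by a whitelisted barcode that differs at exactly one position. The function name correct_barcode, the toy whitelist, and the rule of leaving ambiguous barcodes uncorrected are assumptions made for this example; the glossary does not describe how Space Ranger resolves ties or weighs base qualities, so this should not be read as its actual implementation.

```python
from typing import Optional, Set

BASES = "ACGT"


def correct_barcode(observed: str, whitelist: Set[str]) -> Optional[str]:
    """Return a whitelisted barcode within Hamming distance 1, or None.

    Hypothetical helper: exact matches are returned unchanged. Otherwise
    every single-base substitution is tried, and a correction is made only
    when exactly one substitution lands on the whitelist.
    """
    if observed in whitelist:
        return observed

    candidates = set()
    for i, original in enumerate(observed):
        for base in BASES:
            if base == original:
                continue
            candidate = observed[:i] + base + observed[i + 1:]
            if candidate in whitelist:
                candidates.add(candidate)

    # Assumption: zero or several possible targets means "do not correct".
    return candidates.pop() if len(candidates) == 1 else None


if __name__ == "__main__":
    toy_whitelist = {"AAAA", "CCCC", "AACC"}
    for barcode in ["AAAT", "AAAC", "GGGG", "ANAA"]:
        print(barcode, "->", correct_barcode(barcode, toy_whitelist))
```

In this toy whitelist, AAAC is one mismatch away from both AAAA and AACC, so the sketch leaves it uncorrected instead of guessing.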
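
To make the Moran's I statistic named in the RUN_SPATIAL_ENRICHMENT entry more concrete, here is a small sketch of the textbook formula, I = (N / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2, where N is the number of spots, m is the mean expression, and W is the sum of all weights. The function name morans_i and the binary neighbor weights are assumptions for illustration; the glossary does not state how Space Ranger defines spot neighborhoods or weights.

```python
from typing import Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Compute Moran's I for one gene across spots.

    values[i] is the expression at spot i; weights[i][j] is the spatial
    weight between spots i and j (assumed here to be 1 for neighbors and
    0 otherwise, with a zero diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    total_weight = sum(sum(row) for row in weights)
    variance_sum = sum(d * d for d in dev)
    if total_weight == 0 or variance_sum == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")

    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_weight) * (cross / variance_sum)


if __name__ == "__main__":
    # Four spots in a row; each spot neighbors the adjacent spots.
    chain = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print(morans_i([1, 1, 0, 0], chain))  # spatially clustered expression
    print(morans_i([1, 0, 1, 0], chain))  # alternating expression
```

With these toy inputs, the clustered pattern gives a positive value and the alternating pattern gives a negative one, which is the behavior that makes the statistic useful for flagging spatially structured genes.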