10x Genomics
Visium Spatial Gene Expression

Single-Library Analysis with spaceranger count

Space Ranger's pipelines analyze sequencing data produced from Visium Spatial Gene Expression. The analysis involves the following steps:

  1. Run spaceranger mkfastq on the Illumina BCL output folder to generate FASTQ files.

  2. Run spaceranger count on each Capture Area that was demultiplexed by spaceranger mkfastq.

For the following example, assume that the Illumina BCL output is in a folder named /sequencing/140101_D00123_0111_AHAWT7ADXX.

Run spaceranger mkfastq

First, follow the instructions on running spaceranger mkfastq to generate FASTQ files. For example, if the flowcell serial number was HAWT7ADXX, then spaceranger mkfastq will output FASTQ files in HAWT7ADXX/outs/fastq_path.
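As a rough sketch of this step, the snippet below assembles a spaceranger mkfastq command for the example flowcell and prints it rather than running it. The sample sheet path /home/jdoe/runs/samplesheet.csv is a hypothetical placeholder, and the --id, --run, and --csv flags are assumed to work as described in the mkfastq instructions, so confirm them with spaceranger mkfastq --help. Deriving the flowcell serial number from the run folder name is a convenience that holds for this example and may not hold for every instrument.

```shell
#!/usr/bin/env bash
# Dry-run sketch: build and print a mkfastq command instead of executing it.
bcl_dir=/sequencing/140101_D00123_0111_AHAWT7ADXX   # BCL folder from the example
sample_sheet=/home/jdoe/runs/samplesheet.csv        # hypothetical placeholder

# In this example the last underscore-separated field is the flowcell side
# letter followed by the serial number: AHAWT7ADXX -> HAWT7ADXX.
run_name=${bcl_dir##*/}
flowcell=${run_name##*_}
flowcell=${flowcell#?}

mkfastq_cmd="spaceranger mkfastq --id=${flowcell} --run=${bcl_dir} --csv=${sample_sheet}"
echo "$mkfastq_cmd"
```

Using the flowcell serial number as the run ID is what places the FASTQ files under HAWT7ADXX/outs/fastq_path in this example.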

Run spaceranger count

Automatic Alignment

To generate spatial feature counts for a single library using automatic fiducial alignment and tissue detection, run spaceranger count with the following arguments. For a complete listing of the accepted arguments, see the Command-Line Argument Reference below, or run spaceranger count --help.

After determining these input arguments, run spaceranger:

$ cd /home/jdoe/runs
$ spaceranger count --id=sample345 \
                   --transcriptome=/opt/refdata/GRCh38-3.0.0 \
                   --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
                   --sample=mysample \
                   --image=/home/jdoe/runs/images/sample345.tif \
                   --slide=V19J01-123 \
                   --area=A1 \
                   --localcores=8 \
                   --localmem=64 
                   

Manual Alignment

To generate spatial feature counts for a single library using a fiducial alignment and tissue assignment JSON file generated in Loupe Browser, run spaceranger count with the following arguments.

After determining these input arguments, run spaceranger:

$ cd /home/jdoe/runs
$ spaceranger count --id=sample345 \
                   --transcriptome=/opt/refdata/GRCh38-3.0.0 \
                   --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
                   --sample=mysample \
                   --image=/home/jdoe/runs/images/sample345.tif \
                   --slide=V19J01-123 \
                   --area=A1 \
                   --loupe-alignment=sample345.json \
                   --localcores=8 \
                   --localmem=64 
                   

Following a set of preflight checks to validate input arguments, spaceranger count pipeline stages will begin to run:

Martian Runtime - 3.2.5
 
Running preflight checks (please wait)...
2016-11-10 14:23:52 [runtime] (ready)           ID.sample345.SPATIAL_RNA_COUNTER_CS.SPATIAL_RNA_COUNTER_PREP.SETUP_CHUNKS
2016-11-10 14:23:55 [runtime] (split_complete)  ID.sample345.SPATIAL_RNA_COUNTER_CS.SPATIAL_RNA_COUNTER_PREP.SETUP_CHUNKS
2016-11-10 14:23:55 [runtime] (run:local)       ID.sample345.SPATIAL_RNA_COUNTER_CS.SPATIAL_RNA_COUNTER_PREP.SETUP_CHUNKS.fork0.chnk0.main
...

By default, spaceranger will use all of the cores available on your system to execute pipeline stages. You can specify a different number of cores to use with the --localcores option; for example, --localcores=16 will limit spaceranger to using up to sixteen cores at once. Similarly, --localmem will restrict the amount of memory (in GB) used by spaceranger.
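One way to pick values for these two options is sketched below. The helper name resource_flags is hypothetical: it caps the requested cores and memory at what the machine offers and prints the matching flags. On Linux, the available values could come from nproc and /proc/meminfo, which is an assumption about your system and not something spaceranger requires.

```shell
# Print --localcores/--localmem flags, capped at what the machine offers.
# Arguments: available cores, available memory (GB), wanted cores, wanted memory (GB).
resource_flags() {
    local avail_cores=$1 avail_mem_gb=$2 want_cores=$3 want_mem_gb=$4
    local cores=$(( want_cores < avail_cores ? want_cores : avail_cores ))
    local mem=$(( want_mem_gb < avail_mem_gb ? want_mem_gb : avail_mem_gb ))
    echo "--localcores=${cores} --localmem=${mem}"
}

# Example: a 32-core, 128 GB machine where the run should stay within 8 cores and 64 GB.
resource_flags 32 128 8 64
```

The printed flags can be appended to the spaceranger count command line shown earlier.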

The pipeline will create a new folder named with the sample ID you specified (e.g. /home/jdoe/runs/sample345) for its output. If this folder already exists, spaceranger will assume it is an existing pipestance and attempt to resume running it.
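Because an existing folder is treated as a pipestance to resume, it can help to check for one before launching. The sketch below uses a hypothetical helper, pipestance_state, that only inspects the file system and does not call spaceranger.

```shell
# Report whether a run ID would start a fresh pipestance or reuse an existing folder.
# Arguments: directory the run is launched from, run ID passed to --id.
pipestance_state() {
    local parent_dir=$1 run_id=$2
    if [ -e "${parent_dir}/${run_id}" ]; then
        echo "existing"   # spaceranger would attempt to resume this folder
    else
        echo "new"        # spaceranger would create the folder
    fi
}

pipestance_state /home/jdoe/runs sample345
```

If the result is "existing" and a fresh run is intended, choose a different --id or move the old folder aside first.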

Slide Serial and Capture Area Parameters

The spaceranger count pipeline accepts slide serial and capture area arguments, in order to use the most precise fiducial and spot coordinates for an experiment. The easiest way to pass this information to spaceranger count is via the --slide and --area arguments. When --slide is specified, the pipeline will download the layout file associated with the supplied serial number. If spaceranger is run in an environment without access to the outside Internet, follow the instructions below in order to download a slide file locally.

If you do not know the serial number or capture area associated with the experiment, you can still run spaceranger via the --unknown-slide option. When specified, spaceranger will use a default layout file for spot and fiducial coordinates. The typical per-spot difference between the default layout and a specific slide is under 10 microns.
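The choice between the two modes can be sketched as a small helper. The function name slide_args is hypothetical: it emits --slide and --area when both values are known, falls back to --unknown-slide otherwise, and rejects capture areas other than A1, B1, C1, and D1.

```shell
# Choose slide-related arguments for spaceranger count.
# Arguments: slide serial number (may be empty), capture area (may be empty).
slide_args() {
    local serial=$1 area=$2
    if [ -z "$serial" ] || [ -z "$area" ]; then
        echo "--unknown-slide"   # default layout file will be used
        return 0
    fi
    case "$area" in
        A1|B1|C1|D1) echo "--slide=${serial} --area=${area}" ;;
        *) echo "unsupported capture area: ${area}" >&2; return 1 ;;
    esac
}

slide_args V19J01-123 A1   # serial and area known
slide_args "" ""           # neither known
```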

Downloading a Slide File for Local Operation

If spaceranger is to be run in an environment without access to the Internet, the pipeline requires a Visium slide layout file, supplied via the --slidefile argument. You can download a layout file for a Visium slide below. Enter the serial number of the slide (e.g., V19S01-123) and press 'Download'. The layout file will start to download.

Output Files

A successful spaceranger count run concludes with a message similar to this:

2016-11-10 16:10:09 [runtime] (join_complete)   ID.sample345.SPATIAL_RNA_COUNTER_CS.SPATIAL_RNA_COUNTER_CS.SUMMARIZE_REPORTS
 
Outputs:
- Run summary HTML:                         /opt/sample345/outs/web_summary.html
- Outputs of spatial pipeline:              /opt/sample345/outs/spatial
- Run summary CSV:                          /opt/sample345/outs/metrics_summary.csv
- BAM:                                      /opt/sample345/outs/possorted_genome_bam.bam
- BAM index:                                /opt/sample345/outs/possorted_genome_bam.bam.bai
- Filtered feature-barcode matrices MEX:    /opt/sample345/outs/filtered_feature_bc_matrix
- Filtered feature-barcode matrices HDF5:   /opt/sample345/outs/filtered_feature_bc_matrix.h5
- Unfiltered feature-barcode matrices MEX:  /opt/sample345/outs/raw_feature_bc_matrix
- Unfiltered feature-barcode matrices HDF5: /opt/sample345/outs/raw_feature_bc_matrix.h5
- Secondary analysis output CSV:            /opt/sample345/outs/analysis
- Per-molecule read information:            /opt/sample345/outs/molecule_info.h5
- Loupe Browser file:                       /opt/sample345/outs/cloupe.cloupe
 
Pipestance completed successfully!

The output of the pipeline is contained in a folder named with the sample ID you specified (e.g. sample345). The subfolder named outs contains the main pipeline output files:

File Name                            Description
web_summary.html                     Run summary metrics and charts in HTML format
spatial                              Directory containing QC images for aligned fiducials and detected tissue in jpg format, scalefactors_json.json, high and low resolution versions of the input image in png format, and tissue_positions_list.csv
spatial/aligned_fiducials.jpg        Aligned fiducials QC image
spatial/detected_tissue_image.jpg    Detected tissue QC image
spatial/tissue_hires_image.png       Full resolution image downsampled to 2k pixels on the longest dimension
spatial/tissue_lowres_image.png      Full resolution image downsampled to 600 pixels on the longest dimension
spatial/tissue_positions_list.csv    CSV containing the spot barcode, whether the spot was called in (1) or out (0) of tissue, the array position, and the image pixel positions x and y for the full resolution image
spatial/scalefactors_json.json       Contains the estimated spot diameter in pixels for the full resolution original image; tissue_hires_scalef, the spot position multiplier in pixels for the high resolution image; the estimated fiducial spot diameter in pixels for the full resolution original image; and tissue_lowres_scalef, the spot position multiplier in pixels for the low resolution image
metrics_summary.csv                  Run summary metrics in CSV format
possorted_genome_bam.bam             Reads aligned to the genome and transcriptome, annotated with barcode information
possorted_genome_bam.bam.bai         Index for possorted_genome_bam.bam
filtered_feature_bc_matrix           Filtered feature-barcode matrices containing only spot barcodes, in MEX format
filtered_feature_bc_matrix.h5        Filtered feature-barcode matrices containing only spot barcodes, in HDF5 format
raw_feature_bc_matrix                Unfiltered feature-barcode matrices containing all barcodes, in MEX format
raw_feature_bc_matrix.h5             Unfiltered feature-barcode matrices containing all barcodes, in HDF5 format
analysis                             Secondary analysis data including dimensionality reduction, spot clustering, and differential expression
molecule_info.h5                     Molecule-level information used by spaceranger aggr to aggregate samples into larger datasets
cloupe.cloupe                        Loupe Browser visualization and analysis file
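A quick sanity check on a finished run is to confirm that the expected entries exist in outs. The sketch below uses a hypothetical helper, missing_outputs, with file names taken from the Outputs message printed at the end of the run. It assumes a default run: with --nosecondary, the analysis folder would presumably be absent.

```shell
# List expected entries that are missing from an outs folder.
# Prints one missing name per line; returns non-zero if anything is missing.
missing_outputs() {
    local outs_dir=$1 name status=0
    for name in web_summary.html spatial metrics_summary.csv \
                possorted_genome_bam.bam possorted_genome_bam.bam.bai \
                filtered_feature_bc_matrix filtered_feature_bc_matrix.h5 \
                raw_feature_bc_matrix raw_feature_bc_matrix.h5 \
                analysis molecule_info.h5 cloupe.cloupe; do
        if [ ! -e "${outs_dir}/${name}" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}

# Usage (path from the example run):
#   missing_outputs /home/jdoe/runs/sample345/outs
```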

Once spaceranger count has successfully completed, you can browse the resulting summary HTML file in any supported web browser, open the .cloupe file in Loupe Browser, or refer to the Understanding Output section to explore the data by hand.
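As a small example of exploring the data by hand, the sketch below counts the spots called in tissue from tissue_positions_list.csv. It assumes the in-tissue flag is the second comma-separated column, as in the file description above; the helper name count_in_tissue is hypothetical.

```shell
# Count spots called in tissue (flag 1 in the second column) in a positions CSV.
count_in_tissue() {
    awk -F, '$2 == 1 { n++ } END { print n + 0 }' "$1"
}

# Usage (path from the example run):
#   count_in_tissue /home/jdoe/runs/sample345/outs/spatial/tissue_positions_list.csv
```

Any row whose second field is not exactly 1 is ignored, so spots outside tissue do not contribute to the count.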

Command-Line Argument Reference

Argument            Description
--id                A unique run ID string, e.g. sample345
--fastqs            Either:
                    Path of the fastq_path folder generated by spaceranger mkfastq, e.g. /home/jdoe/runs/HAWT7ADXX/outs/fastq_path. This contains a directory hierarchy that spaceranger count will automatically traverse.
                    - OR -
                    Any folder containing fastq files, for example if the fastq files were generated by a service provider and delivered outside the context of the mkfastq output directory structure.
                    Can take multiple comma-separated paths, which is helpful if the same library was sequenced on multiple flowcells.
                    Doing this will treat all reads from the library, across flowcells, as one sample.
--sample            Sample name as specified in the sample sheet supplied to spaceranger mkfastq.
                    Can take multiple comma-separated values, which is helpful if the same library was sequenced on multiple flowcells and the sample name used (and therefore the fastq file prefix) is not identical between them.
                    Doing this will treat all reads from the library, across flowcells, as one sample.
                    Allowable characters in sample names are letters, numbers, hyphens, and underscores.
--transcriptome     Path to the Space Ranger-compatible transcriptome reference, e.g. /opt/GRCh38-3.0.0
--image             Brightfield tissue H&E image in .jpg or .tiff format.
--slide             Visium slide serial number. Required unless --unknown-slide is passed.
--area              Visium capture area identifier. Required unless --unknown-slide is passed. Options for Visium are A1, B1, C1, and D1.
--slidefile         Slide layout file indicating capture spot and fiducial spot positions.
--loupe-alignment   Alignment file produced by the manual Loupe alignment step. An --image must be supplied in this case.
--unknown-slide     Set this if the slide serial number and area identifier are unknown. Setting this will cause Space Ranger to use default spot positions. Not compatible with --slide, --area, or --slidefile.
--nosecondary       (optional) Disable secondary analysis, e.g. dimensionality reduction, clustering, and visualization.
--r1-length         (optional) Hard-trim the input R1 sequence to this length. Note that the length includes the Barcode and UMI sequences, so do not set this below 28. This and --r2-length are useful for determining the optimal read length for sequencing.
--r2-length         (optional) Hard-trim the input R2 sequence to this length.
--lanes             (optional) Lanes associated with this sample.
--localcores        Restricts spaceranger to the specified number of cores when executing pipeline stages. By default, spaceranger will use all of the cores available on your system.
--localmem          Restricts spaceranger to the specified amount of memory (in GB) when executing pipeline stages. By default, spaceranger will use 90% of the memory available on your system.
--indices           (Deprecated. Optional. Only used for output from spaceranger demux) Sample indices associated with this sample. Comma-separated list of:
                      1. index set plate well: SI-3A-A1
                      2. index sequences: TCGCCATA,GTATACAC
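Two of the rules above lend themselves to a quick pre-flight check in the shell: the allowed characters in sample names and the comma-separated form of --fastqs. The helpers valid_sample_name and join_with_commas are hypothetical names, and the second flowcell path in the usage lines is a made-up placeholder.

```shell
# Succeed only if the name uses letters, numbers, hyphens, and underscores.
valid_sample_name() {
    case "$1" in
        ""|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Join all arguments with commas, e.g. for --fastqs or --sample.
join_with_commas() {
    local IFS=,
    echo "$*"
}

# Usage (second path is a hypothetical extra flowcell):
#   valid_sample_name mysample && echo "sample name ok"
#   echo "--fastqs=$(join_with_commas /home/jdoe/runs/HAWT7ADXX/outs/fastq_path /home/jdoe/runs/FLOWCELL2/outs/fastq_path)"
```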