Space Ranger 1.3, printed on 12/24/2024
In this tutorial you will:
To run this tutorial successfully, you must:
This tutorial is written with spaceranger v1.3.1. You can copy and paste all the commands listed in the tutorial into your command prompt to follow along.
You can download and install spaceranger in any location. For this tutorial, we will create a working directory spaceranger_tutorial
and continue all the remaining steps in it.
# Create working directory
mkdir spaceranger_tutorial
# Change directory
cd spaceranger_tutorial
To install the latest version of spaceranger, download the tarball with either curl or wget. You should see a download progress status similar to the output below.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 10 1220M   10  125M    0     0  30.9M      0  0:00:39  0:00:04  0:00:35 30.8M
This downloads the spaceranger tarball spaceranger-1.3.1.tar.gz
to your working directory. Next, we extract the contents.
# Extract spaceranger tarball
tar -zxvf spaceranger-1.3.1.tar.gz
spaceranger-1.3.1/
spaceranger-1.3.1/.env.json
spaceranger-1.3.1/.version
spaceranger-1.3.1/LICENSE
spaceranger-1.3.1/builtwith.json
spaceranger-1.3.1/sourceme.bash
spaceranger-1.3.1/sourceme.csh
spaceranger-1.3.1/bin/
spaceranger-1.3.1/bin/_spaceranger_internal
spaceranger-1.3.1/bin/spaceranger
spaceranger-1.3.1/bin/rna/
spaceranger-1.3.1/bin/rna/_includes
...
When the extraction is finished, the command prompt returns and the folder spaceranger-1.3.1 is created in the working directory.
The spaceranger-1.3.1 folder contains the executable and all of the required dependencies. The key folders you are likely to use are described after the listing.
spaceranger-1.3.1
├── bin
├── external
│   ├── anaconda
│   ├── martian
│   │   └── jobmanagers
│   ├── spaceranger_tiny_inputs
│   └── spaceranger_tiny_ref
├── lib
│   ├── bin
│   │   ├── bamtofastq
│   │   └── ...
│   └── python
│       └── cellranger
│           └── barcodes
│               ├── visium-v1_coordinates.txt
│               ├── visium-v2_coordinates.txt
│               └── ...
├── mro
├── probe_sets
│   └── Visium_Human_Transcriptome_Probe_Set_v1.0_GRCh38-2020-A.csv
└── target_panels
    ├── gene_signature_v1.0_GRCh38-2020-A.target_panel.csv
    ├── immunology_v1.0_GRCh38-2020-A.target_panel.csv
    ├── neuroscience_v1.0_GRCh38-2020-A.target_panel.csv
    └── pan_cancer_v1.0_GRCh38-2020-A.target_panel.csv
The target_panels folder contains the predesigned gene panels used in Targeted GEX analysis.
The probe_sets folder contains the probe set reference CSV file used in analysis for FFPE samples.
The lib/python/cellranger/barcodes folder contains the Visium barcode whitelists and their coordinates on the slide.
The lib/bin folder contains tools such as bamtofastq, which converts 10x Genomics BAM files to FASTQ.
The external/spaceranger_tiny_ref and external/spaceranger_tiny_inputs folders are used by spaceranger testrun.
The external/martian/jobmanagers folder contains sample templates for commonly used job schedulers.
$PATH
spaceranger is now installed. There are two ways to specify spaceranger in the commands.
Use the full path to the spaceranger-1.3.1 folder
# Change directory to spaceranger-1.3.1
cd spaceranger-1.3.1
# Get the full path
pwd
# Change working directory back to spaceranger_tutorial
cd ..
# Get the full path
readlink -f spaceranger-1.3.1
/PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-1.3.1
The /PATH/TO/WORKING_DIRECTORY portion will change depending on the compute setup you are using.
Add spaceranger-1.3.1 to your $PATH variable
# Get the full path
readlink -f spaceranger-1.3.1
# Export PATH by providing the full path
export PATH=/PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-1.3.1:$PATH
# Confirm installation
which spaceranger

# Change directory to spaceranger-1.3.1
cd spaceranger-1.3.1
# Export PATH by specifying a shell variable
export PATH=$PWD:$PATH
# Confirm installation
which spaceranger
# Change working directory back to spaceranger_tutorial
cd ..
~/spaceranger_tutorial/spaceranger-1.3.1
The tilde symbolizes your home directory, which will be the same as /PATH/TO/WORKING_DIRECTORY as before.
Adding variables using export lasts for only the current login session. Add the export command using the full path to spaceranger-1.3.1 to the shell configuration file (e.g. .bashrc, .zshrc etc) which is triggered for every login.
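One way to make the change persistent is to append the export line to the shell configuration file only if it is not already there. The following is a minimal sketch, assuming a bash-compatible shell; persist_spaceranger_path is a hypothetical helper (not part of spaceranger), and the path in the example is the tutorial's placeholder.

```shell
# Sketch: append the export line to a shell configuration file, once.
# persist_spaceranger_path is a hypothetical helper, not part of spaceranger.
#   $1: full path to the spaceranger-1.3.1 folder
#   $2: shell configuration file (defaults to ~/.bashrc)
persist_spaceranger_path() {
    local install_dir="$1"
    local rc_file="${2:-$HOME/.bashrc}"
    # Single quotes keep $PATH literal so it is expanded at login, not now
    local line="export PATH=${install_dir}:"'$PATH'
    # Append only if the exact line is not already present
    grep -qxF "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}

# Example (replace the placeholder with the output of readlink -f):
# persist_spaceranger_path /PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-1.3.1
```

Open a new login session, or source the configuration file, for the change to take effect.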
You can now invoke spaceranger at the command prompt to see the usage statement.
# When using full path to the spaceranger folder
/PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-1.3.1/spaceranger
# When adding spaceranger folder to the $PATH variable
spaceranger
USAGE:
    spaceranger <SUBCOMMAND>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    count        Count gene expression and feature barcoding reads from a single capture area
    aggr         Aggregate data from multiple 'spaceranger count' runs...
    testrun      Execute the 'count' pipeline on a small test dataset
    upload       Upload analysis logs to 10x Genomics support
    sitecheck    Collect linux system configuration information
    help         Prints this message or the help of the given subcommand(s)
For the rest of the tutorial, we will invoke spaceranger assuming addition of the spaceranger-1.3.1
folder to the $PATH
variable.
spaceranger sitecheck enables you to check your system configuration to ensure it meets the minimum recommended requirements. Run the command and use >
to re-direct the output to a text file.
spaceranger sitecheck > sitecheck.txt
If running spaceranger in cluster mode, run the sitecheck on both the head node and the worker node.
Open the file with less and use / (e.g. /CPU Cores) to search for specific sections within the file. Press q to quit.
less sitecheck.txt
We will examine some key configuration metrics and compare against the recommended system requirements.
CPU Cores
grep -c processor /proc/cpuinfo
---------------------------------------------------------------------
96
=====================================================================
...
This system has 96 CPUs and is capable of running spaceranger which requires at least 8 CPUs, preferably 32.
Memory Total
grep MemTotal /proc/meminfo | cut -d ':' -f 2 | sed 's/^[ \t]*//'
---------------------------------------------------------------------
289287896 kB
=====================================================================
...
For direct comparison, let's convert kB to GB. $$\text{RAM in GB} = \frac{289287896}{10^{6}} \approx 289$$
which satisfies the requirement of having at least 64GB RAM, preferably 128.
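The same conversion can be scripted. This is a small sketch using the decimal conversion from the text (1 GB = 10^6 kB) and the 64 GB minimum stated above; the function names kb_to_gb and meets_ram_minimum are hypothetical.

```shell
# Sketch: convert a MemTotal value in kB to GB (decimal, as in the text)
# and compare it with the 64 GB minimum. Function names are hypothetical.
kb_to_gb() {
    echo $(( $1 / 1000000 ))
}

meets_ram_minimum() {
    # Succeeds when the converted value is at least 64 GB
    [ "$(kb_to_gb "$1")" -ge 64 ]
}

kb_to_gb 289287896    # integer division: prints 289
```

On a Linux system, the input could be read directly, for example with kb_to_gb "$(awk '/MemTotal/ {print $2}' /proc/meminfo)".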
User Limits
bash -c 'ulimit -a'
---------------------------------------------------------------------
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1520514
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 10240
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 131072
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
=====================================================================
The two metrics to consider are max user processes and open files.
a. For max user processes, the recommended limit is 64 per core. Assuming we use all 96 cores, $96 \times 64 = 6,144 < 131,072$, so the system limit is sufficient.
b. For open files, the system limit is below the recommendation: $10,240 < 16,000$. While the pipelines may run with a lower open file limit, caution is urged. The required value depends on the system, the sample type, and the number of samples being run. If the pipeline errors, it is advisable to increase the user limit with ulimit and try again.
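Both comparisons can be sketched as a small shell function. It takes the core count and the two limits as arguments and applies the thresholds quoted above (64 processes per core, 16,000 open files); check_user_limits is a hypothetical helper name, and the output wording is illustrative.

```shell
# Sketch: compare user limits with the recommendations quoted in the text.
# check_user_limits is a hypothetical helper, not part of spaceranger.
#   $1: number of cores   $2: max user processes   $3: open files
check_user_limits() {
    local cores="$1" max_procs="$2" open_files="$3"
    local needed_procs=$(( cores * 64 ))
    local status=0

    # "unlimited" always satisfies the recommendation
    if [ "$max_procs" = "unlimited" ] || [ "$max_procs" -ge "$needed_procs" ]; then
        echo "max user processes: OK ($max_procs >= $needed_procs)"
    else
        echo "max user processes: LOW ($max_procs < $needed_procs)"
        status=1
    fi

    if [ "$open_files" = "unlimited" ] || [ "$open_files" -ge 16000 ]; then
        echo "open files: OK ($open_files >= 16000)"
    else
        echo "open files: LOW ($open_files < 16000)"
        status=1
    fi
    return "$status"
}

# Example on a Linux system:
# check_user_limits "$(grep -c processor /proc/cpuinfo)" "$(ulimit -u)" "$(ulimit -n)"
```

With the values from the sitecheck output above (96 cores, 131072 processes, 10240 open files), the first check passes and the second reports a low limit.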
Global File Limit
cat /proc/sys/fs/file-{max,nr}
---------------------------------------------------------------------
2921445
68736 0 262144
=====================================================================
The value satisfies the minimum requirement of 10,000 per GB of RAM: $10,000 \times 289 = 2,890,000 < 2,921,445$, where 289 GB is the total memory of the system.
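As a rough sketch, the same check can be written as a shell test that takes the RAM in GB and the file-max value as arguments; meets_global_file_limit is a hypothetical helper name.

```shell
# Sketch: check the global file limit against 10,000 files per GB of RAM.
# meets_global_file_limit is a hypothetical helper, not part of spaceranger.
#   $1: total RAM in GB   $2: value of /proc/sys/fs/file-max
meets_global_file_limit() {
    local ram_gb="$1" file_max="$2"
    [ "$file_max" -ge $(( ram_gb * 10000 )) ]
}

# Values from the sitecheck output above
if meets_global_file_limit 289 2921445; then
    echo "global file limit is sufficient"
fi
```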
The software support team can review your sitecheck results. There are two ways to send them:
Upload the file from the command line:
spaceranger upload [email protected] sitecheck.txt
Or send sitecheck.txt as an attachment to [email protected].

We can verify the installation using spaceranger testrun. This pipeline can be run in two configurations depending on the internet connectivity of the compute platform.
# With internet access
spaceranger testrun --id=verify_install
# Without internet access
spaceranger testrun --no-internet --id=verify_install
Martian Runtime - v4.0.5

Running preflight checks (please wait)...
Checking sample info...
Checking FASTQ folder...
Checking reference...
Checking reference_path
Checking optional arguments...
...
Pipestance completed successfully!
Successful completion of the testrun by extension implies successful installation of spaceranger.
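If the testrun output was saved to a file, the success message can be checked programmatically. This is a minimal sketch; testrun_succeeded is a hypothetical helper, and testrun.log is an assumed file name for output you captured yourself.

```shell
# Sketch: check a saved testrun log for the success message shown above.
# testrun_succeeded is a hypothetical helper; testrun.log is an assumed name.
testrun_succeeded() {
    grep -qF "Pipestance completed successfully!" "$1"
}

# Example:
# spaceranger testrun --id=verify_install 2>&1 | tee testrun.log
# testrun_succeeded testrun.log && echo "installation verified"
```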
Q. How can I use multiple versions of spaceranger?
Sometimes it is useful to have access to older as well as newer versions of spaceranger. There are two suggested ways to achieve this:
Update $PATH
to point to the latest version
Since the spaceranger tarball comes annotated with the version number, you can download and install the latest version alongside existing ones and then update the $PATH variable to point to the version you wish to use.
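Switching versions this way can be sketched as a function that drops any existing spaceranger-<version> entries from $PATH and prepends the chosen folder. This assumes each version was extracted into a folder named spaceranger-<version>, as the tarball does by default; use_spaceranger and the paths in the example are hypothetical.

```shell
# Sketch: point $PATH at one spaceranger version for the current session.
# use_spaceranger is a hypothetical helper, not part of spaceranger.
#   $1: full path to the spaceranger-<version> folder to use
use_spaceranger() {
    local cleaned
    # Remove existing PATH entries that look like spaceranger-<version> folders
    cleaned=$(printf '%s' "$PATH" | tr ':' '\n' | grep -v '/spaceranger-[0-9]' | paste -sd ':' -)
    export PATH="$1${cleaned:+:$cleaned}"
}

# Example (hypothetical install locations):
# use_spaceranger /PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-1.3.1
# which spaceranger
```

As with export, the change lasts only for the current login session.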
Use virtual environments
You can install and set up conda, which functions as both a package manager and an environment manager. Running software in virtual environments provides benefits such as reproducibility, compatibility, and versioning, and lets you install software without admin permissions on shared compute environments such as High-Performance Computing clusters (HPCs).