Long Ranger 2.0, printed on 11/22/2024
Long Ranger algorithms are tuned and optimized for human haplotype phasing and structural variant calling, and 10x Genomics provides pre-built hg19 and b37-style reference packages for use with the pipeline. The pre-built references have the following characteristics:
Use of the pre-built references is strongly recommended unless you have specific requirements that match one of the compatible use cases below:
Long Ranger supports the use of customer-generated references for the following scenarios:
The following scenarios are not currently supported by Long Ranger:
There are 3 steps to construct a Long Ranger-compatible reference. You must have version 1.1.1 or later of both Long Ranger and Loupe.
To create a reference, run the longranger mkref command on your FASTA file. The contigs in your FASTA must meet the compatibility requirements above.
$ longranger mkref hsapiens-asm19.fasta
... indexing may take over an hour ...
$ ls refdata-hsapiens-asm19
fasta/ genes/ genome regions/ snps/
This utility copies your FASTA, indexes it in several formats, and outputs a folder named refdata-<fasta_name>.
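Before running longranger mkref, it can help to see which contigs the FASTA actually contains. The sketch below is a hypothetical helper, not part of Long Ranger: the function name list_contigs is an assumption, and it only prints each contig name with its sequence length so that the names can be compared by eye against the compatibility requirements.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Long Ranger command): print "<contig>\t<length>"
# for every record in a FASTA file.
list_contigs() {
    awk '
        /^>/ {
            # A new header line: flush the previous record, if any.
            if (name != "") print name "\t" len
            name = substr($1, 2)   # header text up to the first whitespace, minus ">"
            len = 0
            next
        }
        { len += length($0) }      # sequence line: add its length
        END { if (name != "") print name "\t" len }
    ' "$1"
}

# Example (assumes the FASTA from the text is in the current directory):
#   list_contigs hsapiens-asm19.fasta | head
```

The output is tab-separated, so it can be piped into sort or cut to look for unexpected names or very short contigs.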
At this point, the reference folder created by longranger mkref is usable by Long Ranger, but it is strongly recommended that you also include a region blacklist for structural variant calling.
For hg19 references, we provide pre-built blacklist files that you can simply copy into your reference. Follow the instructions below, depending on the naming convention of your reference:
hg19 Convention ("chr1")
$ cd refdata-hsapiens-asm19
$ cd regions
$ wget https://cf.10xgenomics.com/supp/genome/hg19/sv_blacklist.bed
$ wget https://cf.10xgenomics.com/supp/genome/hg19/segdups.bedpe
b37 Convention ("1")
$ cd refdata-hsapiens-asm19
$ cd regions
$ wget https://cf.10xgenomics.com/supp/genome/b37/sv_blacklist.bed
$ wget https://cf.10xgenomics.com/supp/genome/b37/segdups.bedpe
For all other references, follow these instructions to create custom blacklist files.
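For an hg19-based reference, the choice between the two sets of blacklist URLs can be scripted. The following is a minimal sketch under stated assumptions: detect_convention and blacklist_urls are hypothetical helper names, and the check simply looks at whether the first contig in fasta/genome.fa.fai starts with "chr". It is only meaningful for hg19 assemblies; a non-human reference whose contigs happen to be named "1", "2", and so on would be misreported as b37.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not Long Ranger commands).

# Print "hg19" if the first contig in the FASTA index is named chr*, else "b37".
# ASSUMPTION: the reference is an hg19 assembly in one of the two conventions.
detect_convention() {
    local fai="$1/fasta/genome.fa.fai"
    local first
    first=$(head -n 1 "$fai" | cut -f 1)   # column 1 of a .fai file is the contig name
    case "$first" in
        chr*) echo hg19 ;;
        *)    echo b37 ;;
    esac
}

# Print the two blacklist URLs given above for a convention ("hg19" or "b37").
blacklist_urls() {
    local base="https://cf.10xgenomics.com/supp/genome/$1"
    echo "$base/sv_blacklist.bed"
    echo "$base/segdups.bedpe"
}

# Example:
#   ref=refdata-hsapiens-asm19
#   blacklist_urls "$(detect_convention "$ref")" | (cd "$ref/regions" && xargs -n 1 wget)
```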
You may also limit variant calling to a subset of contigs in the reference by including a primary_contigs.txt file in the fasta directory. This is a contig whitelist, as opposed to a region blacklist. You can see an example in the fasta folder of the refdata-hg19-2.0.0 reference. It may be useful to exclude some contigs from the whitelist, such as decoys or alts.
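One way to draft a contig whitelist is to start from the FASTA index and drop the contigs you do not want called. The sketch below rests on two assumptions that should be checked against the primary_contigs.txt shipped in refdata-hg19-2.0.0: that the file holds one contig name per line, and that decoy or alt contigs can be recognized by name. The function name write_primary_contigs and the default exclusion pattern are hypothetical and will need adjusting for your assembly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Long Ranger command).
# ASSUMPTION: primary_contigs.txt lists one contig name per line.
# ASSUMPTION: the default pattern below is only an illustration of how
#             unwanted contigs might be named; adjust it for your assembly.
write_primary_contigs() {
    local ref="$1"
    local exclude="${2:-_alt\$|_decoy\$|_random\$|^chrUn}"
    # Column 1 of genome.fa.fai is the contig name; keep names that do not match.
    cut -f 1 "$ref/fasta/genome.fa.fai" \
        | grep -Ev "$exclude" > "$ref/fasta/primary_contigs.txt"
}

# Example:
#   write_primary_contigs refdata-hsapiens-asm19
#   cat refdata-hsapiens-asm19/fasta/primary_contigs.txt
```

Review the resulting file by hand before running the pipeline, since contigs left out of it are excluded from variant calling.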
To enable the display of the genes and exons tracks in the Loupe genome browser, download our gene annotations file into your reference. The annotation source can be found at ENSEMBL. This file will work regardless of the naming convention of your reference.
$ cd refdata-hsapiens-asm19
$ cd genes
$ wget https://cf.10xgenomics.com/supp/genome/gene_annotations.gtf.gz
This step is optional, but if you omit this file, you will not be able to search by gene name in Loupe, or see the genes and exons tracks in the Loupe Haplotype view. Loupe will accept any GTF subject to the following requirements:
If you have followed the steps above correctly, your reference folder should now contain the following files:
$ tree refdata-hsapiens-asm19
refdata-hsapiens-asm19/
├── fasta
│   ├── genome.fa
│   ├── genome.fa.amb
│   ├── genome.fa.ann
│   ├── genome.fa.bwt
│   ├── genome.fa.fai
│   ├── genome.fa.flat
│   ├── genome.fa.gdx
│   ├── genome.fa.pac
│   └── genome.fa.sa
├── genes
│   └── gene_annotations.gtf.gz
├── genome
├── regions
└── snps

4 directories, 13 files
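The same check can be done without reading the listing by eye. The sketch below is a hypothetical helper, not part of Long Ranger: check_reference tests for the index files named in the listing plus the two blacklist files and the gene annotation file downloaded in the earlier steps, and reports any that are absent. The blacklist and annotation files are optional, so a "missing" line for them is a reminder, not an error.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Long Ranger command): report files that the steps
# above should have produced but that are absent from a reference folder.
# Returns 0 if everything is present, 1 otherwise.
check_reference() {
    local ref="$1"
    local missing=0
    local f
    for f in \
        fasta/genome.fa fasta/genome.fa.amb fasta/genome.fa.ann \
        fasta/genome.fa.bwt fasta/genome.fa.fai fasta/genome.fa.flat \
        fasta/genome.fa.gdx fasta/genome.fa.pac fasta/genome.fa.sa \
        genes/gene_annotations.gtf.gz \
        regions/sv_blacklist.bed regions/segdups.bedpe
    do
        if [ ! -e "$ref/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   check_reference refdata-hsapiens-asm19 && echo "reference looks complete"
```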
To run Long Ranger with your new reference, set the --reference argument of longranger run to the path of the reference folder:
$ longranger run --reference=/path/to/refdata-hsapiens-asm19 ...