10x Genomics
Chromium Genome & Exome

Long Ranger 2.2, printed on 03/06/2025

Targeted Phasing and SV Calling

Analysis software for 10x Genomics linked read products is no longer supported. Raw data processing pipelines and visualization tools are available for download and can be used for analyzing legacy data from 10x Genomics kits in accordance with our end user licensing agreement without support.

Long Ranger's Targeted Mode analyzes sequencing data from a Chromium-prepared, targeted library. Generally this is an exome hybrid capture, but targeted mode is compatible with any pull-down panel. The workflow involves the following steps:

1. Run longranger mkfastq on the Illumina BCL output folder to generate FASTQ files.
2. Run longranger targeted on each targeted sample that was demultiplexed by longranger mkfastq.

For the following example, assume that the Illumina BCL output is in a folder named /sequencing/140101_D00123_0111_AHAWT7ADXX.

Target Files

The target BED file supplied to the pipeline via the --targets argument is used in computing metrics such as on-target coverage. The SV and CNV calling algorithms in Long Ranger also use the target BED file to define regions of interest. For Agilent SureSelect Human All Exon V6, we strongly recommend using the latest BED file released by Agilent. The BED file is available as SureSelect Human All Exon V6 r2 from Agilent.
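Before passing a BED file to --targets, it can help to confirm that it parses cleanly and to know how many bases it covers. The following is a minimal sketch, not part of Long Ranger: the helper names `read_bed` and `merged_target_bases` are hypothetical, and the inline intervals are made-up demonstration data. It assumes standard BED conventions (whitespace-separated columns, 0-based half-open intervals, optional `track`/`browser`/`#` header lines).

```python
import io

def read_bed(handle):
    """Yield (chrom, start, end) tuples, skipping header and blank lines."""
    for line in handle:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        start, end = int(start), int(end)
        if start < 0 or end <= start:
            raise ValueError(f"bad interval: {line!r}")
        yield chrom, start, end

def merged_target_bases(intervals):
    """Sum the bases covered by the intervals, counting overlaps once."""
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))
    total = 0
    for spans in by_chrom.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlapping or adjacent: extend the current merged span.
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total

# Made-up demonstration intervals, not taken from any real bait set.
example = io.StringIO(
    "track name=demo\n"
    "chr1\t100\t200\n"
    "chr1\t150\t300\n"
    "chr2\t0\t50\n"
)
total_bases = merged_target_bases(read_bed(example))
print(total_bases)
```

To check a real file, replace the `io.StringIO` object with `open("/path/to/targets.bed")`.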

Long Ranger includes a CNV caller that detects exon-scale deletions in targeted regions. To prevent false-positive calls, it is important to supply the --cnvfilter argument with a BED file that masks problematic regions where baits perform poorly. We have created a cnvfilter file tailored to the SureSelect Human All Exon V6 r2 BED file, which is available for download: Agilent Exome V6 r2 CNV Filter BED. The file covers the following cases:

On-target regions that are systematically poorly covered in nearly all individuals. These were curated by 10x Genomics on a set of 20 samples, and masking them prevents false-positive deletion calls in these areas. These regions cover ~1 Mb of Agilent V6. We recommend a similar analysis when working with a new bait set.
Segmental duplications larger than 100 kb at >98% identity that intersect on-target regions. These were drawn from the UCSC Segmental Dups track and cover ~1 Mb of Agilent V6.
Extra regions that are in the Agilent Phased Exome but not in Agilent V6. These regions cover about 20 Mb of the genome. If you are interested in using the Agilent Phased Exome, we recommend determining which targets in the Agilent Phased Exome perform well and removing them from the cnvfilter blacklist.
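For a new bait set, the "systematically poorly covered" analysis described above might look roughly like the sketch below. This is an illustration under stated assumptions, not the procedure 10x Genomics used: the function name `poorly_covered_targets`, the depth threshold `min_depth=10.0`, the sample fraction `min_fraction=0.9`, and the coverage values are all hypothetical. The input is assumed to be a precomputed mean depth per target per sample.

```python
def poorly_covered_targets(coverage, min_depth=10.0, min_fraction=0.9):
    """Return targets whose mean depth is below min_depth in at least
    min_fraction of samples.

    coverage maps (chrom, start, end) -> list of per-sample mean depths.
    Both thresholds are illustrative assumptions.
    """
    flagged = []
    for target, depths in coverage.items():
        if not depths:
            continue
        low = sum(1 for depth in depths if depth < min_depth)
        if low / len(depths) >= min_fraction:
            flagged.append(target)
    return sorted(flagged)

# Made-up per-target depths for four hypothetical samples.
coverage = {
    ("chr1", 100, 200): [2.0, 1.5, 0.0, 3.0],     # low in every sample
    ("chr1", 500, 600): [40.0, 35.0, 2.0, 50.0],  # low in one sample only
    ("chr2", 0, 50): [5.0, 8.0, 9.0, 30.0],       # low in 3 of 4 samples
}
flagged = poorly_covered_targets(coverage)

# Emit the flagged targets as BED lines suitable for a cnvfilter file.
bed_lines = [f"{chrom}\t{start}\t{end}" for chrom, start, end in flagged]
print("\n".join(bed_lines))
```

Only the first target is flagged here, because the others fall below the depth threshold in fewer than 90% of samples. Appropriate thresholds depend on sequencing depth and should be chosen per project.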

Run longranger mkfastq

First, follow the instructions on running longranger mkfastq to generate FASTQ files. For example, if the flowcell serial number was HAWT7ADXX, then longranger mkfastq will output FASTQ files in HAWT7ADXX/outs/fastq_path.

Run longranger targeted

To run Long Ranger in targeted mode, use the longranger targeted command with a .bed file as the --targets argument, plus the following common parameters. For a complete list of command-line options, run longranger targeted --help.

For help on which arguments to use to target a particular set of FASTQs, consult Running 10x Pipelines on FASTQ Files.

Argument	Description
`--id`	A unique run ID string: e.g. `sample345`
`--fastqs`	Path of the FASTQ folder generated by `longranger mkfastq`, e.g. `/home/jdoe/runs/HAWT7ADXX/outs/fastq_path`
`--vcmode`	(required, except when specifying `--precalled`) Must be one of: `freebayes`, `gatk:/path/to/GenomeAnalysisTK.jar`, or `disable`
`--sample`	(optional) Sample name as specified in the sample sheet supplied to `mkfastq`.
`--downsample`	(optional) Specify the maximum amount of sequencing data to be used by the pipeline, in gigabases. If more data is available than this request, reads will be randomly downsampled. If less data is available, this option will have no effect.
`--reference`	Path to a 10x compatible reference, e.g. `/opt/refdata-hg19-2.1.0`. See Installation for how to download and install the default reference.
`--targets`	BED file associated with the pulldown used for this Chromium library, e.g. `/home/jdoe/runs/agilent_exome.bed`. See `Target Files` above for details.
`--cnvfilter`	A BED file indicating poorly performing targets or problematic genomic regions that should not generate CNV calls. See `Target Files` above for details.
`--precalled`	(optional) Path to a "pre-called" VCF file. Variants in this file will be phased. When setting `--precalled` do not specify a `--vcmode`
`--sex`	(optional) Sex of the sample: `male` or `female`. Sex will be detected based on coverage if not supplied.
`--somatic`	(optional) Supply this flag for somatic samples. This will increase the sensitivity of the large-scale SV caller for somatic SVs by allowing the detection of sub-haplotype events. Note: this option currently does not affect small-scale variant calling. The small-scale variant caller is not currently optimized for somatic variants.
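The constraints in the table above (exactly one of `--vcmode` or `--precalled`, a restricted set of `--vcmode` and `--sex` values) can be checked before launching a long run. The sketch below is a hypothetical wrapper, not part of Long Ranger: `build_targeted_command` is an invented helper that only assembles an argument list from the options documented here. It treats `--cnvfilter` as always supplied, following the example in this document.

```python
import shlex

def build_targeted_command(run_id, fastqs, reference, targets, cnvfilter,
                           vcmode=None, precalled=None, sample=None,
                           downsample_gb=None, sex=None, somatic=False):
    """Assemble a `longranger targeted` argument list from documented options."""
    # --vcmode is required unless --precalled is given; never pass both.
    if (vcmode is None) == (precalled is None):
        raise ValueError("give exactly one of vcmode or precalled")
    if vcmode is not None:
        is_gatk = vcmode.startswith("gatk:") and len(vcmode) > len("gatk:")
        if vcmode not in ("freebayes", "disable") and not is_gatk:
            raise ValueError(f"unsupported vcmode: {vcmode!r}")
    if sex not in (None, "male", "female"):
        raise ValueError(f"unsupported sex: {sex!r}")

    argv = [
        "longranger", "targeted",
        f"--id={run_id}",
        f"--reference={reference}",
        f"--fastqs={fastqs}",
        f"--targets={targets}",
        f"--cnvfilter={cnvfilter}",
    ]
    if vcmode is not None:
        argv.append(f"--vcmode={vcmode}")
    if precalled is not None:
        argv.append(f"--precalled={precalled}")
    if sample is not None:
        argv.append(f"--sample={sample}")
    if downsample_gb is not None:
        argv.append(f"--downsample={downsample_gb}")
    if sex is not None:
        argv.append(f"--sex={sex}")
    if somatic:
        argv.append("--somatic")  # a bare flag, per the table above
    return argv

argv = build_targeted_command(
    run_id="sample345",
    fastqs="/home/jdoe/runs/HAWT7ADXX/outs/fastq_path",
    reference="/opt/refdata-hg19-2.1.0",
    targets="/home/jdoe/runs/agilent_exome.bed",
    cnvfilter="/home/jdoe/runs/agilent_v6r2_cnvfilter.bed",
    vcmode="freebayes",
)
print(shlex.join(argv))
```

The returned list can be passed to `subprocess.run(argv, check=True)` on a machine where Long Ranger is installed.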

After determining these input arguments, run longranger targeted:

$ cd /home/jdoe/runs
$ longranger targeted --id=sample345 \
                 --reference=/opt/refdata-hg19-2.1.0 \
                 --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
                 --vcmode=freebayes \
                 --targets=/home/jdoe/runs/agilent_exome.bed \
                 --cnvfilter=/home/jdoe/runs/agilent_v6r2_cnvfilter.bed

Following a set of preflight checks to validate input arguments, Long Ranger pipeline stages will begin to run:

longranger targeted 2.2.2
Copyright (c) 2016 10x Genomics, Inc.  All rights reserved.
-----------------------------------------------------------------------------
Martian Runtime - 2.3.2
 
Running preflight checks (please wait)...
2016-05-01 12:00:00 [runtime] (ready)           ID.sample345.PHASER_SVCALLER_CS.PHASER_SVCALLER._ALIGNER.SETUP_CHUNKS
2016-05-01 12:00:00 [runtime] (run:local)       ID.sample345.PHASER_SVCALLER_CS.PHASER_SVCALLER._SNPINDEL_PHASER.SORT_GROUND_TRUTH
2016-05-01 12:00:00 [runtime] (run:local)       ID.sample345.PHASER_SVCALLER_CS.PHASER_SVCALLER._SNPINDEL_PHASER.SORT_GROUND_TRUTH.fork0.chnk0.main
...

By default, longranger targeted will use all of the cores available on your system to execute pipeline stages. You can specify a different number of cores to use with the --localcores option; for example, --localcores=16 will limit Long Ranger to using up to sixteen cores at once. Similarly, --localmem will restrict the amount of memory (in GB) used by longranger targeted.

The pipeline will create a new folder named with the sample ID you specified (e.g. /home/jdoe/runs/sample345) for its output. If this folder already exists, Long Ranger will assume it is an existing pipestance and attempt to resume running it.
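Because an existing folder with the same ID is treated as a pipestance to resume, it can be worth checking for one before launching what is meant to be a fresh run. A minimal sketch follows; `pipestance_exists` is a hypothetical helper, and the temporary directory stands in for a real run directory such as `/home/jdoe/runs`.

```python
import tempfile
from pathlib import Path

def pipestance_exists(run_dir, run_id):
    """Return True if run_dir already contains a folder named run_id."""
    return (Path(run_dir) / run_id).is_dir()

# Demonstration in a throwaway directory rather than a real run folder.
with tempfile.TemporaryDirectory() as run_dir:
    fresh = pipestance_exists(run_dir, "sample345")
    (Path(run_dir) / "sample345").mkdir()
    existing = pipestance_exists(run_dir, "sample345")

print(fresh, existing)
```

If the check returns True and a resume is not intended, choose a different --id or move the old folder aside first.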

Output Files

A successful longranger targeted execution should conclude with a message similar to this:

2016-05-02 15:46:41 [runtime] (run:local)       ID.sample345.PHASER_SVCALLER_CS.PHASER_SVCALLER.LOUPE_PREPROCESS.fork0.join
2016-05-02 15:46:44 [runtime] (join_complete)   ID.sample345.PHASER_SVCALLER_CS.PHASER_SVCALLER.LOUPE_PREPROCESS
2016-05-02 15:46:55 [runtime] VDR killed 4738 files, 223GB.
 
Outputs:
- Run summary:               /home/jdoe/runs/sample345/outs/summary.csv
- BAM barcoded:              /home/jdoe/runs/sample345/outs/phased_possorted_bam.bam
- BAM index:                 /home/jdoe/runs/sample345/outs/phased_possorted_bam.bam.bai
- VCF phased:                /home/jdoe/runs/sample345/outs/phased_variants.vcf.gz
- VCF index:                 /home/jdoe/runs/sample345/outs/phased_variants.vcf.gz.tbi
- Large-scale SV calls:      /home/jdoe/runs/sample345/outs/large_sv_calls.bedpe
- Large-scale SV candidates: /home/jdoe/runs/sample345/outs/large_sv_candidates.bedpe
- Large-scale SVs:           /home/jdoe/runs/sample345/outs/large_svs.vcf.gz
- Large-scale SVs index:     /home/jdoe/runs/sample345/outs/large_svs.vcf.gz.tbi
- Mid-scale deletions:       /home/jdoe/runs/sample345/outs/dels.vcf.gz
- Mid-scale deletions index: /home/jdoe/runs/sample345/outs/dels.vcf.gz.tbi
- Loupe file:                /home/jdoe/runs/sample345/outs/loupe.loupe
 
Pipestance completed successfully!

The output of the pipeline will be contained in a folder named with the sample ID you specified (e.g. sample345). The subfolder named outs will contain the main pipeline output files:

File Name	Description
`summary.csv`	Run summary metrics in CSV format
`phased_possorted_bam.bam`	Aligned reads annotated with barcode information
`phased_possorted_bam.bam.bai`	Index for `phased_possorted_bam.bam`
`phased_variants.vcf.gz`	VCF annotated with barcode and phasing information
`phased_variants.vcf.gz.tbi`	Index for `phased_variants.vcf.gz`
`large_sv_calls.bedpe`	Confidently called large-scale structural variants (greater than the 97.5th percentile of the molecule size distribution or inter-chromosomal) in BEDPE format
`large_sv_candidates.bedpe`	Large-scale structural variant calls and low confidence candidates in BEDPE format
`large_svs.vcf.gz`	Large-scale structural variant calls and candidates in VCF format
`large_svs.vcf.gz.tbi`	Index for `large_svs.vcf.gz`
`dels.vcf.gz`	Exon deletion calls
`dels.vcf.gz.tbi`	Index for `dels.vcf.gz`
`loupe.loupe`	File that can be opened in the Loupe genome browser

Once longranger targeted has successfully completed, you can browse the resulting .loupe file in the Loupe genome browser, or refer to the Understanding Output section to explore the data by hand.
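As a starting point for exploring the outputs by hand, the sketch below reads `summary.csv` and a BEDPE file with the standard library only. It rests on assumptions not stated in this document: that `summary.csv` holds one header row and one row of values, and that the BEDPE file follows the conventional layout where the first six tab-separated columns are chrom1, start1, stop1, chrom2, start2, stop2, with `#`-prefixed header lines. The helper names, metric names, and records are made up for demonstration.

```python
import csv
import io

def read_summary(handle):
    """Return the first data row of a CSV file as a dict of metric -> value."""
    rows = list(csv.DictReader(handle))
    return rows[0] if rows else {}

def read_bedpe(handle):
    """Parse the six positional BEDPE columns plus an optional name column."""
    calls = []
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        calls.append({
            "chrom1": fields[0],
            "start1": int(fields[1]),
            "stop1": int(fields[2]),
            "chrom2": fields[3],
            "start2": int(fields[4]),
            "stop2": int(fields[5]),
            "name": fields[6] if len(fields) > 6 else "",
        })
    return calls

def is_interchromosomal(call):
    """True when the two breakpoints lie on different chromosomes."""
    return call["chrom1"] != call["chrom2"]

# Hypothetical metric names and values, for illustration only.
summary = read_summary(io.StringIO("example_metric_1,example_metric_2\n0.95,1200\n"))

# Made-up BEDPE records, for illustration only.
bedpe_text = (
    "#chrom1\tstart1\tstop1\tchrom2\tstart2\tstop2\tname\n"
    "chr1\t1000\t1100\tchr1\t90000\t90100\tcall_1\n"
    "chr2\t500\t600\tchr7\t700\t800\tcall_2\n"
)
calls = read_bedpe(io.StringIO(bedpe_text))
inter = [call["name"] for call in calls if is_interchromosomal(call)]
print(summary)
print(inter)
```

For real data, pass `open("sample345/outs/summary.csv", newline="")` and `open("sample345/outs/large_sv_calls.bedpe")` in place of the `io.StringIO` objects.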
