10x Genomics
Chromium De Novo Assembly

Supernova 1.0, printed on 03/12/2025

Assembly Process

Supernova generates highly-contiguous, phased, whole-genome de novo assemblies from a Chromium-prepared library.

Supernova should be run with at most 1.2 billion reads, and at 38-56x coverage of the genome. Please see Sample Requirements and System Requirements before creating your Chromium libraries for assembly.
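The relationship between read count and coverage can be sketched with simple arithmetic. The helper below is illustrative only: the function name `reads_for_coverage`, the 3.2-gigabase genome size, and the 150-base read length are assumptions, and coverage is taken to mean total sequenced bases divided by genome size.

```shell
# Estimate how many reads give a target coverage:
#   reads = genome_size * coverage / read_length
# Genome size and read length used below are illustrative assumptions.
reads_for_coverage() {
    genome_size=$1   # genome size in bases
    coverage=$2      # target coverage, e.g. 38 or 56
    read_length=$3   # bases per read
    # Integer arithmetic; the result is rounded down.
    echo $(( genome_size * coverage / read_length ))
}

reads_for_coverage 3200000000 38 150   # prints 810666666
reads_for_coverage 3200000000 56 150   # prints 1194666666
```

Under these assumptions, 56x coverage of a 3.2-gigabase genome corresponds to roughly 1.2 billion reads, which is consistent with the upper limit stated above.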

The assembly process involves the following steps:

Run supernova demux on the Illumina BCL output folder to generate FASTQ files.
Run supernova run separately for each sample that was demultiplexed by supernova demux to generate a whole-genome de novo assembly.
Run supernova mkfasta to generate various styles of FASTA output for your assemblies.

For the following example, assume that the Illumina BCL output is in a folder named /sequencing/140101_D00123_0111_AHAWT7ADXX.

Run supernova demux

First, follow the instructions on running supernova demux to generate FASTQ files. For example, if the flowcell serial number was HAWT7ADXX, then supernova demux will output FASTQ files in HAWT7ADXX/outs/fastq_path.

Run supernova run for de novo assembly

To run Supernova, use the supernova run command with the following arguments:

Argument	Description
`--id`	A unique run ID string: e.g. `sample345`
`--fastqs`	Path of the FASTQ folder generated by `supernova demux` e.g. `/home/jdoe/runs/HAWT7ADXX/outs/fastq_path`
`--description`	(optional) Description of the data set. This will be included, along with the run ID string, in various output files.
`--fastqprefix`	(optional) Sample name as specified in the sample sheet supplied to `bcl2fastq`. See Demultiplexing with bcl2fastq for more information.
`--indices`	(optional) Sample indices associated with this sample. Comma-separated list of index set plate wells (e.g. `SI-GA-A1,SI-GA-H12`) or index sequences (e.g. `TCGCCATA,GTATACAC`).
`--lanes`	(optional) Lanes associated with this sample
`--reads`	(optional) Specify number of reads to downsample to, if a surplus of data is available.

After determining these input arguments, call supernova run:

$ cd /home/jdoe/runs
$ supernova run --id=sample345 \
                --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
                --indices=SI-GA-A1

Following a set of preflight checks to validate input arguments, Supernova pipeline stages will begin to run:

supernova run 
Copyright (c) 2016 10x Genomics, Inc.  All rights reserved.
-----------------------------------------------------------------------------
Martian Runtime - 2.0.0
 
Running preflight checks (please wait)...
2016-01-01 00:00:01 [runtime] (ready)           ID.sample345.ASSEMBLER_CS._ASSEMBLER_PREP.SETUP_CHUNKS
2016-01-01 00:00:01 [runtime] (split_complete)  ID.sample345.ASSEMBLER_CS._ASSEMBLER_PREP.SETUP_CHUNKS
...

By default, supernova run will use all of the sequence data available in the FASTQ folder that matches the specified --indices and --lanes. To downsample the data, you can optionally specify the number of reads that Supernova should assemble using the --reads argument. Also by default, supernova run will use all of the cores available on your system to execute pipeline stages.
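As a sketch of how downsampling might be added to the earlier invocation, the helper below assembles a supernova run command line and prints it for review instead of executing it. The function name `supernova_cmd` and the read count of 800000000 are hypothetical; only the arguments listed in the table above are used.

```shell
# Print (rather than execute) a supernova run command line.
# Usage: supernova_cmd ID FASTQ_DIR INDICES [READS]
supernova_cmd() {
    cmd="supernova run --id=$1 --fastqs=$2 --indices=$3"
    # Append --reads only when a fourth argument is supplied.
    if [ -n "${4:-}" ]; then
        cmd="$cmd --reads=$4"
    fi
    echo "$cmd"
}

# Hypothetical example: downsample to 800 million reads.
supernova_cmd sample345 /home/jdoe/runs/HAWT7ADXX/outs/fastq_path SI-GA-A1 800000000
```

Omitting the fourth argument leaves out --reads, so all available data would be used.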

The pipeline will create a new folder named with the sample ID you specified (e.g. /home/jdoe/runs/sample345) for its output. If this folder already exists, supernova run will assume it is an existing pipestance and attempt to resume running it.
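Because an existing folder changes the behavior from starting a new run to resuming one, it can help to check for the folder before launching. The following is a minimal sketch; the function name `pipestance_state` is hypothetical and it only inspects the filesystem, it does not call Supernova.

```shell
# Report whether supernova run would start fresh or try to resume,
# based on whether a folder named after the run ID already exists.
# Usage: pipestance_state PARENT_DIR RUN_ID
pipestance_state() {
    run_dir=$1/$2
    if [ -d "$run_dir" ]; then
        echo "resume"   # folder exists: treated as an existing pipestance
    else
        echo "new"      # no folder yet: a new one would be created
    fi
}

pipestance_state /home/jdoe/runs sample345
```

If the result is "resume" and a fresh run is intended, choosing a different --id avoids reusing the existing folder.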

Output Files

A successful supernova run execution should conclude with a message that looks similar to this:

...
2016-01-03 00:00:01 [runtime] (chunks_complete) ID.sample345.ASSEMBLER_CS._ASSEMBLER_CP
2016-01-03 00:00:01 [runtime] (run:local)       ID.sample345.ASSEMBLER_CS._ASSEMBLER_CP.fork0.join
2016-01-03 00:00:03 [runtime] (join_complete)   ID.sample345.ASSEMBLER_CS._ASSEMBLER_CP
 
Outputs:
- assembly: /home/jdoe/runs/sample345/outs/assembly
- summary: /home/jdoe/runs/sample345/outs/summary.csv
- report: /home/jdoe/runs/sample345/outs/report.txt
 
Pipestance completed successfully!
Saving pipestance info to sample345/sample345.mri.tgz

The output of the pipeline will be contained in a folder named with the sample ID you specified (e.g. sample345). The subfolder named outs will contain the main pipeline output files:

File Name	Description
`summary.csv`	Run summary metrics in CSV format
`report.txt`	Extensive assembly metrics in human-readable form
`assembly`	The directory containing the assembly in binary format
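A quick way to confirm that a run produced the files listed above is to check for each one under the outs subfolder. This is a sketch only; the function name `check_outs` is hypothetical, and it tests for presence without validating file contents.

```shell
# Check that the documented outputs exist under <run_dir>/outs.
# Prints one line per item and returns non-zero if any is missing.
check_outs() {
    outs=$1/outs
    missing=0
    for name in summary.csv report.txt assembly; do
        if [ -e "$outs/$name" ]; then
            echo "found   $name"
        else
            echo "missing $name"
            missing=1
        fi
    done
    return $missing
}

check_outs /home/jdoe/runs/sample345 || echo "some outputs are missing"
```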

Run supernova mkfasta for FASTA output

First, familiarize yourself with the representation of a genome assembly as a graph structure. Next, follow the instructions on running supernova mkfasta to generate FASTA files.
