$ supernova run --id=sample345 \
                --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path

Following a set of preflight checks to validate input arguments, Supernova pipeline stages will begin to run:

supernova run
Copyright (c) 2016 10x Genomics, Inc. All rights reserved.
-----------------------------------------------------------------------------

Martian Runtime - v2.3.3

Running preflight checks (please wait)...
2016-01-01 00:00:08 [runtime] (ready)           ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_PREFLIGHT
2016-01-01 00:00:08 [perform] Serializing pipestance performance data.
2016-01-01 00:00:01 [runtime] (split_complete)  ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_PREFLIGHT
2016-01-01 00:00:01 [runtime] (run:local)       ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_PREFLIGHT.fork0.chnk0.main
2016-01-01 00:00:07 [runtime] (chunks_complete) ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_PREFLIGHT
2016-01-01 00:00:10 [runtime] (join_complete)   ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_PREFLIGHT
2016-01-01 00:00:11 [runtime] (ready)           ID.sample345.ASSEMBLER_CS._ASSEMBLER._ASSEMBLER_PREP._FASTQ_PREP_NEW.SETUP_CHUNKS
...

supernova run will use all of the sequence data available in the FASTQ folder, up to the limit imposed by the --maxreads option, described above. If you are processing data prepared with the older, deprecated supernova demux process, you can also specify --indices and --lanes to further select the data to be processed. For new datasets, this selection is performed in the samplesheet provided to supernova mkfastq.

supernova run assumes that all of the cores on your system are available for its use, but you can use the --localcores option to limit this. Similarly, supernova run assumes that all of the memory on your system is available for its use.
You can use --localmem to suggest limits; however, memory utilization in certain sections of the code will scale with the size of the genome, the number of input reads, and the quality of the data, and may exceed this limit.

The pipeline will create a new folder named with the sample ID you specified (e.g. /home/jdoe/runs/sample345) for its output. If this folder already exists, supernova run will assume it is an existing pipestance and attempt to resume running it.

## Watching Supernova Progress

The standard output from supernova run displays lines that indicate the progress through pipeline stages as shown in Map of the Pipeline. The standard output will pause during individual stages with a message such as:

...
2016-01-03 00:00:01 [runtime] (run:local) ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_DF.fork0.chnk0.main

and may appear to have stalled. However, you should see a heartbeat message every 6 minutes, such as:

2016-01-03 00:06:01 [runtime] (update) ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_DF.fork0 chunks_running

If you wish to monitor the progress of one of these stages, you can view the stage-specific standard output, e.g.:

$ cd /home/jdoe/runs
$ tail sample345/ASSEMBLER_CS/_ASSEMBLER/ASSEMBLER_DF/fork0/chnk0/_stdout


and likewise for other ASSEMBLER stages.
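
Because the stage-specific output path follows a regular pattern, a small shell helper can build it for any stage. The following is a minimal sketch, not part of Supernova: the function names `stage_stdout_path` and `last_heartbeat` are hypothetical, it assumes other stages use the same fork0/chnk0 layout shown for ASSEMBLER_DF, and it assumes you saved the standard output of supernova run to a log file.

```shell
#!/bin/sh
# Hypothetical helper: build the path to a stage's _stdout file.
# Assumption: every stage uses the same fork0/chnk0 layout as ASSEMBLER_DF.
stage_stdout_path() {
    pipestance_dir=$1   # e.g. /home/jdoe/runs/sample345
    stage=$2            # e.g. ASSEMBLER_DF
    printf '%s/ASSEMBLER_CS/_ASSEMBLER/%s/fork0/chnk0/_stdout\n' \
        "$pipestance_dir" "$stage"
}

# Hypothetical helper: print the most recent heartbeat line from a saved log.
# Heartbeat lines are the ones tagged "(update)" in the runtime output.
last_heartbeat() {
    log_file=$1
    grep -F '(update)' "$log_file" | tail -n 1
}
```

For example, `tail "$(stage_stdout_path /home/jdoe/runs/sample345 ASSEMBLER_DF)"` shows the end of that stage's output, and comparing the timestamp printed by `last_heartbeat` with the current time gives a rough indication of whether the run is still alive.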

## Extreme Coverage Testing

Roughly 20% of the way through the assembly process, Supernova estimates the raw coverage of the genome, and exits if the coverage is not between 30x and 85x. The reasons for this test are that:

• Very low or very high coverage is likely to yield suboptimal results, and may cause Supernova to run unusually long or crash.

• The actual recommended coverage is between 38x and 56x, although somewhat higher coverage is sometimes helpful. Thus the test should only catch cases that are significantly out of range.

• It is possible to accidentally provide Supernova with an inappropriate number of input reads, in which case the mistake may be caught here, saving time.
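
The thresholds above can be summarized in a short shell function. This is only an illustration of the stated ranges, not Supernova's own check: the name `classify_coverage` is hypothetical, the coverage is taken as a whole number, and treating the boundary values 30x and 85x as passing is an assumption.

```shell
#!/bin/sh
# Hypothetical illustration of the coverage ranges quoted in the text:
# the run exits outside 30x-85x, and 38x-56x is the recommended range.
# Assumption: the boundary values themselves are treated as inside the range.
classify_coverage() {
    cov=$1   # estimated raw coverage as an integer, e.g. 45
    if [ "$cov" -lt 30 ]; then
        echo "fail: too low"
    elif [ "$cov" -gt 85 ]; then
        echo "fail: too high"
    elif [ "$cov" -ge 38 ] && [ "$cov" -le 56 ]; then
        echo "pass: recommended"
    else
        echo "pass: outside recommended range"
    fi
}
```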

If the coverage test fails, you have two options:

1. Restart using the --maxreads option, providing a lower value than you originally specified, matching the appropriate level of coverage. This is the recommended action.

2. Override the test by restarting with the option --accept-extreme-coverage. This is not recommended. If you do this, the assembly will continue at the point where it left off.
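
For the first option, a rough starting value for --maxreads can be worked out from generic coverage arithmetic: coverage is approximately reads × read length / genome size. The sketch below makes several assumptions: the function name `estimate_maxreads` is hypothetical, you already have an estimate of the genome size and read length, and --maxreads counts individual reads. Supernova's own coverage estimate may differ from this simple formula.

```shell
#!/bin/sh
# Hypothetical helper: estimate a --maxreads value for a target raw coverage.
# Generic approximation (not Supernova's internal estimator):
#   reads = target_coverage * genome_size / read_length
estimate_maxreads() {
    target_cov=$1    # desired raw coverage, e.g. 56
    genome_size=$2   # estimated genome size in bases
    read_len=$3      # read length in bases
    # Integer arithmetic; the result is rounded down.
    echo $(( target_cov * genome_size / read_len ))
}
```

For instance, `estimate_maxreads 56 3200000000 150` prints 1194666666, which could then be passed as `--maxreads=1194666666` when restarting supernova run.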

## Output Files

A successful supernova run execution should conclude with a message that looks similar to this:

...
2016-01-03 00:00:01 [runtime] (run:local)       ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_PR.fork0.join
2016-01-03 00:00:01 [runtime] (chunks_complete) ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_PR
2016-01-03 00:00:03 [runtime] (join_complete)   ID.sample345.ASSEMBLER_CS._ASSEMBLER.ASSEMBLER_PR

Outputs:
- Run summary:        /home/jdoe/runs/sample345/outs/summary.csv
- Run report:         /home/jdoe/runs/sample345/outs/report.txt
- Raw assembly files: /home/jdoe/runs/sample345/outs/assembly

Pipestance completed successfully!
Saving pipestance info to sample345/sample345.mri.tgz

The output of the pipeline will be contained in a folder named with the sample ID you specified (e.g. sample345). The subfolder named outs will contain the main pipeline output files that are described in more detail in Output Overview.
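
If you script around supernova run, it can be useful to confirm that the outputs listed in the completion message are present before moving on. This is a minimal sketch under stated assumptions: the function name `check_outputs` is hypothetical, and it assumes summary.csv and report.txt are regular files and assembly is a directory, as the completion message suggests.

```shell
#!/bin/sh
# Hypothetical helper: verify that the main outputs named in the completion
# message exist under <pipestance>/outs. Returns non-zero if any is missing.
check_outputs() {
    outs=$1/outs     # e.g. /home/jdoe/runs/sample345/outs
    status=0
    for f in summary.csv report.txt; do
        if [ ! -f "$outs/$f" ]; then
            echo "missing file: $outs/$f" >&2
            status=1
        fi
    done
    # Assumption: the raw assembly files live in a directory named assembly.
    if [ ! -d "$outs/assembly" ]; then
        echo "missing directory: $outs/assembly" >&2
        status=1
    fi
    return $status
}
```

A call such as `check_outputs /home/jdoe/runs/sample345 && echo "outputs present"` prints the confirmation only when all three outputs are found.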

## Run supernova mkoutput for FASTA output

First, familiarize yourself with the representation of a genome assembly as a graph structure. Next, follow the instructions on running supernova mkoutput to generate FASTA files.