10x Genomics
Chromium De Novo Assembly

Troubleshooting Supernova

If the supernova mkfastq or supernova run pipeline fails, it automatically generates a "debug tarball" named sample_id.mri.tgz that contains the logs and metadata generated by the pipestance leading up to the failure. This file makes an assembly problem much easier to diagnose, and it should not contain confidential information about a particular sample. To send this tarball to 10x, use the supernova upload command:

$ supernova upload [email protected] sample_id.mri.tgz

If you are unable to use the mechanism above, you can email the debug tarball directly to the 10x software team to help resolve any issues with using Supernova.

Diagnosing Supernova issues

Understanding Pipeline Failures

If you wish to troubleshoot a pipeline failure yourself, first identify whether you are experiencing a preflight failure, an in-flight failure, or an alert.

Preflight Failures

Preflight failures are the most common and are the result of invalid input data or runtime parameters. Because they occur before the pipeline actually runs, there will be no pipeline output and the error is reported directly to your terminal.

supernova mkfastq will generate the following error if Illumina's bcl2fastq software is not installed:

[error] No bcl2fastq found on path. demux requires bcl2fastq v2.17 or greater for RTA version: 2.7.3
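This particular preflight failure can be anticipated by checking that the executable is on your PATH before launching the pipeline. The following is a minimal sketch; check_on_path is a hypothetical helper name and is not part of Supernova.

```shell
# Report whether a required executable can be found on PATH.
# check_on_path is a hypothetical helper, not a Supernova command.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
    else
        echo "missing: $1"
        return 1
    fi
}

# bcl2fastq is the tool named in the error message above.
if ! check_on_path bcl2fastq; then
    echo "Install bcl2fastq and add it to PATH before running supernova mkfastq."
fi
```

The check only confirms that the tool can be found; it does not verify the version requirement stated in the error message.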

In-Flight Failures

In-flight failures may be the result of factors external to the pipeline such as running out of system memory or disk space, or may be due to issues with the data that could not be detected prior to de novo assembly. Different stages may fail in different ways, so the specific error messages vary widely.
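Because running out of disk space is one of the external causes mentioned above, it can help to check free space in the output location before starting or resuming a run. The sketch below assumes a POSIX df; has_free_gb is a hypothetical helper, and the 1 GB threshold in the example is purely illustrative and is not a Supernova requirement.

```shell
# Succeed if the filesystem holding directory $1 has at least $2 GB available.
# has_free_gb is a hypothetical helper, not a Supernova command.
has_free_gb() {
    # df -P prints one line per filesystem; column 4 is available space in KB.
    avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Illustrative threshold only; substitute a value suited to your data set.
if has_free_gb . 1; then
    echo "at least 1 GB free in the current directory"
else
    echo "less than 1 GB free in the current directory"
fi
```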

Finding relevant error logs

A few important files are saved to your pipeline output directory, which by default is named after the flowcell serial number for supernova mkfastq (e.g., HAWT7ADXX) and after your --id value for supernova run.

  1. The pipeline execution log that is output to your terminal during pipeline execution is also saved to output_dir/_log.

  2. Stages that experience a hard failure generate an _errors file containing the precise error that caused the stage to halt. You can view these error logs, if they exist, using find output_dir -name _errors | xargs cat

  3. Each stage also logs its stdout and stderr streams to _stdout and _stderr files.
    These logs can be listed using find output_dir -name _stderr and may contain informative error messages for stages that call separate binaries, such as ASSEMBLER_DF and ASSEMBLER_CP.

A more detailed description of the pipeline output directory and its contents is given in the Pipestance Structure page.
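When several stages have failed, piping find into xargs cat runs the error messages together without showing which stage each came from. A small sketch that prints each _errors file under a header with its path follows; show_stage_errors is a hypothetical helper, and the directory layout in the demo is made up for illustration.

```shell
# Print every _errors file under a pipestance directory, each preceded by its path.
# show_stage_errors is a hypothetical helper, not a Supernova command.
show_stage_errors() {
    find "$1" -type f -name _errors | sort | while IFS= read -r f; do
        echo "==> $f"
        cat "$f"
    done
}

# Demo on a throwaway directory; the stage name STAGE_A is invented.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/STAGE_A/fork0"
echo "example error text" > "$demo_dir/STAGE_A/fork0/_errors"
show_stage_errors "$demo_dir"
```

On a real run, pass your pipeline output directory instead of demo_dir.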

Resuming a failed pipestance

Once you have determined the reason for failure and are ready to continue running the pipeline, you can typically issue the same supernova run or supernova mkfastq command to continue execution of the pipestance from the stage that originally failed.

When supernova run or supernova mkfastq starts, it checks whether its intended output directory already exists. If it does, the existing directory is treated as an incomplete pipestance and execution resumes from it. This feature allows pipelines to be stopped and resumed with great flexibility, but it can also result in errors such as:

RuntimeError: /home/jdoe/runs/sample345 is not a pipestance directory

which indicates that you specified a --id that corresponds to an existing directory that was not created by supernova run.

The following error:

RuntimeError: pipestance 'HAWT7ADXX' already exists and is locked by another Martian instance. If you are sure no other Martian instance is running, delete the _lock file in /home/jdoe/runs/HAWT7ADXX and start Martian again.

indicates that you may already have a copy of supernova run or supernova mkfastq running that is using the same output directory. If you are sure that no pipestance is running in the given output directory, you can either move that output directory out of the way (mv HAWT7ADXX HAWT7ADXX.old) to restart the pipestance from the beginning, or remove the pipestance's lock file (rm HAWT7ADXX/_lock) and re-run the same supernova run or supernova mkfastq command to resume pipeline execution.
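The lock-removal step can be wrapped so that it reports what it did rather than failing when no lock is present. This is only a sketch: clear_stale_lock is a hypothetical helper, it does not check for running processes, and it should be used only after you have confirmed that no other instance is using the directory.

```shell
# Remove a pipestance's _lock file if one is present.
# clear_stale_lock is a hypothetical helper, not a Supernova command.
# It does NOT verify that no other instance is running; confirm that first.
clear_stale_lock() {
    if [ -e "$1/_lock" ]; then
        rm "$1/_lock"
        echo "removed $1/_lock"
    else
        echo "no _lock file in $1"
    fi
}

# Demo on a throwaway directory that imitates a locked pipestance.
lock_demo=$(mktemp -d)
: > "$lock_demo/_lock"
clear_stale_lock "$lock_demo"
```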

If you encounter the following error when attempting to resume a pipestance:

RuntimeError: pipestance 'sample345' already exists with different invocation file /home/jdoe/runs/sample345/_invocation

you are attempting to resume a pipestance using command-line arguments that differ from those used when it was first run. You can view the parameters passed to the existing pipeline by examining the _log file located in the output directory (e.g., head -n20 /home/jdoe/runs/sample345/_log).

Alerts

During de novo assembly, supernova run collects metrics that characterize the quality of the input data. These metrics capture the quality of various stages of the data preparation workflow, including library preparation and sequencing. When the data are less than ideal, we raise alerts that are displayed in the pipeline output after pipeline execution completes.

For example, if the user runs Supernova with paired-end reads of length 140 bases, the following alert is displayed:

Alerts:
We observe many reads shorter than 150 bases.The ideal read length for Supernova is 150 bases. Reads shorter than the ideal length are likely to yield a lower quality assembly.

In rare cases, when we detect serious issues with the input data that render the output of supernova run completely unreliable, we terminate execution. For example, if we find that a large majority of the reads do not have valid 10x barcodes, we exit with the following message:

[error] The fraction of input reads having valid barcodes is 20.3 percent, whereas the ideal is at least 80 percent. This condition could have multiple causes including wrong library type, failed library construction and low sequence quality on the barcode bases. This could have a severe effect on assembly performance, and Supernova has not been tested on data with these properties, so execution will be terminated.