
# Release Notes

## Supernova 2.1.1

### Bug fixes

• Fix an issue where the pipeline would crash (usually in the ASSEMBLER_IO stage) on CPUs without AVX support.
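To check in advance whether a Linux host advertises AVX, one option is to inspect the `flags` line of `/proc/cpuinfo`. The following is a minimal sketch, not part of Supernova; the helper names `cpu_flags` and `has_avx` are hypothetical, and it assumes an x86 Linux system where `/proc/cpuinfo` is readable.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" line is enough; all cores normally report the same set.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_text):
    """True if the 'avx' flag is listed (an exact match, so 'avx2' alone does not count)."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not Linux, or /proc is unavailable
    print("AVX reported:", has_avx(text))
```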

## Supernova 2.1.0

The core assembly algorithms remain unchanged. However, results may vary slightly from Supernova 2.0.1, and some metrics are now measured differently (see below).

### Enhancements

• Supernova now estimates the genome size approximately 20% of the way through the assembly process and exits if the inferred raw coverage is very far from the recommended range of 38x to 56x. This is done to avoid long assembly runs at unintentionally low or high coverage. An option is provided to resume assembly at this point, although this action is generally not recommended. To avoid accidental use of an arbitrary default value for --maxreads, it is now a required argument. The maximum allowed value has been changed to 2.14 billion.
• Add metrics that estimate the GC and dinucleotide content of genomes, which can be useful for interpreting results.
• The metrics assembly_size, contig_N50, phase_block_N50, scaffold_N50, scaffold_1kb_plus and scaffold_10kb_plus are now computed in such a fashion that their values may be reproduced exactly from Supernova FASTA output. As a result, the reported metrics may vary slightly from metrics generated by the previous version of Supernova.
• The "ploidy histogram" has been removed from the summary.csv file, but is still available in the summary.json file.
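As an illustration of recomputing a length metric from FASTA output, the sketch below applies the standard N50 definition (the largest length L such that records of length at least L hold at least half of the total bases). It is an assumption-laden example rather than Supernova's own code: the function names and the `min_length` parameter are hypothetical, and these notes do not state which records or length filters Supernova applies to each metric.

```python
def read_fasta_lengths(lines):
    """Yield the sequence length of each FASTA record in an iterable of text lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths, min_length=0):
    """Standard N50 over the given lengths, ignoring records shorter than min_length."""
    kept = sorted((n for n in lengths if n >= min_length), reverse=True)
    total = sum(kept)
    running = 0
    for n in kept:
        running += n
        if 2 * running >= total:
            return n
    return 0  # no records passed the filter
```

For example, `n50(read_fasta_lengths(open("scaffolds.fasta")))` would report the value for a hypothetical file named `scaffolds.fasta`. A contig-level figure would additionally require splitting scaffolds at gap characters, which this sketch does not attempt.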

### Bug fixes

• Fix an issue where both alleles at some loci were present in both pseudohap output files.
• Fix an integer overflow error that sometimes occurred in stages ASSEMBLER_CL and ASSEMBLER_PR after printing "building elocs".
• Fix a crash that sometimes occurred in stage ASSEMBLER_PR after printing "building new assembly".
• Fix a deadlock that occurred when memory allocation failed in a critical block.

### Resource Utilization

• Improve performance at several points in the code where highly repetitive genomes could cause slow execution.
• supernova mkoutput is now strictly single-threaded. Previously, a very small portion of the process was multi-threaded, leading to issues on multi-user systems where a given user may be allocated a restricted number of cores.

### Public datasets

• The publicly available assemblies have been replaced by Supernova 2.1.0 assemblies.

### Licensing

The source code for Supernova now has the MIT license.

## Supernova 2.0.1

### Enhancements

• Add new genome metric: ploidy_histogram
• Truncate large metadata files when generating a tarball for upload to 10x, rather than omitting them.

### Bug fixes

• Fix an issue where supernova mkoutput would emit both reverse-complement and forward versions of the same pseudohap scaffold. It is safe to use 2.0.1 to generate new FASTA files from 2.0.0 assemblies. The new files show “ver=1.10” in the header.
• Fix an issue where unzipped FASTQ files were no longer accepted as input.

### Failures, Crashes, and Forensics

• Fix a number of failures in ASSEMBLER_M2 (viz. error messages regarding "TrimAdapter") related to unexpected read lengths. We still recommend against trimming or otherwise pre-processing Linked-Read data prior to running Supernova.
• Fix a number of crashes in the ASSEMBLER_DM and ASSEMBLER_ML stages (viz. “remove duplicate edges”, “computing division points” or “translating pairs to matches”) related to runs with very high coverage depth. We still recommend running Supernova with coverage between 38x and 56x for your genome.
• Fix a bug that caused the ASSEMBLER_ACP stage to crash.
• Fix a potential pipeline failure in ASSEMBLER_DF (viz. “Map/Reduce operation has failed at pass 0”) that could occur when running with a number of cores different from those we tested. Note that many parts of the Supernova pipeline are not capable of using more than 32 cores.
• Fix a condition where ASSEMBLER_PR could exit prematurely (viz. “unneeded vertices”).
• Fix a potential infinite loop in ASSEMBLER_CL (viz. “identifying redundant edges”).
• Certain failures to memory map files now print extensive diagnostic information.
• Some users who ran the Supernova executables from a Lustre filesystem experienced exec() failures (viz. “Re-exec to adjust stack size failed”). The software is now more robust and, in case of failure, provides remedial guidance.
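The coverage recommendation above (38x to 56x) can be turned into a rough read budget before starting a run. The sketch below assumes raw coverage is simply total sequenced bases divided by genome size; the function names are hypothetical, `num_reads` is assumed to count individual reads, and the read length and genome size must be supplied by the user. It does not reproduce how Supernova itself estimates coverage.

```python
import math

RECOMMENDED_MIN = 38.0  # lower end of the recommended raw coverage range
RECOMMENDED_MAX = 56.0  # upper end of the recommended raw coverage range


def raw_coverage(num_reads, read_length, genome_size):
    """Approximate raw coverage as total bases sequenced over genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def classify_coverage(coverage):
    """Label a coverage value relative to the recommended range."""
    if coverage < RECOMMENDED_MIN:
        return "low"
    if coverage > RECOMMENDED_MAX:
        return "high"
    return "ok"


def reads_for_coverage(target, read_length, genome_size):
    """Smallest read count that reaches the target coverage."""
    return math.ceil(target * genome_size / read_length)
```

For instance, under the assumption of 150-base reads and a 3.2 Gb genome, `raw_coverage(1_000_000_000, 150, 3_200_000_000)` falls inside the recommended range, and `reads_for_coverage` gives a starting point for choosing a read limit.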

### Resource Utilization

• Fix an issue that caused unnecessary virtual memory use in the ASSEMBLER_DF stage. Note that Supernova uses memory-mapped files and therefore needs virtual address space (VMEM) that is generally larger than the maximum resident set size (RSS) of the process.
• Fix an issue that caused ASSEMBLER_PR to run very slowly on certain genomes (viz. “indexing closure paths”).
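To see the difference between virtual address space and resident memory for a running process on Linux, the `VmSize` and `VmRSS` fields of `/proc/<pid>/status` can be compared. This is a generic sketch rather than a Supernova tool; the helper name `parse_vm_fields` is hypothetical, and it assumes a Linux `/proc` filesystem that reports both fields in kB.

```python
def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status-style text."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in ("VmSize", "VmRSS"):
            # Values look like "  123456 kB"; keep only the number.
            fields[key] = int(value.split()[0])
    return fields


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as handle:
            fields = parse_vm_fields(handle.read())
    except OSError:
        fields = {}  # not Linux, or /proc is unavailable
    print("virtual (kB):", fields.get("VmSize"), "resident (kB):", fields.get("VmRSS"))
```

Limits applied by a scheduler or `ulimit` to virtual memory are checked against the first number, which for a process using memory-mapped files can be much larger than the second.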

## Supernova 2.0.0

### Data generation

• Barcode subsampling is now deprecated. This simplifies the workflow and reduces the amount of sequencing that is required.
• We provide a new 'optimized salting out' protocol that can be used to easily prepare DNA from a wide range of sample types and which we demonstrate on single insects.

### Bug fixes

• Remove 28 core limit.

### Resource utilization

• Memory usage has increased on average by about 10%. Nevertheless, of 20 test assemblies, 18 ran on a server having 256 GB RAM, and the remaining 2 ran on a server having 512 GB RAM. The 8 human assemblies in the set (all at about 56x coverage) ran on a server having 256 GB RAM. However, it is possible that, for stochastic reasons, some human datasets may require somewhat more memory.
• The mean run time for Supernova has increased, but the variance is lower. Several cases of extremely long run time no longer occur.

### Usability

• Molecule length is now more accurately computed. A plot is now provided showing the inferred distribution of molecule lengths in comparison to control samples. This replaces the previous estimation, histogram_molecules.json.
• The kmer histogram and a PDF plot are reintroduced.
• Several new metrics about the genome and data (including genome size) are now computed.
• Total wallclock time for assembly is now shown in the text summary file.
• An alert is now issued if the estimated genome size seems too low or too high.
• An alert is now issued if coverage seems too low or too high.
• Alerts are now shown in the text summary file.
• The representation of cycles in FASTA output has been improved.