# Release Notes for Cell Ranger 2.0 Gene Expression

## Cell Ranger 2.0.2 Gene Expression

### Bug fixes

• Properly ignore SIGHUP when a pipeline is run using nohup.

## Cell Ranger 2.0.1 Gene Expression

### Pipeline Argument Changes

• Add --override option to all pipelines, allowing for stage-level overrides for cores and memory.
• Reanalyze no longer requires --agg to persist library ID; it is only required for persisting user-defined fields.

### Bug fixes

• Fix CHUNK_READS using more cores and using them less efficiently than intended.
• Fix aggr using incorrect downsampling rates when more than 10 libraries are aggregated.
• Fix mkfastq proceeding even after bcl2fastq is killed.
• Improve robustness to rare cases where NFS latency causes a file to be deleted twice or a directory to be created twice.
• Fix ALIGN_READS proceeding after the STAR subprocess fails, causing crashes in ATTACH_BCS_AND_UMIS.
• Improve error messages when STAR or samtools fail in ALIGN_READS.
• Fix spaces in transcript IDs causing ATTACH_BCS_AND_UMIS to crash. mkref no longer allows spaces in transcript IDs.
• Fix out-of-memory condition in ATTACH_BCS_AND_UMIS for some libraries with >800M reads.
• Fix question marks replacing axis titles of barcode rank plot in web summary.
• Fix excessive memory consumption and runtime of mkfastq on large sample sheets.

### Job Scheduling

• Fix several cases where mrp (which is invoked by cellranger) could not restart correctly after being killed.
• On SGE clusters, cellranger/mrp now periodically runs qstat to verify that the jobs it queued have not been killed or canceled.
• If the run fails, the contents of the relevant _errors file are now printed, instead of just a message pointing the user to that file.
• On automatic retry of failed stages, the reason for the original failure is logged.
• mrp is now more resilient against certain kinds of filesystem errors.
• For certain types of filesystem problems (such as permissions errors or exceeded disk quotas), mrp/cellranger can now, in some cases, provide more useful and immediate error messages.
• Additional information about the environment cellranger runs in is now logged and included in mri.tgz.
• Additional information about the environment the analysis runs in is now logged and included in mri.tgz.
• mrp now correctly handles the signals sent by SGE and LSF when a soft time limit is reached (e.g., for SGE, -l s_rt=23:00:00).
• Add support for --overrides, which adjusts the CPU and memory allocated to individual stages.

## Cell Ranger 2.0.0 Gene Expression

### Pipeline Argument Changes

• Add --barcodes and --genes options to reanalyze, which allow selection of a specific subset of barcodes and/or genes to use in the secondary analysis.
• Add --force-cells option to count and reanalyze to explicitly set the cell count. If specified, Cell Ranger will take the top N barcodes (by UMI count) as cells instead of doing dynamic cell count estimation.
• Rename the estimated cells option from --cells to --expect-cells for clarity.
• Add --nosecondary flag to count, which skips the secondary analysis.
• Disallow slashes in the --genome argument in mkref.
• Add --id option to mkfastq which allows you to name the output directory.

### New Subcommands

• Add cellranger mat2csv command, which converts a Cell Ranger sparse gene-barcode matrix to a dense CSV format. Note that the resulting file will be very large, even for a few hundred cells.
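To see why the dense output grows so quickly, here is a rough sketch of a sparse-to-dense CSV conversion. It is not the mat2csv code: the function `sparse_to_dense_csv`, the dictionary-based sparse representation, and the layout (genes as rows, barcodes as columns) are assumptions for illustration only.

```python
import csv
import io


def sparse_to_dense_csv(genes, barcodes, entries, out):
    """Write a dense gene-by-barcode CSV from sparse entries.

    entries maps (gene_index, barcode_index) -> count; absent pairs are zero.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["gene"] + list(barcodes))
    for gi, gene in enumerate(genes):
        row = [entries.get((gi, bi), 0) for bi in range(len(barcodes))]
        writer.writerow([gene] + row)


genes = ["GENE_A", "GENE_B", "GENE_C"]
barcodes = ["AAAC-1", "AACT-1"]
entries = {(0, 0): 4, (2, 1): 1}  # only two nonzero values are stored

buf = io.StringIO()
sparse_to_dense_csv(genes, barcodes, entries, buf)
dense_csv = buf.getvalue()
dense_cells = len(genes) * len(barcodes)  # every zero is written out explicitly
print(dense_csv)
```

The sparse form stores only the nonzero entries, while the dense form writes one value for every gene and barcode pair, so the file size scales with the product of the two counts.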

### Web Summary Changes

• Add "Reads Mapped Antisense to Gene" metric, which quantifies reads that are mapped to the non-coding strand of a gene. High values can indicate the use of an unsupported chemistry type, e.g. passing a Single Cell V(D)J library to cellranger count.
• Add "Fraction GEMs with >1 Cell (Lower / Upper Bound)" metrics, which define a confidence interval for the multiplet rate estimate in multi-genome samples.
• Add more details to various metric descriptions.

### Algorithm Improvements

• Add the requirement that reads overlap annotated exons by at least 50% in order to be considered exonic. As a result, "Reads Mapped Confidently to Exonic Regions" may differ slightly from previous versions.
• Reduce EXTRACT_READS per-read runtime by 50% by avoiding OrderedDict and caching metric calculations.
• Reduce SUBSAMPLE_READS runtime by reducing the number of fixed target values for subsampling (to just 25k and 50k reads per cell).
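The 50% exon-overlap requirement above can be illustrated with a small interval calculation. This is a sketch under stated assumptions, not the pipeline's code: intervals are half-open `[start, end)`, exons are non-overlapping, and the overlap is measured as the fraction of the read's aligned span summed across exons. The names `exonic_fraction` and `is_exonic` are hypothetical.

```python
def exonic_fraction(read_start, read_end, exons):
    """Fraction of the read span [read_start, read_end) covered by exons.

    exons is a list of non-overlapping half-open (start, end) intervals.
    """
    read_len = read_end - read_start
    if read_len <= 0:
        raise ValueError("read must have positive length")
    covered = 0
    for exon_start, exon_end in exons:
        # Length of the intersection, or 0 if the intervals are disjoint.
        covered += max(0, min(read_end, exon_end) - max(read_start, exon_start))
    return covered / read_len


def is_exonic(read_start, read_end, exons, min_fraction=0.5):
    """Apply the at-least-50% rule (threshold taken from the note above)."""
    return exonic_fraction(read_start, read_end, exons) >= min_fraction


exons = [(100, 200), (300, 400)]
print(is_exonic(150, 250, exons))  # half of the read lies in the first exon
print(is_exonic(180, 280, exons))  # only 20 of 100 bases overlap an exon
```

A read that sits exactly at the threshold counts as exonic here because the comparison is inclusive, matching the "at least 50%" wording.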

### File Format Improvements

• Due to a format change (removal of the IntervalTree object), references produced with cellranger mkref using Cell Ranger 2.0 are not compatible with pipelines from Cell Ranger 1.x.
• Modify the TX, GX, and GN tags to have more granular transcript / gene annotations. Each BAM record is only annotated with transcripts / genes specific to that alignment, instead of combining annotations from all alignments of the corresponding read.
• Add RE tag, which indicates whether an alignment is exonic, intronic or intergenic.
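As an illustration of how the RE tag might be consumed downstream, the sketch below tallies alignment regions from SAM-formatted text. The optional-field layout `TAG:TYPE:VALUE` comes from the SAM format; the single-character codes (`E`, `N`, `I`) and the function name `tally_re_tags` are assumptions for this example and should be checked against the BAM documentation for your Cell Ranger version.

```python
from collections import Counter

# Assumed value codes for the RE tag; verify against the BAM documentation.
RE_LABELS = {"E": "exonic", "N": "intronic", "I": "intergenic"}


def tally_re_tags(sam_lines):
    """Count alignments per region type based on the RE optional field."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        for field in fields[11:]:  # optional fields follow the 11 mandatory ones
            if field.startswith("RE:"):
                _tag, _type, value = field.split(":", 2)
                counts[RE_LABELS.get(value, "unknown")] += 1
                break
    return counts


# Made-up records: 11 mandatory SAM fields followed by optional tags.
base = "\t0\tchr1\t100\t255\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF\t"
sam_lines = [
    "@HD\tVN:1.4",
    "r1" + base + "NH:i:1\tRE:A:E",
    "r2" + base + "RE:A:N",
    "r3" + base + "RE:A:E",
    "r4" + base + "RE:A:I",
]
region_counts = tally_re_tags(sam_lines)
print(dict(region_counts))
```

Records without an RE tag are simply skipped, so the totals cover only annotated alignments.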

### Bug fixes

• Fix a rare bug in interval arithmetic that caused exonic reads to be falsely annotated as intronic or intergenic. As a result of this fix, "Reads Mapped Confidently to Exonic Regions" may differ slightly from previous versions.
• Fix excessive EXTRACT_READS runtime (10+ hours) on very large FASTQs such as those produced by mkfastq.
• Fix a crash in RUN_GRAPH_CLUSTERING on filesystems that do not support named pipes.
• Fix SUBSAMPLE_READS using more VMEM than expected, causing it to be killed by SGE when exceeding the h_vmem limit on certain clusters.
• Fix mkfastq not merging output files properly due to sample numbering issues.
• Fix mkfastq crash due to -d (demultiplexing-threads) argument being deprecated in bcl2fastq 2.19.
• Fix the components.csv file produced by PCA, which did not contain the correct matrix.
• Fix a crash in RUN_PCA when the number of nonzero genes is smaller than the number of principal components.
• Fix a crash in mkref with very large genomes; use the limitGenomeGenerateRAM option in STAR to overcome its default reference size limit.
• Fix certain special characters (like dashes) in reference names breaking the subsampled genes detected plot.
• Fix mkloupe displaying an unhelpful error message when run on mixed-species runs and those from Cell Ranger 1.1 or earlier.
• Fix the open-file-handle-limit check using the submit host rather than the execution machine.
• Fix cellranger aggr allowing duplicate library_ids.
• Fix CLOUPE_PREPROCESS taking the full matrix even after reanalyze subselects barcodes.
• Fix a crash in mkfastq on RunInfo.xml files produced by the NovaSeq.
• Fix a crash in mkfastq when bcl2fastq 2.19 is used in cluster mode or with the --demultiplexing-threads argument.
• Fix mkfastq sometimes not properly merging samples in bcl2fastq 2.18 and 2.19 due to a change in the order in which lanes are processed by bcl2fastq.

### Martian Runtime Changes

• Add caching for deserialized json metadata. This improves performance for stages with many chunks.
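The general idea of caching deserialized JSON can be sketched with a memoized loader. This is written in Python for illustration and is not the Martian runtime's code; the function name `load_metadata` and the file name `metadata.json` are hypothetical.

```python
import functools
import json
import os
import tempfile


@functools.lru_cache(maxsize=None)
def load_metadata(path):
    """Parse a JSON file once per path; later calls reuse the parsed object."""
    with open(path) as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metadata.json")  # hypothetical file name
    with open(path, "w") as handle:
        json.dump({"chunk": 1}, handle)
    first = load_metadata(path)   # reads and parses the file
    second = load_metadata(path)  # served from the cache, no file access

hits = load_metadata.cache_info().hits
print(first is second, hits)
```

The trade-off of this approach is staleness: a cached entry does not notice later changes to the file, and callers share one mutable object, so it suits metadata that is written once and then only read.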

### Miscellaneous

• Update samtools from 0.1.19 to 1.4.
• Rename RUN_PREPROCESS to PREPROCESS_MATRIX in the SC_RNA_ANALYZER pipeline.
• Add alerts.json as an output of the SUMMARIZE_REPORTS stage. This file is a machine-readable list of any abnormal metric values that raised alarms in the web summary.
• For multi-genome samples, display the full reference name rather than a comma-delimited list of genomes in the web summary ("hg19, mm10" becomes "hg19_and_mm10").