Targeted Gene Expression

Descriptions of Target Gene Panel Downloads

In the panel selection overview we provide links to pages containing the following files for all predesigned target gene panels:

File | Description
Gene Metadata File | A TSV file that lists all the genes included in the final target gene panel.
Target Panel CSV File | This CSV file is a required input for Cell Ranger to enable analysis of targeted GEX data. It specifies details of the target genes and bait sequences in the target gene panel.
Bait BED File | A BED12 file containing the sequences and genomic coordinates for all baits in the target gene panel. Use this file to visualize the bait locations on genome browsers like IGV (Integrative Genomics Viewer) and the UCSC Genome Browser.

Customized designs created through the 10x Genomics Custom Panel Designer will include each of the files above, as well as the following additional files:

File | Description
Bait Design File | Use this Excel (.xlsx) file to order your custom or add-on panel from a compatible oligo provider. It contains the bait sequences for add-on genes and custom sequences.
Custom Sequence GTF File | A GTF file corresponding to the custom input sequences entered for panel design. Use this file to create a custom reference compatible with Cell Ranger by appending its contents to the transcriptome GTF file.
Custom Sequence FASTA File | A FASTA file corresponding to the custom input sequences entered for panel design. Use this file to create a custom reference compatible with Cell Ranger by appending its contents to the reference FASTA file.

Note About Bait Identifiers

Files that describe individual baits include a bait identifier (ID) column. Each bait ID is unique to one bait and can be used to match entries across the different files. Bait IDs take the following format:

gene_id|gene_name|bait_number

For example, the first bait listed for the gene TSPAN6, which has the Ensembl ID ENSG00000000003 in the GRCh38-2020-A reference, would have the bait ID:

ENSG00000000003|TSPAN6|1

The first bait listed for BRCA1 would be:

ENSG00000012048|BRCA1|1

For custom sequences, the gene_id and gene_name fields within the bait ID will be the same and match the name provided for each sequence.
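
The bait ID format above can be taken apart with a simple split on the `|` character. The following is a minimal sketch rather than an official parser: the names `BaitId` and `parse_bait_id` are hypothetical, and it assumes that the bait number is always an integer and that none of the three fields contains a `|` character.

```python
from typing import NamedTuple


class BaitId(NamedTuple):
    """The three fields of a bait ID (hypothetical helper type)."""
    gene_id: str
    gene_name: str
    bait_number: int


def parse_bait_id(bait_id: str) -> BaitId:
    """Split a bait ID of the form gene_id|gene_name|bait_number."""
    parts = bait_id.split("|")
    if len(parts) != 3:
        raise ValueError(f"expected 3 '|'-separated fields, got {len(parts)}: {bait_id!r}")
    gene_id, gene_name, bait_number = parts
    # Assumption: bait numbers are integers, as in the examples above.
    return BaitId(gene_id, gene_name, int(bait_number))


print(parse_bait_id("ENSG00000000003|TSPAN6|1"))
# prints BaitId(gene_id='ENSG00000000003', gene_name='TSPAN6', bait_number=1)
```

For a custom sequence, the `gene_id` and `gene_name` fields of the result would be identical.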

File Formats for Panel Downloads

Target Panel CSV File

This CSV file is a required input for Cell Ranger to enable analysis of targeted GEX data. It specifies the target genes and bait sequences that make up the target gene panel. See a description of the --target-panel argument to cellranger count in the cellranger count documentation.

The following is a portion of an example target panel file:

#panel_name=Human Gene Signature Panel
#panel_type=predesigned
#reference_genome=GRCh38
#reference_version=2020-A
#target_panel_file_format=1.0
gene_id,bait_seq,bait_id
ENSG00000000003,AGTTG[...]GCGTC,ENSG00000000003|TSPAN6|1
ENSG00000000003,CCCGT[...]GGCAA,ENSG00000000003|TSPAN6|2
ENSG00000000003,GGTGA[...]ACCTG,ENSG00000000003|TSPAN6|3
[ ... ]

The columns for this file are the following:

Column Name | Description
gene_id | The Ensembl gene identifier associated with this bait.
bait_seq | The nucleotide sequence of the bait.
bait_id | The bait ID associated with this bait, formatted as described above.

The file also contains a number of required metadata fields in the header:

Metadata Field | Description
panel_name | The name of the panel.
panel_type | Indicates whether this was a predesigned or custom panel. One of predesigned, custom_, or fully_custom.
reference_genome | The genome build of the Cell Ranger reference that the baits were designed against.
reference_version | The version of the Cell Ranger reference that the baits were designed against.
target_panel_file_format | The version of the target panel file format specification that this file conforms to.

These metadata fields take the format:

#key=value

In general, we strongly recommend the use of target panel CSV files we have provided for download or those generated via the 10x Genomics Custom Panel Designer. If you do need to make your own target panel, you must provide entries for all metadata fields. The gene_id column is required, whereas the bait_seq and bait_id columns are strongly recommended but optional. The bait_id values do not need to conform to the format outlined above, but they are required to be unique to each bait.
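
If you do assemble a target panel file by hand, a quick check against the rules above may catch mistakes before running Cell Ranger. The sketch below is only an illustration under stated assumptions: the function names `read_target_panel` and `check_target_panel` are hypothetical, the checks cover only the requirements listed in this section, and they are not a substitute for Cell Ranger's own validation.

```python
import csv
import io

# Metadata fields listed as required in this section.
REQUIRED_METADATA = (
    "panel_name",
    "panel_type",
    "reference_genome",
    "reference_version",
    "target_panel_file_format",
)

# The example file from above; the bait sequences are elided as in the original.
EXAMPLE_PANEL = """\
#panel_name=Human Gene Signature Panel
#panel_type=predesigned
#reference_genome=GRCh38
#reference_version=2020-A
#target_panel_file_format=1.0
gene_id,bait_seq,bait_id
ENSG00000000003,AGTTG[...]GCGTC,ENSG00000000003|TSPAN6|1
ENSG00000000003,CCCGT[...]GGCAA,ENSG00000000003|TSPAN6|2
ENSG00000000003,GGTGA[...]ACCTG,ENSG00000000003|TSPAN6|3
"""


def read_target_panel(handle):
    """Return (metadata dict, column names, list of row dicts)."""
    metadata = {}
    data_lines = []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header lines take the form #key=value.
            key, sep, value = line[1:].partition("=")
            if sep:
                metadata[key.strip()] = value.strip()
        else:
            data_lines.append(line)
    reader = csv.DictReader(data_lines)
    rows = list(reader)
    return metadata, reader.fieldnames or [], rows


def check_target_panel(metadata, columns, rows):
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing metadata field: {key}"
                for key in REQUIRED_METADATA if key not in metadata]
    if "gene_id" not in columns:
        problems.append("missing required column: gene_id")
    # bait_id is optional, but any values present must be unique.
    bait_ids = [row["bait_id"] for row in rows if row.get("bait_id")]
    if len(bait_ids) != len(set(bait_ids)):
        problems.append("bait_id values are not unique")
    return problems


metadata, columns, rows = read_target_panel(io.StringIO(EXAMPLE_PANEL))
print(check_target_panel(metadata, columns, rows))  # prints []
```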

Bait BED File

A BED12-formatted file (12 columns, detailed below) containing the sequences and genomic coordinates for all baits in the target gene panel. Use this file to visualize the bait locations on genome browsers like IGV (Integrative Genomics Viewer) and the UCSC Genome Browser, or to perform custom analyses.

The following is a portion of an example BED12 file:

chrX    100636685       100636805       ENSG00000000003|TSPAN6|1        0       -       100636685       100636694       0       1       120     0
chrX    100635704       100636685       ENSG00000000003|TSPAN6|2        0       -       100635704       100636685       0       2       42,78   0,903
chrX    100635584       100635704       ENSG00000000003|TSPAN6|3        0       -       100635584       100635704       0       1       120     0

The columns of BED12 files we provide are as follows (adapted from UCSC Genome Browser documentation):

Column Name | Description
chromosome | Chromosome of the target gene.
chromStart | 0-based start coordinate of the targeted sequence on the chromosome.
chromEnd | 0-based non-inclusive end coordinate on the chromosome.
name | Bait ID as described above.
score | Set to 0 for all entries.
strand | + or - to indicate the strand of the targeted gene.
thickStart | The starting position at which the feature is drawn as a thick line in browsers (matches display of the corresponding transcript region).
thickEnd | The ending position at which the feature is drawn as a thick line in browsers (matches display of the corresponding transcript region).
itemRgb | Set to 0 for all entries.
blockCount | The number of blocks (continuous intervals).
blockSizes | Comma-separated list of the block sizes; contains blockCount entries.
blockStarts | Comma-separated list of block starts relative to the chromStart column; contains blockCount entries.

BED12 format was chosen because it allows baits that span splice junctions to be conveniently represented on a single line and allows genome browsers to visualize links between regions of baits that are discontinuous in genomic space. Browsers such as the UCSC Genome Browser and IGV render BED12 files in a manner visually similar to how transcripts in the genome are displayed.

This format is also well supported by command-line tools. For example, bedtools provides a -split flag for some subcommands so that the individual blocks within each line of a BED12 file are treated independently. This is useful when calculating intersections: you may want intersections with the regions covered by the baits themselves, rather than with the entire genomic interval that the bait coordinates span, which includes intronic regions. bedtools also provides the subcommand bed12tobed6 to convert BED12 files to BED6 format. In the resulting file, a bait that spans one or more splice junctions appears on multiple lines.
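
To make the block arithmetic concrete, the following sketch expands one BED12 line into its individual blocks, which is conceptually what happens when a tool treats blocks independently. It is an illustration only, not a replacement for bedtools. The function name `bed12_blocks` is hypothetical, and the sketch assumes whitespace-delimited fields with no whitespace inside the bait ID.

```python
def bed12_blocks(line: str):
    """Return one (chrom, start, end, name, strand) tuple per block of a BED12 line."""
    fields = line.split()
    if len(fields) != 12:
        raise ValueError(f"expected 12 fields, got {len(fields)}")
    chrom, name, strand = fields[0], fields[3], fields[5]
    chrom_start = int(fields[1])
    block_count = int(fields[9])
    # BED allows a trailing comma in these lists, so empty entries are skipped.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    if len(sizes) != block_count or len(starts) != block_count:
        raise ValueError("blockSizes/blockStarts do not match blockCount")
    # blockStarts are offsets relative to chromStart; coordinates stay 0-based, end-exclusive.
    return [
        (chrom, chrom_start + offset, chrom_start + offset + size, name, strand)
        for size, offset in zip(sizes, starts)
    ]


# Second line of the example above: a bait spanning a splice junction.
line = ("chrX 100635704 100636685 ENSG00000000003|TSPAN6|2 0 - "
        "100635704 100636685 0 2 42,78 0,903")
for block in bed12_blocks(line):
    print(block)
# prints ('chrX', 100635704, 100635746, 'ENSG00000000003|TSPAN6|2', '-')
# prints ('chrX', 100636607, 100636685, 'ENSG00000000003|TSPAN6|2', '-')
```

The two blocks cover 42 and 78 bases, which together make up the 120 bases of the bait; the 861 bases between them are intronic and are not covered by the bait.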

Gene Metadata File

A TSV file that lists all the genes included in the final panel design along with additional metadata.

The following is a portion of an example gene metadata file:

ensembl_id      gene_name       alternate_symbols       synonyms        description     mappability_flag    total_baits
ENSG00000121410 A1BG    -       A1B;ABG;GAB;HYST2477    alpha-1-B glycoprotein  FALSE   29
ENSG00000268895 A1BG-AS1        -       A1BG-AS;A1BGAS;NCRNA00181       A1BG antisense RNA 1    FALSE  27
ENSG00000148584 A1CF    -       ACF;ACF64;ACF65;APOBEC1CF;ASP   APOBEC1 complementation factor  FALSE  77
ENSG00000175899 A2M     -       A2MD;CPAMD5;FWP007;S863-7       alpha-2-macroglobulin   FALSE  50
ENSG00000245105 A2M-AS1 -       -       A2M antisense RNA 1     FALSE  24

Columns contain the following information:

Column Name | Description
ensembl_id | Ensembl ID for the gene as used in the Cell Ranger reference.
gene_name | Gene symbol/name for this gene as used in the Cell Ranger reference.
alternate_symbols | Semicolon-separated list of alternate gene symbols provided by NCBI, or - if none are provided or not applicable. These differ from synonyms in that they are annotated as official symbols by NCBI, whereas synonyms largely are not.
synonyms | Semicolon-separated list of synonyms provided by NCBI, or - if none are provided or not applicable.
description | Long-form description of this gene as provided by Ensembl, or - if not applicable.
mappability_flag | TRUE/FALSE to indicate whether or not a gene has low mappability or low bait coverage near transcript ends due to filtering of repetitive sequences.
total_baits | The number of baits designed to target this gene.

Files downloaded from 10x Genomics Custom Panel Designer for customized panels will additionally include the following columns:

Column Name | Description
user_input | Original gene name entered by the user, or - if not applicable (e.g. a gene that was already on a predesigned panel when creating an add-on design).
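
As one way to use the gene metadata file in a custom analysis, the sketch below reads the TSV, collects the genes whose mappability_flag is TRUE, and sums the total_baits column. The function name `summarize_gene_metadata` is hypothetical, and the sketch assumes that the file starts with the header row shown above and is tab-delimited.

```python
import csv
import io


def summarize_gene_metadata(handle):
    """Return (names of genes flagged for low mappability, total bait count)."""
    reader = csv.DictReader(handle, delimiter="\t")
    flagged = []
    total_baits = 0
    for row in reader:
        total_baits += int(row["total_baits"])
        if row["mappability_flag"].strip().upper() == "TRUE":
            flagged.append(row["gene_name"])
    return flagged, total_baits


# First two data rows of the example above, written with explicit tabs.
example = (
    "ensembl_id\tgene_name\talternate_symbols\tsynonyms\tdescription\t"
    "mappability_flag\ttotal_baits\n"
    "ENSG00000121410\tA1BG\t-\tA1B;ABG;GAB;HYST2477\talpha-1-B glycoprotein\tFALSE\t29\n"
    "ENSG00000268895\tA1BG-AS1\t-\tA1BG-AS;A1BGAS;NCRNA00181\tA1BG antisense RNA 1\tFALSE\t27\n"
)
print(summarize_gene_metadata(io.StringIO(example)))  # prints ([], 56)
```

To read a downloaded file instead, open it with `open(path, newline="")` and pass the resulting handle to the function.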

Downloads for Custom Panels Only

Bait Design File

Use this Excel (.xlsx) file to order your custom or add-on panel from a compatible oligo provider. It contains the bait sequences for add-on genes and custom sequences in a commonly accepted format.

Custom Sequences GTF/FASTA Files

For custom panels containing custom sequence designs, a FASTA file and a GTF file are automatically generated and available for download. These files can be used with Cell Ranger’s mkref command to make custom references (see documentation).
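
Appending the custom sequence files to the reference files, as described in the file list above, amounts to a plain concatenation. The following is a minimal sketch under several assumptions: all files are uncompressed text, the function name `concatenate` is hypothetical, and the file names in the usage comment are placeholders rather than the names of the actual downloads.

```python
def concatenate(parts, out_path):
    """Write the files in `parts` to `out_path` in order, leaving the inputs unmodified."""
    with open(out_path, "wb") as dst:
        for part in parts:
            last_byte = b"\n"
            with open(part, "rb") as src:
                # Copy in 1 MiB chunks so large genome FASTA files are not held in memory.
                for chunk in iter(lambda: src.read(1 << 20), b""):
                    dst.write(chunk)
                    last_byte = chunk[-1:]
            # Make sure the next file starts on its own line.
            if last_byte != b"\n":
                dst.write(b"\n")


# Placeholder file names, shown for illustration only:
# concatenate(["genome.fa", "custom_sequences.fa"], "genome_with_custom.fa")
# concatenate(["genes.gtf", "custom_sequences.gtf"], "genes_with_custom.gtf")
```

The combined FASTA and GTF files would then serve as the inputs to cellranger mkref. Writing to new output files rather than appending in place keeps the original reference intact.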