
10x Genomics
Chromium Single Cell Multiome ATAC + Gene Exp.

Specifying Input FASTQ Files for cellranger-arc count

The cellranger-arc count pipeline requires ATAC and GEX FASTQ files as input, which typically come from running cellranger-arc mkfastq, a 10x Genomics-aware convenience wrapper for bcl2fastq. However, it is possible to use FASTQ files from other sources, such as Illumina's bcl2fastq or BCL Convert, a published dataset, or bamtofastq. Input FASTQ files must conform to the naming conventions of bcl2fastq and mkfastq for cellranger-arc count to successfully complete. These files are specified using a libraries CSV file and passed to the cellranger-arc count pipeline using the --libraries argument.

There are multiple ways to invoke bcl2fastq, BCL Convert, and mkfastq, resulting in a wide range of potential output file names and locations. Because finding the right FASTQ files, and the right arguments to process them as desired, can be confusing, some common scenarios are illustrated below.

FASTQ file naming convention

To serve as inputs for Cell Ranger ARC, FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq described below.

GEX FASTQs

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

Where Read Type is one of I1, I2, R1, or R2, the read types that appear in the GEX example listings on this page.

ATAC FASTQs

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

Where Read Type is one of I1, R1, R2, or R3, the read types that appear in the ATAC example listings on this page.

Cell Ranger ARC will also accept ATAC FASTQs in this format:

Common scenarios for GEX FASTQ files


Where are your GEX FASTQ files?

How are your GEX FASTQ files named?

Scenario: My GEX FASTQs are in an output folder from mkfastq or bcl2fastq, in a subdirectory next to Reports and Stats folders, with expected sample name prefixes

How did I get here?

By running cellranger-arc mkfastq with a simple CSV layout file or Illumina Experiment Manager samplesheet, or by running bcl2fastq directly (with an IEM samplesheet) on a flow cell.

If you ran mkfastq on the GEX flow cell

Your files will be in a (MKFASTQ_ID)/outs/fastq_path folder, and the file hierarchy may look similar to this:

MKFASTQ_ID
|-- MAKE_FASTQS_CS
`-- outs
    |-- fastq_path
        |-- HFLC5BBXX
            |-- test_sample1
            |   |-- test_sample1_S1_L001_I1_001.fastq.gz
            |   |-- test_sample1_S1_L001_I2_001.fastq.gz
            |   |-- test_sample1_S1_L001_R1_001.fastq.gz
            |   |-- test_sample1_S1_L001_R2_001.fastq.gz
            |   |-- test_sample1_S1_L002_I1_001.fastq.gz
            |   |-- test_sample1_S1_L002_I2_001.fastq.gz
            |   |-- test_sample1_S1_L002_R1_001.fastq.gz
            |   |-- test_sample1_S1_L002_R2_001.fastq.gz
            |   |-- test_sample1_S1_L003_I1_001.fastq.gz
            |   |-- test_sample1_S1_L003_I2_001.fastq.gz
            |   |-- test_sample1_S1_L003_R1_001.fastq.gz
            |   `-- test_sample1_S1_L003_R2_001.fastq.gz
            |-- test_sample2
            |   |-- test_sample2_S2_L001_I1_001.fastq.gz
            |   |-- test_sample2_S2_L001_I2_001.fastq.gz
            |   |-- test_sample2_S2_L001_R1_001.fastq.gz
            |   |-- test_sample2_S2_L001_R2_001.fastq.gz
            |   |-- test_sample2_S2_L002_I1_001.fastq.gz
            |   |-- test_sample2_S2_L002_I2_001.fastq.gz
            |   |-- test_sample2_S2_L002_R1_001.fastq.gz
            |   |-- test_sample2_S2_L002_R2_001.fastq.gz
            |   |-- test_sample2_S2_L003_I1_001.fastq.gz
            |   |-- test_sample2_S2_L003_I2_001.fastq.gz
            |   |-- test_sample2_S2_L003_R1_001.fastq.gz
            |   `-- test_sample2_S2_L003_R2_001.fastq.gz
        |-- Reports
        |-- Stats
        |-- Undetermined_S0_L001_I1_001.fastq.gz
        ...
        `-- Undetermined_S0_L003_R2_001.fastq.gz

If you ran bcl2fastq directly on the GEX flow cell

Your file hierarchy may look similar to this:

BCL2FASTQ_OUTPUT_DIR
|-- HFLC5BBXX
    |-- test_sample1
    |   |-- test_sample1_S1_L001_I1_001.fastq.gz
    |   |-- test_sample1_S1_L001_I2_001.fastq.gz
    |   |-- test_sample1_S1_L001_R1_001.fastq.gz
    |   |-- test_sample1_S1_L001_R2_001.fastq.gz
    |   |-- test_sample1_S1_L002_I1_001.fastq.gz
    |   |-- test_sample1_S1_L002_I2_001.fastq.gz
    |   |-- test_sample1_S1_L002_R1_001.fastq.gz
    |   |-- test_sample1_S1_L002_R2_001.fastq.gz
    |   |-- test_sample1_S1_L003_I1_001.fastq.gz
    |   |-- test_sample1_S1_L003_I2_001.fastq.gz
    |   |-- test_sample1_S1_L003_R1_001.fastq.gz
    |   `-- test_sample1_S1_L003_R2_001.fastq.gz
    |-- test_sample2
    |   |-- test_sample2_S2_L001_I1_001.fastq.gz
    |   |-- test_sample2_S2_L001_I2_001.fastq.gz
    |   |-- test_sample2_S2_L001_R1_001.fastq.gz
    |   |-- test_sample2_S2_L001_R2_001.fastq.gz
    |   |-- test_sample2_S2_L002_I1_001.fastq.gz
    |   |-- test_sample2_S2_L002_I2_001.fastq.gz
    |   |-- test_sample2_S2_L002_R1_001.fastq.gz
    |   |-- test_sample2_S2_L002_R2_001.fastq.gz
    |   |-- test_sample2_S2_L003_I1_001.fastq.gz
    |   |-- test_sample2_S2_L003_I2_001.fastq.gz
    |   |-- test_sample2_S2_L003_R1_001.fastq.gz
    |   `-- test_sample2_S2_L003_R2_001.fastq.gz
...

You will have one set of FASTQ files per sample, prefixed with the name of the sample as it appears in the simple CSV layout file or IEM samplesheet.

For more information on the naming conventions, please visit Illumina's support site or refer to the bcl2fastq User Guide. The scenario where your files do not conform to the naming convention is described in a different section later on this page.
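As a quick check of the convention shown in the listings above, the sketch below parses a file name into its fields. It is a minimal illustration and not part of Cell Ranger ARC: the function name `parse_fastq_name` and the regular expression are assumptions derived from the `[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz` pattern on this page, with the sample order generalized to any number because the listings also show `S2` and `S0`.

```python
import re

# Assumed pattern: [Sample Name]_S<order>_L00<lane>_<read type>_001.fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<order>\d+)_L00(?P<lane>\d)_(?P<read>[IR]\d)_001\.fastq\.gz$"
)

def parse_fastq_name(filename):
    """Return the name fields as a dict, or None if the name does not match."""
    match = FASTQ_NAME.match(filename)
    return match.groupdict() if match else None

# A conforming name from the listings above, and a made-up non-conforming one.
print(parse_fastq_name("test_sample1_S1_L001_R2_001.fastq.gz"))
print(parse_fastq_name("lane1_read2.fastq.gz"))
```

A name that returns None here would likely need to be renamed before it can be used, as described in the later scenario on non-conforming names.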

The table below describes the line in the libraries CSV file you would use in the corresponding scenario. Be sure to substitute the capitalized text as appropriate. The "All Samples" entries in this table are provided for technical completeness.

Situation | Line in libraries CSV

All samples (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,,Gene Expression
    ...

All samples (mkfastq), multiple flow cells
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_FLOWCELL1/outs/fastq_path,,Gene Expression
    /PATH/TO/MKFASTQ_FLOWCELL2/outs/fastq_path,,Gene Expression
    ...

All samples (bcl2fastq direct)
    fastqs,sample,library_type
    /PATH/TO/BCL2FASTQ_OUTPUT_DIR,,Gene Expression
    ...

Process test_sample1 (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample1,Gene Expression
    ...

Process test_sample1 and test_sample2 as a single merged sample (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample1,Gene Expression
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample2,Gene Expression
    ...
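If you prefer to generate these lines rather than type them, a minimal sketch using Python's standard csv module is shown below. The helper name `libraries_csv` is hypothetical; the header and rows mirror the merged-sample entry in the table above, and an empty sample field produces the `,,` seen in the "All samples" entries.

```python
import csv
import io

def libraries_csv(rows):
    """Build libraries CSV text from (fastqs, sample, library_type) tuples.

    The sample field may be an empty string, as in the "All samples" entries.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["fastqs", "sample", "library_type"])
    writer.writerows(rows)
    return buffer.getvalue()

# Merge test_sample1 and test_sample2 from the same mkfastq output folder.
text = libraries_csv([
    ("/PATH/TO/MKFASTQ_ID/outs/fastq_path", "test_sample1", "Gene Expression"),
    ("/PATH/TO/MKFASTQ_ID/outs/fastq_path", "test_sample2", "Gene Expression"),
])
print(text)
```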

 

Scenario: My GEX FASTQs are in an output folder from mkfastq or bcl2fastq, in the same directory as the Reports and Stats folders

How did I get here?

An Illumina Experiment Manager-formatted samplesheet was used with either no entry or a blank entry for the Sample_Project column. Your hierarchy may look similar to this:

fastq_path
|-- Reports
|-- Stats
|-- test_sample_S1_L001_I1_001.fastq.gz
|-- test_sample_S1_L001_I2_001.fastq.gz
|-- test_sample_S1_L001_R1_001.fastq.gz
|-- test_sample_S1_L001_R2_001.fastq.gz
|-- test_sample_S1_L002_I1_001.fastq.gz
|-- test_sample_S1_L002_I2_001.fastq.gz
|-- test_sample_S1_L002_R1_001.fastq.gz
|-- test_sample_S1_L002_R2_001.fastq.gz
|-- test_sample_S1_L003_I1_001.fastq.gz
|-- test_sample_S1_L003_I2_001.fastq.gz
|-- test_sample_S1_L003_R1_001.fastq.gz
|-- test_sample_S1_L003_R2_001.fastq.gz
|-- Undetermined_S0_L001_I1_001.fastq.gz
...
`-- Undetermined_S0_L003_R2_001.fastq.gz

This is fine; you would use the same arguments as if the FASTQs were organized into subfolders within the output folder.

Situation | Line in libraries CSV

All samples (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,,Gene Expression
    ...

All samples (bcl2fastq direct)
    fastqs,sample,library_type
    /PATH/TO/BCL2FASTQ_OUTPUT_DIR,,Gene Expression
    ...

Process test_sample only (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample,Gene Expression
    ...

 

Scenario: The GEX FASTQs are named like "MySample_S1_L001_I1_001.fastq.gz". I don't see Reports or Stats anywhere

How did I get here?

It is likely that FASTQ files have been transferred from either a mkfastq or bcl2fastq run into another folder. They still retain the names assigned by bcl2fastq, which is a combination of sample name, sample order, lane, read type, and chunk. Your file hierarchy may look like this:

PROJECT_FOLDER
|-- MySample_S1_L001_I1_001.fastq.gz
|-- MySample_S1_L001_I2_001.fastq.gz
|-- MySample_S1_L001_R1_001.fastq.gz
|-- MySample_S1_L001_R2_001.fastq.gz
|-- MySample_S1_L002_I1_001.fastq.gz
|-- MySample_S1_L002_I2_001.fastq.gz
|-- MySample_S1_L002_R1_001.fastq.gz
|-- MySample_S1_L002_R2_001.fastq.gz

This is fine; since the files are named according to the bcl2fastq standard, you would use the same arguments as if the FASTQs were organized into a flow cell folder or mkfastq output folder.

Situation | Line in libraries CSV

All samples
    fastqs,sample,library_type
    /PATH/TO/PROJECT_FOLDER,,Gene Expression
    ...

Process MySample only
    fastqs,sample,library_type
    /PATH/TO/PROJECT_FOLDER,MySample,Gene Expression
    ...
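When files have been copied into a flat folder, it can help to list which sample-name prefixes are actually present before filling in the sample column. The sketch below is one hedged way to do that; `sample_names` is a hypothetical helper, and the suffix pattern is an assumption based on the file names shown above.

```python
import re

# Suffix that follows the sample name in the listings above (assumed pattern).
SUFFIX = re.compile(r"_S\d+_L00\d_[IR]\d_001\.fastq\.gz$")

def sample_names(filenames):
    """Return the sorted, de-duplicated sample-name prefixes in `filenames`."""
    names = set()
    for name in filenames:
        match = SUFFIX.search(name)
        if match:
            names.add(name[: match.start()])
    return sorted(names)

files = [
    "MySample_S1_L001_I1_001.fastq.gz",
    "MySample_S1_L001_R1_001.fastq.gz",
    "MySample_S1_L002_R2_001.fastq.gz",
    "notes.txt",  # ignored: does not follow the naming convention
]
print(sample_names(files))
```

On a real folder, the list could come from `os.listdir("/PATH/TO/PROJECT_FOLDER")`.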
 

Scenario: My GEX FASTQs are not named like any of the above examples

How did I get here?

It is likely that you received files that were processed through a proprietary LIMS, which employs its own naming conventions.

10x Genomics pipelines require files to be named in the bcl2fastq convention in order to run properly. You will need to determine the corresponding sample and read type for each file, likely by consulting your sequencing core or the individual who demultiplexed your flow cell.

It is highly likely that these files were initially processed with bcl2fastq. Once you have traced the origin of each file, rename it to the following format:

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

Where Read Type is one of I1, I2, R1, or R2, the read types that appear in the GEX example listings on this page.

After the files have been renamed to the specified format, use the following lines in the libraries CSV:

Situation | Line in libraries CSV

All samples
    fastqs,sample,library_type
    /PATH/TO/PROJECT_FOLDER,,Gene Expression
    ...

Process SAMPLENAME only
    fastqs,sample,library_type
    /PATH/TO/PROJECT_FOLDER,SAMPLENAME,Gene Expression
    ...
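The renaming step described above can be sketched as a dry run that only prints the planned names. Everything in the `plan` mapping is hypothetical: the original file names are invented, and the sample, lane, and read type for each file must come from your sequencing core or whoever demultiplexed the flow cell.

```python
def bcl2fastq_name(sample, lane, read_type):
    """Build [Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz."""
    return f"{sample}_S1_L00{lane}_{read_type}_001.fastq.gz"

# Hypothetical mapping: original file name -> (sample name, lane, read type).
plan = {
    "run42_laneA_fwd.fq.gz": ("SAMPLENAME", 1, "R1"),
    "run42_laneA_rev.fq.gz": ("SAMPLENAME", 1, "R2"),
}

# Dry run: show the planned renames without touching any files.
for old_name, fields in plan.items():
    print(old_name, "->", bcl2fastq_name(*fields))
```

Once the printed plan looks right, the same loop could call `os.rename` on the real paths instead of printing.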

Common scenarios for ATAC FASTQ files


Where are your ATAC FASTQ files?

How are your ATAC FASTQ files named?

Scenario: My ATAC FASTQs are in an output folder from mkfastq or bcl2fastq, in a subdirectory next to Reports and Stats folders, with expected sample name prefixes

How did I get here?

By running cellranger-arc mkfastq with a simple CSV layout file or Illumina Experiment Manager samplesheet, or by running bcl2fastq directly (with an IEM samplesheet) on a flow cell.

If you ran mkfastq on the ATAC flow cell

Your files will be in a (MKFASTQ_ID)/outs/fastq_path folder, and your file hierarchy may look similar to this:

MKFASTQ_ID
|-- MAKE_FASTQS_CS
`-- outs
    |-- fastq_path
        |-- HFLC5BBXX
            |-- test_sample1
            |   |-- test_sample1_S1_L001_I1_001.fastq.gz
            |   |-- test_sample1_S1_L001_R1_001.fastq.gz
            |   |-- test_sample1_S1_L001_R2_001.fastq.gz
            |   |-- test_sample1_S1_L001_R3_001.fastq.gz
            |   |-- test_sample1_S1_L002_I1_001.fastq.gz
            |   |-- test_sample1_S1_L002_R1_001.fastq.gz
            |   |-- test_sample1_S1_L002_R2_001.fastq.gz
            |   |-- test_sample1_S1_L002_R3_001.fastq.gz
            |   |-- test_sample1_S1_L003_I1_001.fastq.gz
            |   |-- test_sample1_S1_L003_R1_001.fastq.gz
            |   |-- test_sample1_S1_L003_R2_001.fastq.gz
            |   `-- test_sample1_S1_L003_R3_001.fastq.gz
            |-- test_sample2
            |   |-- test_sample2_S2_L001_I1_001.fastq.gz
            |   |-- test_sample2_S2_L001_R1_001.fastq.gz
            |   |-- test_sample2_S2_L001_R2_001.fastq.gz
            |   |-- test_sample2_S2_L001_R3_001.fastq.gz
            |   |-- test_sample2_S2_L002_I1_001.fastq.gz
            |   |-- test_sample2_S2_L002_R1_001.fastq.gz
            |   |-- test_sample2_S2_L002_R2_001.fastq.gz
            |   |-- test_sample2_S2_L002_R3_001.fastq.gz
            |   |-- test_sample2_S2_L003_I1_001.fastq.gz
            |   |-- test_sample2_S2_L003_R1_001.fastq.gz
            |   |-- test_sample2_S2_L003_R2_001.fastq.gz
            |   `-- test_sample2_S2_L003_R3_001.fastq.gz
        |-- Reports
        |-- Stats
        |-- Undetermined_S0_L001_I1_001.fastq.gz
        ...
        `-- Undetermined_S0_L003_R3_001.fastq.gz

If you ran bcl2fastq directly on the ATAC flow cell

Your file hierarchy may look similar to this:

BCL2FASTQ_OUTPUT_DIR
|-- HFLC5BBXX
    |-- test_sample1
    |   |-- test_sample1_S1_L001_I1_001.fastq.gz
    |   |-- test_sample1_S1_L001_R1_001.fastq.gz
    |   |-- test_sample1_S1_L001_R2_001.fastq.gz
    |   |-- test_sample1_S1_L001_R3_001.fastq.gz
    |   |-- test_sample1_S1_L002_I1_001.fastq.gz
    |   |-- test_sample1_S1_L002_R1_001.fastq.gz
    |   |-- test_sample1_S1_L002_R2_001.fastq.gz
    |   |-- test_sample1_S1_L002_R3_001.fastq.gz
    |   |-- test_sample1_S1_L003_I1_001.fastq.gz
    |   |-- test_sample1_S1_L003_R1_001.fastq.gz
    |   |-- test_sample1_S1_L003_R2_001.fastq.gz
    |   `-- test_sample1_S1_L003_R3_001.fastq.gz
    |-- test_sample2
    |   |-- test_sample2_S2_L001_I1_001.fastq.gz
    |   |-- test_sample2_S2_L001_R1_001.fastq.gz
    |   |-- test_sample2_S2_L001_R2_001.fastq.gz
    |   |-- test_sample2_S2_L001_R3_001.fastq.gz
    |   |-- test_sample2_S2_L002_I1_001.fastq.gz
    |   |-- test_sample2_S2_L002_R1_001.fastq.gz
    |   |-- test_sample2_S2_L002_R2_001.fastq.gz
    |   |-- test_sample2_S2_L002_R3_001.fastq.gz
    |   |-- test_sample2_S2_L003_I1_001.fastq.gz
    |   |-- test_sample2_S2_L003_R1_001.fastq.gz
    |   |-- test_sample2_S2_L003_R2_001.fastq.gz
    |   `-- test_sample2_S2_L003_R3_001.fastq.gz
...

You will have one set of FASTQ files per sample, prefixed with the name of the sample as it appears in the simple CSV layout file or IEM samplesheet. Other situations described later on this page deal with the presence of four separate sets of files (four "samples" from bcl2fastq's point of view) per single biological sample/library.

For more information on the naming conventions, please visit Illumina's support site or refer to the bcl2fastq User Guide. The scenario where your files do not conform to the naming convention is described in a different section later on this page.

The table below describes the line in the libraries CSV file you would use in the corresponding scenario. Be sure to substitute the capitalized text as appropriate. The "All Samples" entries in this table are provided for technical completeness.

Situation | Line in libraries CSV

All samples (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,,Chromatin Accessibility
    ...

All samples (mkfastq), multiple flow cells
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_FLOWCELL1/outs/fastq_path,,Chromatin Accessibility
    /PATH/TO/MKFASTQ_FLOWCELL2/outs/fastq_path,,Chromatin Accessibility
    ...

All samples (bcl2fastq direct)
    fastqs,sample,library_type
    /PATH/TO/BCL2FASTQ_OUTPUT_DIR,,Chromatin Accessibility
    ...

Process test_sample1 (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample1,Chromatin Accessibility
    ...

Process test_sample1 and test_sample2 as a single merged sample (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample1,Chromatin Accessibility
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample2,Chromatin Accessibility
    ...
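Because cellranger-arc count takes both GEX and ATAC FASTQs, a complete libraries CSV combines a Gene Expression line with a Chromatin Accessibility line. The sketch below shows one such file and assembles a command line around it. Several things here are assumptions rather than content from this page: the folder names GEX_MKFASTQ_ID and ATAC_MKFASTQ_ID, the RUN_ID and reference placeholders, the use of the same sample name on both flow cells, and the `--id` and `--reference` options (only `--libraries` is described above; check `cellranger-arc count --help` for the options in your version).

```python
import csv
import io
import shlex

# Hypothetical libraries CSV with one line per library type.
LIBRARIES_CSV = """\
fastqs,sample,library_type
/PATH/TO/GEX_MKFASTQ_ID/outs/fastq_path,test_sample1,Gene Expression
/PATH/TO/ATAC_MKFASTQ_ID/outs/fastq_path,test_sample1,Chromatin Accessibility
"""

# Sanity check: both library types are present before launching a run.
rows = list(csv.DictReader(io.StringIO(LIBRARIES_CSV)))
library_types = {row["library_type"] for row in rows}
assert library_types == {"Gene Expression", "Chromatin Accessibility"}

# Assemble the command; all values except --libraries are placeholders.
command = [
    "cellranger-arc", "count",
    "--id=RUN_ID",                       # assumed option, placeholder value
    "--reference=/PATH/TO/REFERENCE",    # assumed option, placeholder value
    "--libraries=/PATH/TO/libraries.csv",
]
print(shlex.join(command))
```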

 

Scenario: My ATAC FASTQs are in an output folder from mkfastq or bcl2fastq, but there are multiple folders per sample index, like "SI-GA-A1_1" and "SI-GA-A1_2"

How did I get here?

It is likely that the input samplesheet used explicitly separated the four oligos in a 10x Genomics sample index set into four separate sample names. You may see a file hierarchy similar to this:

bcl2fastq_output
|-- HFLC5BBXX
    |-- SI-GA-A1_1
    |   |-- SI-GA-A1_1_S1_L001_I1_001.fastq.gz
    |   |-- SI-GA-A1_1_S1_L001_R1_001.fastq.gz
    |   |-- SI-GA-A1_1_S1_L001_R2_001.fastq.gz
    |   `-- SI-GA-A1_1_S1_L001_R3_001.fastq.gz
    |-- SI-GA-A1_2
    |   |-- SI-GA-A1_2_S2_L001_I1_001.fastq.gz
    |   |-- SI-GA-A1_2_S2_L001_R1_001.fastq.gz
    |   |-- SI-GA-A1_2_S2_L001_R2_001.fastq.gz
    |   `-- SI-GA-A1_2_S2_L001_R3_001.fastq.gz
    |-- SI-GA-A1_3
    |   |-- SI-GA-A1_3_S3_L001_I1_001.fastq.gz
    |   |-- SI-GA-A1_3_S3_L001_R1_001.fastq.gz
    |   |-- SI-GA-A1_3_S3_L001_R2_001.fastq.gz
    |   `-- SI-GA-A1_3_S3_L001_R3_001.fastq.gz
    |-- SI-GA-A1_4
    |   |-- SI-GA-A1_4_S4_L001_I1_001.fastq.gz
    |   |-- SI-GA-A1_4_S4_L001_R1_001.fastq.gz
    |   |-- SI-GA-A1_4_S4_L001_R2_001.fastq.gz
    |   `-- SI-GA-A1_4_S4_L001_R3_001.fastq.gz
|-- Reports
|-- Stats
|-- Undetermined_S0_L001_I1_001.fastq.gz
|-- Undetermined_S0_L001_R1_001.fastq.gz
|-- Undetermined_S0_L001_R2_001.fastq.gz
`-- Undetermined_S0_L001_R3_001.fastq.gz

You probably want to merge all samples from the SI-GA-A1 index into a single analysis. If you only run one index at a time, you will see a smaller number of reads than expected, which may translate to lower than expected coverage or cell count for the experiment.

Situation | Line in libraries CSV

All samples (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,,Chromatin Accessibility
    ...

Process all SI-GA-A1 reads in a single analysis
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_1,Chromatin Accessibility
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_2,Chromatin Accessibility
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_3,Chromatin Accessibility
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_4,Chromatin Accessibility
    ...

Only process first sample index
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_1,Chromatin Accessibility
    ...
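The four lines for a split sample index set follow a regular pattern, so they can be generated. The helper below is a hypothetical sketch, not a Cell Ranger ARC feature; it assumes the four oligos were named with the suffixes _1 through _4, as in the listing above.

```python
def sample_index_rows(fastq_path, index_set,
                      library_type="Chromatin Accessibility", oligos=4):
    """Return one libraries CSV line per oligo of a split sample index set."""
    return [
        f"{fastq_path},{index_set}_{number},{library_type}"
        for number in range(1, oligos + 1)
    ]

rows = sample_index_rows("/PATH/TO/MKFASTQ_ID/outs/fastq_path", "SI-GA-A1")
print("fastqs,sample,library_type")
print("\n".join(rows))
```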

 

Scenario: My ATAC FASTQs are in an output folder from mkfastq or bcl2fastq, in the same directory as the Reports and Stats folders

How did I get here?

An Illumina Experiment Manager-formatted samplesheet was used with either no entry or a blank entry for the Sample_Project column. Your hierarchy may look similar to this:

fastq_path
|-- Reports
|-- Stats
|-- test_sample_S1_L001_I1_001.fastq.gz
|-- test_sample_S1_L001_R1_001.fastq.gz
|-- test_sample_S1_L001_R2_001.fastq.gz
|-- test_sample_S1_L001_R3_001.fastq.gz
|-- test_sample_S1_L002_I1_001.fastq.gz
|-- test_sample_S1_L002_R1_001.fastq.gz
|-- test_sample_S1_L002_R2_001.fastq.gz
|-- test_sample_S1_L002_R3_001.fastq.gz
|-- test_sample_S1_L003_I1_001.fastq.gz
|-- test_sample_S1_L003_R1_001.fastq.gz
|-- test_sample_S1_L003_R2_001.fastq.gz
|-- test_sample_S1_L003_R3_001.fastq.gz
|-- Undetermined_S0_L001_I1_001.fastq.gz
...
`-- Undetermined_S0_L003_R3_001.fastq.gz

This is fine; you would use the same arguments as if the FASTQs were organized into subfolders within the output folder.

Situation | Line in libraries CSV

All samples (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,,Chromatin Accessibility
    ...

All samples (bcl2fastq direct)
    fastqs,sample,library_type
    /PATH/TO/BCL2FASTQ_OUTPUT_DIR,,Chromatin Accessibility
    ...

Process test_sample only (mkfastq)
    fastqs,sample,library_type
    /PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample,Chromatin Accessibility
    ...

 

Scenario: The ATAC FASTQs are named like "MySample_S1_L001_I1_001.fastq.gz". I don't see Reports or Stats anywhere

How did I get here?

It is likely that FASTQ files have been transferred from either a mkfastq or bcl2fastq run into another folder. They still retain the names assigned by bcl2fastq, which is a combination of sample name, sample order, lane, read type, and chunk. Your file hierarchy may look similar to this:

PROJECT_FOLDER
|-- MySample_S1_L001_I1_001.fastq.gz
|-- MySample_S1_L001_I2_001.fastq.gz
|-- MySample_S1_L001_R1_001.fastq.gz
|-- MySample_S1_L001_R2_001.fastq.gz
|-- MySample_S1_L002_I1_001.fastq.gz
|-- MySample_S1_L002_I2_001.fastq.gz
|-- MySample_S1_L002_R1_001.fastq.gz
|-- MySample_S1_L002_R2_001.fastq.gz

This is fine; since the files are named according to the bcl2fastq standard, you would use the same arguments as if the FASTQs were organized into a flow cell folder or mkfastq output folder.

Situation | Line in libraries CSV

All samples
    fastqs,sample,library_type
    /PATH/TO/PROJECT_FOLDER,,Chromatin Accessibility
    ...

Process MySample only
    fastqs,sample,library_type
    /PATH/TO/PROJECT_FOLDER,MySample,Chromatin Accessibility
    ...
 

Scenario: My ATAC FASTQs are not named like any of the above examples

How did I get here?

It is likely that you received files that were processed through a proprietary LIMS, which employs its own naming conventions.

10x Genomics pipelines require files to be named in the bcl2fastq convention in order to run properly. You will need to determine the corresponding sample and read type for each file, likely by consulting your sequencing core or the individual who demultiplexed your flow cell.

It is highly likely that these files were initially processed with bcl2fastq. Once you have traced their origin, rename the files to one of the following formats:

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

Where Read Type is one of I1, R1, R2, or R3, the read types that appear in the ATAC example listings on this page.

Alternatively, Cell Ranger ARC will also accept ATAC FASTQs in this format:

After you have renamed the files to one of these formats, use the following lines in the libraries CSV:

Situation | Line in libraries CSV

All samples
    fastqs,sample,library_type
    /PATH/TO/PROJECT_FOLDER,,Chromatin Accessibility
    ...

Process SAMPLENAME only
    fastqs,sample,library_type
    /PATH/TO/PROJECT_FOLDER,SAMPLENAME,Chromatin Accessibility
    ...