Question

Cellranger count command for single end fastq

4.6 years ago

sidrah.maryam ▴ 70

Hello everyone, I have single-end FASTQ files that were processed using cellranger. I now need to align them using the cellranger count function, but because they are single-end, the files do not follow the standard naming convention. Is there a way to do this? I tried renaming the file to follow the SRR9320581_S1_L001_R1_001.fastq.gz convention, but the run ended with a "pipestance failed" error.

Any help is highly appreciated. Thank you

RNA-Seq sequencing • 4.6k views

updated 2.4 years ago by Ram 45k • written 4.6 years ago by sidrah.maryam ▴ 70

cellranger count needs files in a specific format. There are three separate files (if you have > 1 sample): the R1 file contains the cell barcodes and UMIs, the I1 file contains the Illumina index for the sample, and the R2 file contains the actual reads. It appears that these investigators may have submitted the FASTQ files in a concatenated format. Try separating the reads using the --split-files option of fastq-dump.
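As a rough sketch of that suggestion: after running fastq-dump with --split-files, the per-read files still have to be renamed to the pattern cellranger count looks for. The helper below, `rename_for_cellranger`, is a hypothetical function written for illustration, not part of sra-tools or Cell Ranger. It assumes the split files are named `<run>_1.fastq.gz`, `<run>_2.fastq.gz`, and so on, and that you have already worked out which file holds which read (for example by inspecting read lengths); that mapping depends on how the data was submitted to SRA.

```shell
# Download step (needs sra-tools; run it yourself, it is not executed here):
#   fastq-dump --split-files --gzip SRR9320581

# rename_for_cellranger RUN DIR TAG1 [TAG2 [TAG3]]
# Hypothetical helper: renames DIR/RUN_n.fastq.gz to DIR/RUN_S1_L001_TAGn_001.fastq.gz.
# TAGn is the read type (I1, R1 or R2) you have decided RUN_n.fastq.gz contains.
rename_for_cellranger() {
    local run="$1" dir="$2"
    shift 2
    local n=1 tag src
    for tag in "$@"; do
        src="${dir}/${run}_${n}.fastq.gz"
        if [ ! -f "$src" ]; then
            echo "missing: $src" >&2
            return 1
        fi
        mv -- "$src" "${dir}/${run}_S1_L001_${tag}_001.fastq.gz"
        n=$((n + 1))
    done
}

# Example call, assuming (unverified) that _1 is the barcode/UMI read and _2 the cDNA read:
#   rename_for_cellranger SRR9320581 . R1 R2
```

If --split-files produces only a single file, the barcode/UMI read was probably not deposited separately, and renaming alone will not make the data usable with cellranger count.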

4.6 years ago by GenoMax 152k

Okay, I will try that. Thanks a lot.

4.6 years ago by sidrah.maryam ▴ 70

I have the same issue. Did --split-files solve your problem?

4.4 years ago by g.golczer ▴ 10

Just my two cents here. In my case, cellranger count was failing because R1 was in the wrong format. I also had single-end reads, where R1 was supposed to hold the cell barcode and UMI and R2 the single-end reads. I had to rerun cellranger mkfastq from the raw BCLs. Note that the I1 file mentioned above is optional for cellranger count, so don't worry if you don't have it.
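One quick way to spot an R1 file in the wrong format is to look at its read length. The sketch below uses two hypothetical helpers, `first_read_length` and `r1_long_enough`, written for illustration only. It relies on the 10x 3' layout of a 16 nt cell barcode followed by a 10 nt UMI (v2 chemistry, 26 nt in total) or a 12 nt UMI (v3 chemistry, 28 nt in total); other chemistries may differ, so treat the thresholds as assumptions. Passing the check does not prove the file is a valid barcode read, but an R1 shorter than the minimum cannot contain barcode plus UMI.

```shell
# first_read_length FILE
# Prints the length of the first sequence in a gzipped FASTQ file.
first_read_length() {
    gzip -dc -- "$1" | awk 'NR == 2 { print length($0); exit }'
}

# r1_long_enough FILE MIN
# Succeeds if the first read in FILE is at least MIN nt long
# (assumed minimums: 26 for 10x v2 chemistry, 28 for v3).
r1_long_enough() {
    local len
    len=$(first_read_length "$1")
    [ "$len" -ge "$2" ]
}

# Example (paths and sample name are placeholders, not taken from the thread):
#   r1_long_enough fastqs/SRR9320581_S1_L001_R1_001.fastq.gz 26 || echo "R1 too short"
#   cellranger count --id=run_SRR9320581 \
#       --transcriptome=/path/to/reference \
#       --fastqs=fastqs \
#       --sample=SRR9320581
```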

In my experience the --split-files option for fastq-dump is only relevant if you're downloading data from SRA, not if you have your raw data in-house.

2.4 years ago by akh8zm • 0

"In my experience the --split-files option for fastq-dump is only relevant if you're downloading data from SRA, not if you have your raw data in-house."

Please read the question: the OP's files have SRR identifiers, so their data is from SRA. That is why GenoMax mentioned --split-files.

2.4 years ago by Ram 45k