11 months ago
Researcher
I am trying to analyze a publicly available dataset on SRA. For that, I have to map FASTQ files to the reference genome. When I did this before, I downloaded the FASTQ files from Google Cloud and could map them to the reference genome using CellRanger. However, here my only options are to download them from EBI or to use the SRA Toolkit, and when I then run CellRanger I get a naming convention error. What other tools are available besides CellRanger to map single-cell RNA-seq reads to the reference genome using FASTQ files downloaded from EBI?
alevin-fry and kallisto bustools are popular alternatives.
I second salmon and kallisto. They work directly with fastq files.
Rename the files. Or better, use softlinks. What's the problem with that?
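A minimal sketch of the softlink approach, in case it helps. It assumes the downloaded files are named `<run>_1.fastq.gz` and `<run>_2.fastq.gz`, and that `_1` is the barcode/UMI read (R1) and `_2` is the cDNA read (R2); that mapping is an assumption you should check against read lengths for your dataset. The helper names `cellranger_name` and `link_fastqs` are made up for illustration. The target pattern follows the `[Sample]_S1_L00[Lane]_[Read Type]_001.fastq.gz` convention that CellRanger expects.

```python
import os
import re
from pathlib import Path

# Assumed input naming: <run>_<1|2>.fastq.gz (e.g. SRR12654354_1.fastq.gz).
RUN_FILE = re.compile(r"^(?P<run>[A-Za-z0-9]+)_(?P<read>[12])\.fastq\.gz$")


def cellranger_name(filename, lane=1):
    """Map '<run>_<n>.fastq.gz' to '<run>_S1_L00<lane>_R<n>_001.fastq.gz'."""
    match = RUN_FILE.match(filename)
    if match is None:
        raise ValueError(f"unexpected FASTQ name: {filename}")
    return f"{match.group('run')}_S1_L{lane:03d}_R{match.group('read')}_001.fastq.gz"


def link_fastqs(src_dir, dst_dir):
    """Create symlinks in dst_dir that point at the original files in src_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for src in sorted(src_dir.glob("*.fastq.gz")):
        if RUN_FILE.match(src.name) is None:
            continue  # skip files that do not follow the assumed naming
        dst = dst_dir / cellranger_name(src.name)
        if not (dst.exists() or dst.is_symlink()):
            os.symlink(src.resolve(), dst)  # originals stay untouched
        links.append(dst)
    return links
```

Calling `link_fastqs("raw", "fastq")` would leave the downloads in `raw` as they are and give CellRanger a `fastq` directory with conforming names, using the run accession as the sample name.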
I have tried doing this before, but it did not work. I still got a header mismatch error.
Can you show me an example entry where you ran into this error?
Yes, here is the error:
Log message:
Please share the GEO/EBI ID of these FASTQ files.
I am analyzing the scRNA-seq data of PRJNA657088. The SRA accession codes are: SRR12654354, SRR12654355, SRR12654356, SRR12654367, SRR12654378, SRR12654379. I tried to get them from AWS, which requires creating a bucket and granting permissions on it. I did that, but the "create a data delivery order" step on the NCBI website still fails with an error about permissions not being given to the bucket.
After dumping a couple of reads for one of these I don't see any problems with mismatches.
Please also share the exact CellRanger command you're using.
This and "naming convention" are two separate errors. If you are getting a header mismatch, then your reads are likely out of sync in the R1/R2 files.
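To confirm whether R1 and R2 really are out of sync, something like the following could be used. It is only a sketch: it assumes gzipped four-line FASTQ records and compares the first whitespace-delimited token of each header, ignoring a trailing `/1` or `/2`. The function names are hypothetical, and CellRanger's own check may apply different rules.

```python
import gzip
from itertools import zip_longest


def read_id(header):
    """Return the read name from a FASTQ header line, without '@' or '/1', '/2'."""
    fields = header.strip().split()
    token = fields[0] if fields else ""
    if token.startswith("@"):
        token = token[1:]
    if token.endswith("/1") or token.endswith("/2"):
        token = token[:-2]
    return token


def first_header_mismatch(r1_path, r2_path):
    """Return the 1-based line number of the first disagreement, or None if in sync."""
    with gzip.open(r1_path, "rt") as r1, gzip.open(r2_path, "rt") as r2:
        for index, (line1, line2) in enumerate(zip_longest(r1, r2)):
            if line1 is None or line2 is None:
                return index + 1  # one file has more lines than the other
            # Header lines are every fourth line, starting with the first.
            if index % 4 == 0 and read_id(line1) != read_id(line2):
                return index + 1
    return None
```

A return value of `None` means every pair of headers agreed; any number points at the first line where the two files diverge.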
I am getting this error message: Log message: FASTQ header mismatch detected at line 4 of input files "fastq/sample-Barcode/sample-Barcode_S4_L001_R1_001.fastq.gz" and "fastq/sample-Barcode/sample-Barcode_S4_L001_R2_001.fastq.gz": file: "fastq/sample-Barcode/sample-Barcode_S4_L001_R1_001.fastq.gz", line: 4 If it is because the reads are out of sync in R1/R2, how can I fix this?
You can use repair.sh from the BBMap suite to bring the reads back in sync. Here is an example command line: How to resync paired-end data?
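For reference, a rough sketch of how the repair.sh call might be wrapped in a script. It assumes BBMap is installed and `repair.sh` is on your `PATH`; the `in1`/`in2`/`out1`/`out2`/`outs` parameters are the ones I understand repair.sh to accept, so please check them against the help text of your installed version. The output file names and the two function names are made up for illustration.

```python
import shutil
import subprocess


def repair_command(r1, r2, out_prefix):
    """Build a repair.sh command that re-pairs r1/r2 and sets aside singletons."""
    return [
        "repair.sh",
        f"in1={r1}",
        f"in2={r2}",
        f"out1={out_prefix}_R1.fastq.gz",           # re-synced R1 (hypothetical name)
        f"out2={out_prefix}_R2.fastq.gz",           # re-synced R2 (hypothetical name)
        f"outs={out_prefix}_singletons.fastq.gz",   # reads whose mate is missing
    ]


def run_repair(r1, r2, out_prefix):
    """Run repair.sh if it is available; raise a clear error otherwise."""
    if shutil.which("repair.sh") is None:
        raise FileNotFoundError("repair.sh not found on PATH (is BBMap installed?)")
    subprocess.run(repair_command(r1, r2, out_prefix), check=True)
```

The repaired files would still need names that follow the CellRanger convention before being passed to it.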