3 months ago
Vojtěch
Hello,
GSE147528 has 20 samples, which are reported to have a paired-end layout. In my experience (I'm a newbie in RNA-seq analysis), paired-end runs follow a typical naming scheme (*_R1.fastq and *_R2.fastq, or *_1.fastq and *_2.fastq), and all of my pipelines rely on these names. However, the experiment mentioned above has only one SRR/fastq file per sample, despite being paired-end.
- Is there a way to convert one fastq file into the typical two separate files, so I don't have to edit my pipelines?
- Is this normal for snRNAseq or is it just this experiment?
Thanks in advance.
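For reference, a minimal sketch of splitting one FASTQ file into the usual _1/_2 pair. It assumes the single file is interleaved, meaning each read 1 record is immediately followed by its read 2 mate. That is an assumption and is not confirmed for GSE147528. The function names and the output prefix are hypothetical.

```python
import itertools


def read_fastq(handle):
    """Yield FASTQ records as 4-tuples: header, sequence, separator, quality."""
    while True:
        record = tuple(line.rstrip("\n") for line in itertools.islice(handle, 4))
        if not record:
            return  # clean end of file
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def deinterleave(interleaved_path, out_prefix):
    """Write alternating records to <out_prefix>_1.fastq and <out_prefix>_2.fastq.

    Assumes (not verified here) that mates are stored back to back.
    Returns the number of read pairs written.
    """
    pairs = 0
    with open(interleaved_path) as src, \
            open(f"{out_prefix}_1.fastq", "w") as r1, \
            open(f"{out_prefix}_2.fastq", "w") as r2:
        records = read_fastq(src)
        for first in records:
            second = next(records, None)
            if second is None:
                raise ValueError("odd number of records; file is not interleaved")
            r1.write("\n".join(first) + "\n")
            r2.write("\n".join(second) + "\n")
            pairs += 1
    return pairs
```

Calling deinterleave("sample.fastq", "sample") would write sample_1.fastq and sample_2.fastq. If the file is not interleaved, the output will be meaningless, so it is worth checking the read headers first.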
Thank you for your answer. If I may, I have two follow-up questions:
1) I run most of my analysis in R, in RStudio. I have a script that downloads my files from the ENA browser. Is there some other non-Linux way to get these files, or is fastq-dump the only go-to way?
2) In my normal situation, where there are 2 fastq files, both contain cDNA sequences in FASTQ format, one forward and one reverse, right? So if you are saying that out of these 3 fastq files only the 3rd contains cDNA, how does this fit into a pipeline that works with the classical _1.fastq/_2.fastq format?
I am sorry if these questions seem too trivial, but I would like to understand it. I appreciate your time, thank you.
That is one way to get the data. If you don't intend to re-analyze the data you may be able to get count files from that GEO record. But this won't be following your standard script.
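As a platform-independent way to check what ENA offers for a run, here is a rough sketch that builds an ENA Portal API file report URL and parses the returned table. The endpoint and the field names (run_accession, fastq_ftp) are written from memory and should be checked against the ENA documentation. The helper names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed ENA Portal API endpoint for per-run file listings.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build a file report URL for a run or study accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    }
    return f"{ENA_FILEREPORT}?{urlencode(params)}"


def parse_filereport(tsv_text):
    """Map each run accession to its list of FASTQ links.

    Assumes multiple links in the fastq_ftp column are separated by ';'.
    """
    runs = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        runs[row["run_accession"]] = [u for u in row["fastq_ftp"].split(";") if u]
    return runs
```

The text of the report can be fetched with urllib.request.urlopen(filereport_url(...)).read().decode() on any operating system. The number of links per run shows whether ENA lists one file or several for each sample.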
Correct, for bulk RNA-seq that would be the case. These data, on the other hand, are single-nucleus RNA-seq, which is a different technology, so you would not be able to use your standard analysis methods with them. You will need to use single-cell-specific tools such as Seurat, STARsolo, etc.
Ok, I understand now. Thank you so much for helping me!