I downloaded an SRA file from NCBI for my organism of interest. It is Illumina paired-end RNA-seq data. Normally Trinity requires separate forward and reverse read files to build an assembly. However, the downloaded file has no separate forward and reverse reads. It appears to be a merged file; the _1 and _2 suffixes suggest that.
How can I split the forward and reverse reads if the file is merged? Is there any other way to get such data?
Thanks
How did you process the SRA file? Did you use the --split-files and -F options with fastq-dump to split the two read files and recover the original Illumina fastq headers? Post the SRA # if you want someone to check on it.
I have not processed it yet. I just downloaded the SRA file through Galaxy. On viewing the file it looks like this
If you are limited to working in Galaxy then I don't know the option you should use off the top of my head, but make sure to choose split-files if that is available. Otherwise this is simple to take care of using reformat.sh from the BBMap suite, but that will have to be done on the command line:
reformat.sh in=SRA.fq out1=R1.fq out2=R2.fq
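If installing BBMap is not an option, the same split can be roughly sketched in plain Python. This is only a minimal sketch and it assumes the merged file is truly interleaved, i.e. 4-line FASTQ records that strictly alternate forward, reverse, forward, reverse. If each record instead holds both mates concatenated into one sequence line, this will not help. The function name deinterleave_fastq is made up for illustration and is not part of BBMap or any other package.

```python
from itertools import islice


def deinterleave_fastq(interleaved_path, r1_path, r2_path):
    """Split an interleaved FASTQ file into forward (R1) and reverse (R2) files.

    Assumption: records are exactly 4 lines each and alternate R1, R2, R1, R2, ...
    Returns the number of read pairs written.
    """
    pairs = 0
    with open(interleaved_path) as src, \
            open(r1_path, "w") as r1, \
            open(r2_path, "w") as r2:
        while True:
            # One read pair = 2 records x 4 lines = 8 lines.
            block = list(islice(src, 8))
            if not block:
                break
            if len(block) != 8:
                raise ValueError(
                    "incomplete read pair at end of file; "
                    "the input may not be interleaved"
                )
            # Both header lines of the pair must start with '@'.
            if not (block[0].startswith("@") and block[4].startswith("@")):
                raise ValueError(
                    f"unexpected FASTQ header in pair {pairs + 1}"
                )
            r1.writelines(block[:4])  # first record goes to the forward file
            r2.writelines(block[4:])  # second record goes to the reverse file
            pairs += 1
    return pairs
```

Calling deinterleave_fastq("SRA.fq", "R1.fq", "R2.fq") would use the same file names as the reformat.sh command. Unlike reformat.sh, this sketch does not check that the read names of each pair match, so it is worth looking at the first few records of R1.fq and R2.fq before handing them to Trinity.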