Hey all, I am trying to run an RNA-seq analysis in CLC, but before I transfer my data to the program I need to split the reads into forward and reverse. The command fastq-dump -I --split-files does this. The problem is that each of my 75 RNA-seq files has its own SRR number. I have tried many times to run the command on all of my RNA-seq reads, but it fails every time. Can you please tell me what I should do? Your help is highly appreciated. Thanks, Mustafa
I have already downloaded the files from NCBI. To transfer the 75 RNA-seq files to CLC, they need to be split into forward and reverse reads. I used the command fastq-dump -I --split-files SRR390728.sra, which works for a single file, in this case SRR390728. My question is: how can I split all 75 RNA-seq files in one go without typing each SRR number? Thanks, Mustafa
@padwalmk has a potential solution for how to do this.
Sorry, I didn't understand this part of your answer: "Just define number of threads available and the output directory. Put this script in your sra file directory and change permission with chmod u+x script." Can you please clarify it for me? Thanks, Mustafa
@padwalmk provided you with code for a bash script. You will need to put that (in a file, say script.sh) in the directory where you downloaded all the .sra files. You should also change the number that follows --threads (16 in the script) to the number of cores you have available on your CPU (if you are using a simple computer locally, that number may be 4 or 8). You will also need the GNU parallel program (search for it and install it if needed). You can then run the bash script by doing something like bash script.sh at the command prompt.
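For anyone reading along without the original script at hand, here is a minimal sketch of the general idea: loop over every .sra file in a directory and run the same fastq-dump -I --split-files command on each, so no SRR number has to be typed by hand. This is not @padwalmk's script. The function name split_all_sra, the default output folder fastq_split, and the FASTQ_DUMP override variable are all made up for illustration, and the loop runs one file at a time rather than in parallel.

```shell
#!/usr/bin/env bash
# Sketch only: split every .sra file in a directory into forward/reverse FASTQ files.
# split_all_sra, fastq_split and FASTQ_DUMP are illustrative names (assumptions),
# not taken from the script discussed in this thread.

split_all_sra() {
    local sra_dir="${1:-.}"                     # directory holding the .sra files
    local out_dir="${2:-fastq_split}"           # where the split FASTQ files go
    local dump_cmd="${FASTQ_DUMP:-fastq-dump}"  # lets you point at a specific binary
    local sra

    mkdir -p "$out_dir" || return 1

    for sra in "$sra_dir"/*.sra; do
        # Skip the literal pattern if the directory has no .sra files
        [ -e "$sra" ] || continue
        echo "Splitting $sra"
        # Same command as for a single file; report a failure and keep going
        "$dump_cmd" -I --split-files --outdir "$out_dir" "$sra" \
            || echo "FAILED: $sra" >&2
    done
}
```

To use it, save the function in script.sh, add a final line such as split_all_sra . fastq_split, and run bash script.sh from the directory containing the .sra files (chmod u+x script.sh is only needed if you want to start it as ./script.sh). If GNU parallel is installed, the loop could probably be swapped for something like parallel -j 4 fastq-dump -I --split-files --outdir fastq_split {} ::: *.sra, where 4 is just a guess at the number of cores on your machine.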