I have raw DNA sequences in multiple files, as listed below.
xxxxxx1.R1.fastq.gz
xxxxxx2.R1.fastq.gz
xxxxxx3.R1.fastq.gz
xxxxxx4.R1.fastq.gz
xxxxxx5.R1.fastq.gz
I can extract the raw DNA reads from a single file and store them in another file using the following command:
gunzip -c in.fastq.gz | awk '(NR%4==2)' > out.seq
Is there a way to extract the DNA reads from all the files and save them in a single text file, instead of processing the files one by one?
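For illustration, here is a minimal sketch of what a multi-file version of the command above might look like. It assumes every input is a gzipped FASTQ file with standard four-line records. The function name `extract_reads` and the output name `all_reads.seq` are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: print the sequence line of every FASTQ record
# from each gzipped file given as an argument.
extract_reads() {
    for f in "$@"; do
        # awk runs once per file, so NR restarts at 1 for each file
        # and a truncated file cannot shift the 4-line framing of the next one.
        gunzip -c "$f" | awk 'NR % 4 == 2'
    done
}
```

It could then be called as `extract_reads *.R1.fastq.gz > all_reads.seq`, which writes the reads from all matching files into one text file in the order the shell expands the glob.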
Also, would it be better to do this in Python or R instead of basic Linux commands? My guess is that a Python approach would not be very efficient.
Umm, why? Why would you want to do this?