I have 20 FASTQ files (paired-end reads) and want to add a unique number to the end of the sequence identifier in each file.
So I want this from genome 1:
simulated.2618103/1
To look like this:
simulated.2618103/1.1
I have an awk command that will do the above:
awk '{ if (NR%4==1) gsub("$",".1",$1); print }' in.fq > renamed_in.fq
I want a way to find all the genome 1-10 files and execute the awk command so that each fastq file gets the unique identifier.
So genome 1 should have .1 at the end of its sequence identifier, genome 2 should have .2 at the end of its sequence identifier, etc.
I have tried this:
find . -name "sub_NC_001539*" -exec awk ' { if (NR%4==1) gsub("$", ".1", $1); print } '
The problem isn't the awk command. I just don't know how to get find to hand each file to awk correctly while keeping the output as paired-end reads.
Thanks
Just a modification to Pierre's answer, since you also need the unique ID inside the FASTQ headers.
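One way such a loop could look, as a minimal sketch and not necessarily the code originally posted. It assumes a hypothetical naming scheme of genome<N>_R1.fq and genome<N>_R2.fq (the real file names in the question start with sub_NC_001539, so the pattern would need adjusting), and the function name add_genome_suffix is made up for illustration. The genome number is taken from the file name, so both mates of a pair get the same suffix and the read order is left untouched.

```shell
# Sketch only: the genome<N>_R1.fq / genome<N>_R2.fq naming is an assumption.
add_genome_suffix() {
    for fq in "$@"; do
        base=$(basename "$fq")
        dir=$(dirname "$fq")
        n=${base#genome}   # drop the leading "genome": genome7_R1.fq -> 7_R1.fq
        n=${n%%_*}         # drop everything from the first "_": 7_R1.fq -> 7
        # FASTQ records are 4 lines long, so headers are lines 1, 5, 9, ...
        # Append ".<n>" to the first field of each header line; print every line.
        awk -v id="$n" 'NR % 4 == 1 { $1 = $1 "." id } { print }' \
            "$fq" > "$dir/renamed_$base"
    done
}
```

It could then be called on all files at once with add_genome_suffix genome*_R[12].fq, which writes a renamed_ copy next to each input instead of overwriting it.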
Thank you. Do you mind explaining the code? I am new to coding and don't quite understand it.