Hi.
I'm trying to concatenate 4 files into one. This is what my raw data looks like:
> S9_L001_R1_001_1P.fq.gz
> S9_L001_R1_001_1U.fq.gz
> S9_L001_R1_001_2P.fq.gz
> S9_L001_R1_001_2U.fq.gz
> S10_L001_R1_001_1P.fq.gz
> S10_L001_R1_001_1U.fq.gz
> S10_L001_R1_001_2P.fq.gz
> S10_L001_R1_001_2U.fq.gz
I have twenty samples (S1-S20), and each sample consists of four files (1P, 1U, 2P and 2U). The code I've come up with, which doesn't work, looks like this:
for i in {1..20}; do
  for j in 1 2; do
    cat S${i}_L001_R1_001_${j}*.fq.gz >S${i}_concatenate.fq.gz
  done
done
It only concatenates two of the four files from each sample.
Any suggestions? Thanks.
I hope there is a reason you are trying to cat these together. Based on the names it looks like these are properly paired and unpaired reads after trimming.
Your code is ignoring the 1P, 1U, 2P and 2U in the names. What order do you want to concatenate those pieces in?
Since I'm not mapping the reads to a reference genome or building my own, I figured I might as well treat them as single-end reads.
I don't think it matters what order I map them in, but I guess 1P-1U-2P-2U.
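Here is a minimal sketch of one way to do this, assuming the file names follow the pattern shown in the question and that 1P-1U-2P-2U is the order wanted. The loop in the question redirects with `>` inside the `j` loop, so the `j=2` pass truncates the output file and overwrites what the `j=1` pass wrote, which is why only two files end up in it. Listing all four files in a single `cat` avoids that. The helper name `concat_sample` is made up for illustration.

```shell
# Concatenate the four trimmed read files of one sample in a fixed order.
# Assumes names of the form S<N>_L001_R1_001_<part>.fq.gz, as in the question.
concat_sample() {
    sample="$1"                          # e.g. S9
    prefix="${sample}_L001_R1_001"

    # Skip the sample if any input is missing, rather than writing a partial file.
    for part in 1P 1U 2P 2U; do
        if [ ! -f "${prefix}_${part}.fq.gz" ]; then
            echo "skipping ${sample}: missing ${prefix}_${part}.fq.gz" >&2
            return 1
        fi
    done

    # One cat, one redirect: the output is opened once and receives all four inputs.
    cat "${prefix}_1P.fq.gz" "${prefix}_1U.fq.gz" \
        "${prefix}_2P.fq.gz" "${prefix}_2U.fq.gz" > "${sample}_concatenate.fq.gz"
}

# Run over samples S1 to S20; samples with missing files are reported and skipped.
i=1
while [ "$i" -le 20 ]; do
    concat_sample "S${i}" || true
    i=$((i + 1))
done
```

Gzip files concatenated this way form a valid multi-member gzip stream, so the result can be read with `gzip -dc` without decompressing the inputs first.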