cat command to get multiple fq reads into single fastq file
5.0 years ago
Bioinfonext ▴ 470

Hi,

file names are like this:

Root_T1_S_R8_S48_unmapped.R1.fq
Root_T1_S_R9_S47_unmapped.R1.fq
Root_T2_F_R2_S25_unmapped.R2.fq
Root_T2_F_R3_S24_unmapped.R2.fq

I need to combine the multiple R1.fq files into one file and the multiple R2.fq files into another, so that I can run Trinity for de novo assembly.

I am a little confused about which command is correct. Should I use the double arrow (>>) or the single arrow (>)?
cat *.R2.fq >>right.fq
cat *.R2.fq > right.fq

cat *.R1.fq > left.fq
cat *.R1.fq >>left.fq

thanks

rna-seq linux • 1.4k views

The S* numbers are added by Illumina's post-processing software, depending on the position of that sample (and its index) in the SampleSheet.csv used for demultiplexing. If these are separate samples, i.e. if R* has any meaning, then you should not be concatenating these files.

If these are replicates of some conditions, use a samples file as shown in the Trinity manual:

--samples_file <string>         tab-delimited text file indicating biological replicate relationships.
 #                                   ex.
 #                                        cond_A    cond_A_rep1    A_rep1_left.fq    A_rep1_right.fq
 #                                        cond_A    cond_A_rep2    A_rep2_left.fq    A_rep2_right.fq
 #                                        cond_B    cond_B_rep1    B_rep1_left.fq    B_rep1_right.fq
 #                                        cond_B    cond_B_rep2    B_rep2_left.fq    B_rep2_right.fq
 #
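
A rough bash sketch of how such rows could be generated from the file names. It assumes names follow the pattern <condition>_<rep>_S<nn>_unmapped.R1.fq (as in Root_T1_S_R8_S48_unmapped.R1.fq) and that each R1 file has a mate with the same name ending in .R2.fq; whether R8 really is a replicate of Root_T1_S is also an assumption. `make_samples_rows` is a made-up helper name, not part of Trinity.

```shell
# Print one tab-delimited samples-file row per R1 file given as an argument.
# ASSUMPTION: names look like <condition>_<rep>_S<nn>_unmapped.R1.fq
make_samples_rows() {
    local r1 r2 sample cond
    for r1 in "$@"; do
        r2=${r1%.R1.fq}.R2.fq                      # mate file: same name, R2 instead of R1
        sample=${r1##*/}                           # drop any directory part
        sample=${sample%_S[0-9]*_unmapped.R1.fq}   # e.g. Root_T1_S_R8
        cond=${sample%_*}                          # e.g. Root_T1_S
        printf '%s\t%s\t%s\t%s\n' "$cond" "$sample" "$r1" "$r2"
    done
}

# Usage (run in the directory holding the reads):
#   make_samples_rows *.R1.fq > samples.txt
```

Check the resulting file by eye before passing it to --samples_file, since the grouping is only as good as the naming assumption.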

Or you could specify multiple files on the command line:

 Trinity --seqType fq --max_memory 50G  \
         --left condA_1.fq.gz,condB_1.fq.gz,condC_1.fq.gz \
         --right condA_2.fq.gz,condB_2.fq.gz,condC_2.fq.gz
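
With many files, typing the comma-separated lists by hand gets tedious. A minimal bash sketch that builds them from a glob; `join_by_comma` is a helper name made up here, not a Trinity feature, and the sketch assumes file names contain no commas.

```shell
# Join all arguments with commas: join_by_comma a b c prints a,b,c
join_by_comma() {
    local IFS=,      # "$*" joins arguments with the first character of IFS
    echo "$*"
}

# Usage (globs expand in sorted order, so R1 and R2 lists line up
# as long as mates differ only in R1/R2):
#   left=$(join_by_comma *.R1.fq)
#   right=$(join_by_comma *.R2.fq)
#   Trinity --seqType fq --max_memory 50G --left "$left" --right "$right"
```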
5.0 years ago
cschu181 ★ 2.8k

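Both are fine in your example since you're calling cat once on all read files. The one difference: >> appends, so if right.fq already exists from an earlier run, the reads get added to it a second time, whereas > always starts from an empty file. If you were to loop over the files instead,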

for f in *.R2.fq; do cat "$f" > right.fq; done

would overwrite right.fq in each iteration, leaving only the last file's reads, whereas using >> would actually concatenate.

An alternative for the loop would be

for f in *.R2.fq; do cat "$f"; done > right.fq

in which case all files would be concatenated.
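
Putting this together, here is a small bash sketch for building both files and checking that they hold the same number of reads. `concat_reads` and `count_reads` are hypothetical helper names, and the read count assumes the usual four-lines-per-record FASTQ layout.

```shell
# Concatenate FASTQ files into one output file.
# Usage: concat_reads OUTPUT INPUT [INPUT...]
concat_reads() {
    local out=$1
    shift
    # A single '>' truncates OUTPUT first, so rerunning never duplicates reads.
    cat -- "$@" > "$out"
}

# Print the number of reads in a FASTQ file (ASSUMPTION: 4 lines per record).
count_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Usage:
#   concat_reads left.fq  *.R1.fq
#   concat_reads right.fq *.R2.fq
#   [ "$(count_reads left.fq)" -eq "$(count_reads right.fq)" ] || echo "read counts differ"
```

Since the output names (left.fq, right.fq) do not match the *.R1.fq or *.R2.fq globs, rerunning the commands never feeds the output back into itself.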


Thank you very much!
