How can I create a for loop in bash to merge each R1 fastq and R2 fastq into a single-end file?
2.1 years ago

Good day everyone,

I have this file set:

1_S1_R1.fastq.gz
1_S1_R2.fastq.gz
2_S2_R1.fastq.gz
2_S2_R2.fastq.gz
3_S3_R1.fastq.gz
3_S3_R2.fastq.gz

I would like to create a for loop ("ciclo for") with the software SeqPrep in which each pair of read files is merged together; for example, 1_S1_R1.fastq.gz + 1_S1_R2.fastq.gz = 1_S1_merged.fastq.gz, and so on for all the files in the directory. Is this possible, or am I just imagining it?

Best regards
Lorenzo

NGS genomics • 955 views

Pure fantasy of a bioinformatics world that does not exist. You basically want to merge paired-end data into single-end; this is probably a crime in bioinformatics.


@OP, do you mean an interleaved file, where the reads alternate R1/R2/R1/R2... in the same file, or do you simply want to take the entire R1 file and concatenate R2 to it (the latter would be, as said, questionable)?

2.1 years ago
GenoMax 147k

ciclo = loop?

Merging can mean different things. What you are asking for (1_S1_R1.fastq.gz + 1_S1_R2.fastq.gz = 1_S1_merged.fastq.gz) is concatenation of the files end to end. Very few packages can use data in this format.
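For reference, a minimal sketch of what that literal end-to-end concatenation looks like. It relies on the fact that gzip files can be appended to each other and still decompress as one stream. The helper name `concat_pair` is hypothetical, and this is shown only to illustrate the operation, not to recommend it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: append the R2 file to the R1 file, end to end.
# Concatenated gzip members decompress as a single stream, so no
# decompression or recompression is needed.
concat_pair() {
    local r1=$1 r2=$2 out=$3
    cat "$r1" "$r2" > "$out"
}
```

Usage would be along the lines of `concat_pair 1_S1_R1.fastq.gz 1_S1_R2.fastq.gz 1_S1_merged.fastq.gz`. The result holds all R1 reads followed by all R2 reads, so the pairing information is no longer usable by most tools.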

A more common approach would be either to interleave the R1/R2 data files (described here: Combine paired-end fastq files) or to actually merge individual reads to create a longer single-read representation using a program like bbmerge.sh or FLASH (see: Insert size provided by bbmerge). Read merging only works if the reads overlap in the middle, i.e. when the number of sequencing cycles is > (insert size/2).
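As for the loop itself, here is a rough sketch of how it could look in bash, assuming the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming from the question and assuming BBTools is installed so that `bbmerge.sh` is on the PATH. The helper names `sample_prefix` and `merge_pairs`, the `DRY_RUN` variable, and the output file names are illustrative choices, not part of any tool. By default the function only prints the commands so they can be checked before anything is run.

```bash
#!/usr/bin/env bash
# Sketch: loop over every R1 file, find its R2 mate, and merge overlapping
# read pairs with bbmerge.sh. Assumes files are named <sample>_R1.fastq.gz
# and <sample>_R2.fastq.gz.

# Hypothetical helper: strip the _R1.fastq.gz suffix to get the sample prefix.
sample_prefix() {
    printf '%s\n' "${1%_R1.fastq.gz}"
}

# Hypothetical helper: process every R1/R2 pair found in a directory.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
merge_pairs() {
    local dir=${1:-.} r1 r2 prefix cmd
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue              # glob matched nothing
        prefix=$(sample_prefix "$r1")
        r2=${prefix}_R2.fastq.gz
        if [ ! -e "$r2" ]; then
            echo "skipping $r1: no matching R2 file" >&2
            continue
        fi
        # out = merged reads; outu1/outu2 = pairs that could not be merged
        cmd=(bbmerge.sh "in1=$r1" "in2=$r2"
             "out=${prefix}_merged.fastq.gz"
             "outu1=${prefix}_unmerged_R1.fastq.gz"
             "outu2=${prefix}_unmerged_R2.fastq.gz")
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "${cmd[*]}"
        else
            "${cmd[@]}"
        fi
    done
}
```

Running `merge_pairs .` in the directory with the fastq files lists the commands; `DRY_RUN=0 merge_pairs .` would actually execute them. The same loop should work with SeqPrep or FLASH by swapping the command line in `cmd`, after checking that tool's own help for the exact options.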
