Hi everyone,
I have a question about merging paired-end read files. I am using FLASH to merge the data, and I ran it as follows:
./flash sample_rep1_R1.fastq sample_rep1_R2.fastq -m 5 -t 5 -o sample_merge 2>&1 | tee flash.log
In sample_merge.extendedFrags.fastq I noticed some lines that contain multiple @ characters mixed in with the quality scores. For example:
> > <B/B-:@@D@D@:-:@-D@-:@D@@D-DDD@D@:@---D--::D:DDD BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFDDDDDDDDDDDDD#::DDDDDDDD@@DDDDDDDDDDDDDDDDDDDDDDDDF#FFFFFFFFFFFFFFFFF<<
> BBBBBF/BFFFBFFFFFDDDD:DDDDDDDDDDDDDDDDD@D:D@DDDDDDD:@D::@@DD@D-D@DDDDDFDB@FFFDF@:::
> BBBBBFFFFFDDD-D@DDDD@-D@-:D:@DDDDD-@DDDDDDD@:D@D-@D-D-@-D-5D@D@FFFFFFFFFF<<<-:7:
> BB@@@DF<FFF<DDDDDDDDDDFDDDDFFFFFDFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<DDDDDDFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBB
> BBBBBFFFFBFFFFB/FFFFFFFF<FFFFFFFFFFFFFFFFFFFFFB<FBFFFBFFBFBBFFFFBD@DDDDDDDD@D#::
> BB@@@DDDDDDDDDD@DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDFFFFF#FFFFFFFF#F#FFFFFFDDF@DD@D@DDDDDDDDDDDDDDDDDDDDDDDDDDDD@DDDDDDDDDDDDFBB
The unmerged files, sample.notCombined_1.fastq and sample.notCombined_2.fastq, do not have these lines.
I am wondering whether these multi-@ lines in extendedFrags.fastq are normal or are related to the parameters I have chosen.
My reads are 2 x 125 bp.
It would be very helpful if someone could guide me.
Thanks
Ankit
Hi, thanks for the reply. Yes, you are right. I checked the distribution of reads in both *.extendedFrags.fastq and *.notCombined_1/_2.fastq, and together they match the read count of the original FASTQ files. My mistake was counting reads by searching for "@", which did not add up properly, so I thought there might be a problem. Now that I count using "@header" instead, everything looks fine.
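For anyone who runs into the same confusion, here is a minimal Python sketch of why counting "@" overcounts. It assumes strict four-line FASTQ records (no wrapped sequence or quality lines) and Phred+33 quality encoding, where "@" is simply the character for Q31. The function names and the toy records are made up for illustration; they are not part of FLASH.

```python
import io


def count_fastq_records(handle):
    """Count records assuming strict 4-line FASTQ (header, sequence, '+', quality)."""
    n_lines = 0
    for i, line in enumerate(handle):
        # Every 4th line (0, 4, 8, ...) must be a header starting with '@'.
        if i % 4 == 0 and not line.startswith("@"):
            raise ValueError(f"line {i + 1}: expected '@' header, got {line[:20]!r}")
        n_lines = i + 1
    if n_lines % 4:
        raise ValueError("line count is not a multiple of 4")
    return n_lines // 4


def count_at_lines(handle):
    """Naive count of lines starting with '@'; quality lines can match too."""
    return sum(1 for line in handle if line.startswith("@"))


# Toy data (hypothetical): the second record's quality line starts with '@'.
toy = (
    "@read1\nACGT\n+\nBBFF\n"
    "@read2\nACGT\n+\n@@DD\n"
)

n_records = count_fastq_records(io.StringIO(toy))
n_at = count_at_lines(io.StringIO(toy))
print(n_records, n_at)  # 2 3
print(ord("@") - 33)    # 31, the Phred score encoded by '@' under Phred+33
```

On real data, the same functions can be given open file handles for the FLASH outputs; under the assumptions above, the record count of the merged file plus that of one notCombined file should equal the number of input pairs.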
Could you also suggest appropriate values for -m and -M for 125 bp reads? I am currently using -m 5 and the default -M (65).
Thanks for the quick help.