Hi all,
I used bamToFastq to convert my BAM files into R1 and R2 FASTQ files. I tried samtools fastq before but found it was erroneous (so many missing sequences). However, I still have an issue with bamToFastq from bedtools. When I add the read counts obtained from grep -c "@" R1.fastq and grep -c "@" R2.fastq, the total is always slightly less than the count from the BAM file (samtools view -c in.bamfile). Why might this be the case? I haven't found any documentation to suggest that this is normal. The R1 and R2 FASTQ counts should equal the count in the BAM file, so what am I doing wrong with the conversion?
Thank you.
@ is a valid quality encoding character in FASTQ QUAL lines, so you may want to use the simpler line_count / 4 formula to count the number of reads in the FASTQ. Or simply add whatever follows @ in your grep, e.g. @A00500 (generally the sequencer serial). That will only count line 1 from each FASTQ record.
Yea I tried both ways, same thing. Just fewer reads in my fastq files than in my bamfiles.
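For anyone comparing counts the same way, here is a minimal sketch of the two FASTQ counting approaches suggested above, plus one possible check on the BAM side. The function names are made up for illustration, and the FASTQ functions assume unwrapped four-line records. The BAM check rests on an assumption not confirmed in this thread: samtools view -c counts every alignment record, including secondary (0x100) and supplementary (0x800) alignments, so filtering those out with -F 0x900 may give a number closer to the FASTQ totals.

```shell
# Count FASTQ records as line_count / 4.
# Assumes every record is exactly four lines (no wrapped sequences).
count_fastq_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Count only header lines by anchoring the pattern at the start of the line.
# $2 is the header prefix, e.g. "@A00500" from the answer above; use your own.
count_fastq_headers() {
    grep -c "^$2" "$1"
}

# Count BAM records excluding secondary and supplementary alignments.
# Whether this explains the gap for a given file is an assumption to verify.
count_bam_primary() {
    samtools view -c -F 0x900 "$1"
}
```

Usage would be something like count_fastq_reads R1.fastq, count_fastq_headers R1.fastq "@A00500", and count_bam_primary in.bamfile. If the filtered BAM count still exceeds R1 + R2, the remainder may be reads whose mate is absent from the file, which is worth checking separately.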