10.1 years ago
devinliao0918
I have some whole-genome sequencing (WGS) data with ~6x coverage and some whole-exome sequencing (WES) data with ~60x coverage. Both were generated on the same sequencing platform. However, I found that the WGS data contain a larger share of low Phred scores than the WES data. Could anyone tell me whether this is caused by the difference in coverage? That is, does deeper sequencing lead to more high-quality reads?
I am talking about the base-call Phred scores. All the scores were extracted from pileup files generated by "samtools pileup". I didn't call variants.
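For anyone who wants to reproduce the comparison, here is a minimal sketch of how the base qualities could be pulled out of a pileup line and summarized. It assumes the plain six-column pileup layout (chromosome, position, reference base, depth, read bases, base qualities) and Phred+33 encoding; the function names and the threshold of 20 are only illustrative and are not part of samtools.

```python
def decode_phred33(qual_string):
    """Convert an ASCII quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - 33 for ch in qual_string]


def error_probability(q):
    """Base-call error probability implied by Phred score q: P = 10^(-q/10)."""
    return 10 ** (-q / 10)


def pileup_qualities(line):
    """Return the decoded base qualities from one pileup line.

    Assumption: tab-separated columns are chrom, pos, ref, depth, bases, quals.
    """
    fields = line.rstrip("\n").split("\t")
    return decode_phred33(fields[5])


def fraction_below(scores, threshold=20):
    """Fraction of scores strictly below the threshold (0.0 for an empty list)."""
    if not scores:
        return 0.0
    return sum(1 for q in scores if q < threshold) / len(scores)


if __name__ == "__main__":
    # Hypothetical pileup line, made up for illustration only.
    example = "chr1\t100\tA\t3\t..,\tI5#"
    quals = pileup_qualities(example)
    print(quals)
    print(fraction_below(quals))
```

Running the same summary over the WGS and WES pileups separately gives the fraction of low-quality base calls in each, which is the quantity being compared in this question.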
Also, the base-quality scores appear to be truncated at 41, because there is no Phred score greater than 41 in the WGS data. The wiki page for FASTQ format says the following
Phred scores are unlikely to be affected by WGS vs. exome seq.
I agree with you. I guess the difference may be caused by different technologies, though I know both the WES and the WGS were done on an Illumina HiSeq 2000.
I agree, this is more likely due to the library prep or differences in the source DNA.