WGS sequence composition bias at read start

9 weeks ago

Mega • 0

Hello, I recently received human WGS data, but I do not know anything about the library prep or the sequencer used. I checked the FASTQ files: the per-base sequence quality is good, and there are no over-represented sequences that would indicate the presence of adapters. However, I've noticed that the base composition across the first 12 nucleotides of the reads is not uniform.

[Figure: per base sequence content]
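For anyone who wants to reproduce this kind of check outside a QC report, here is a minimal sketch in Python that tallies base fractions at each of the first 12 read positions. It assumes standard 4-line FASTQ records (no wrapped sequence lines); the function names `open_fastq`, `base_composition`, and `print_composition` are hypothetical helpers for illustration, not part of any existing tool.

```python
import gzip
from collections import Counter
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def base_composition(lines, n_positions=12, max_reads=100_000):
    """Return one dict per read position mapping each base to its fraction.

    `lines` is any iterable of FASTQ lines. Assumption: every record is
    exactly 4 lines, so sequences sit on lines 2, 6, 10, ...
    """
    counts = [Counter() for _ in range(n_positions)]
    sequences = islice(lines, 1, None, 4)  # every 4th line, starting at the 2nd
    for seq in islice(sequences, max_reads):
        seq = seq.strip().upper()
        for i, base in enumerate(seq[:n_positions]):
            counts[i][base] += 1
    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append({b: (c[b] / total if total else 0.0) for b in "ACGTN"})
    return fractions


def print_composition(path, n_positions=12):
    """Print per-position A/C/G/T fractions for the start of the reads."""
    with open_fastq(path) as fh:
        for pos, frac in enumerate(base_composition(fh, n_positions), start=1):
            print(pos, " ".join(f"{b}={frac[b]:.3f}" for b in "ACGT"))
```

Calling `print_composition("reads.fastq.gz")` (the file name is a placeholder) prints one line per position. With no bias at the read start, the fractions at positions 1 to 12 should look like those further into the read; a consistent skew confined to the first positions matches the pattern described above.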

A bit of googling led me to this random hexamer priming issue that arises when dealing with RNA-Seq data. Is it the same thing that I am observing here with my WGS data? And if so, how common is it to use this approach for WGS?

Thanks

WGS • 348 views

updated 8 weeks ago by GenoMax 148k • written 9 weeks ago by Mega • 0

This is likely due to sequence bias in the tagmentation reaction used during library prep. It should not cause any major issues.

9 weeks ago by GenoMax 148k

Thanks @genomax. I am not really worried about the potential consequences; after all, they are real sequences. I am just trying to understand where this bias comes from, out of curiosity. So if the issue arises at the tagmentation step, is it more likely due to:

enzymes having a sequence preference when cutting fragments?
adapters struggling to ligate to specific DNA sequences?

Thanks

8 weeks ago by Mega • 0

Most likely sequence preference of the enzymes, since we would have no way of knowing whether the latter is happening.

8 weeks ago by GenoMax 148k
