Hi all, I need to detect DNA sampling bias. We extracted DNA using two different library preps and then multiplexed the samples.
I think sampling bias would show up as reads being more likely to start at a position with a certain GC content or subsequence. So my plan is to assemble the reads (in my case with Velvet, using the AFG output file) and analyze the assembly. After that, I am not exactly sure what to do. I could look at places with low depth, e.g., at contig breaks, and see what the first 10 bp are at each place.
Another way might be to look at the first 10 bp of each read and see if there is any bias there too. Gibbs sampling, %GC, some kind of k-mer analysis?
But anyway, what is a good standard or common-sense way to do this? Thank you BioStar community!
as well as nucleotide distributions along each position of the reads in your sample ...
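To make the per-position idea concrete, here is a rough sketch in plain Python that tabulates base frequencies at the first few positions of each read and counts the most common read-start k-mers. It assumes uncompressed, 4-line-per-record FASTQ input; the function names and the toy reads are made up for illustration and are not from any particular tool.

```python
from collections import Counter

def read_fastq_sequences(lines):
    """Yield the sequence line of each record (assumes strict 4-line FASTQ)."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip().upper()

def per_position_composition(seqs, n_positions=10):
    """Return one dict per position giving the fraction of A/C/G/T/N seen there."""
    counts = [Counter() for _ in range(n_positions)]
    for seq in seqs:
        for pos, base in enumerate(seq[:n_positions]):
            counts[pos][base] += 1
    table = []
    for c in counts:
        total = sum(c.values())
        table.append({b: (c[b] / total if total else 0.0) for b in "ACGTN"})
    return table

def start_kmer_counts(seqs, k=10):
    """Count the first k bases of every read that is at least k long."""
    return Counter(seq[:k] for seq in seqs if len(seq) >= k)

# Toy data for illustration only; in practice pass an open file handle,
# e.g. open("prep_A.fastq"), once per library prep (hypothetical file name).
toy_fastq = """@r1
ACGTACGTAC
+
IIIIIIIIII
@r2
ACGTTTTTTT
+
IIIIIIIIII
@r3
GCGTACGTAC
+
IIIIIIIIII
@r4
ACCTACGTAC
+
IIIIIIIIII
"""

toy_seqs = list(read_fastq_sequences(toy_fastq.splitlines()))
toy_comp = per_position_composition(toy_seqs, n_positions=3)
toy_kmers = start_kmer_counts(toy_seqs, k=4)

for pos, freqs in enumerate(toy_comp, start=1):
    print(pos, {b: round(f, 2) for b, f in freqs.items()})
print(toy_kmers.most_common(3))
```

Running this separately on the demultiplexed reads from each library prep and comparing the two tables is probably the simplest check: without start-site bias, the frequencies at every position should sit close to the overall base composition of the sample, so a strong skew in the first few positions of one prep but not the other points to a prep-specific bias. FastQC's "Per base sequence content" module reports the same kind of per-position plot if you would rather not script it.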