Hi,
I want to take a high-coverage alignment (e.g., 90x) that has a lot of variation in coverage and down-sample it to 20x to reduce that variation (i.e., set a maximum coverage threshold and cut all the "peaks" above it in the alignment).
Ideally, the removal of reads or bases from the alignment should be random, so that it doesn't affect heterozygosity too much.
I've seen similar questions here, but the solutions usually down-sample the average coverage without reducing the heterogeneity in coverage (e.g., samtools view -s).
The closest solution I've found is this: Downsampling By Having A Ceiling On Number Of Reads At Each Site. But maybe there's a better way, or a ready-made solution/tool out there?
Thank you,
Bruno Vieira
"maybe there's a better way": I don't think there's a better way, just because I wrote this tool :-P
I'm trying to accomplish the same goal. Pierre, are you saying you have written a tool? That would be great if you have. I looked through your blog and GitHub, but nothing jumped out at me as a tool to enforce a max depth on a BAM. Suggestions? Thanks!
Yes, sorry, my answer was not clear at all. I thought the link was pointing to this one: Capping coverage in bam file, and to my answer there: A: Capping coverage in bam file
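
For anyone who just wants to see the idea, here is a minimal Python sketch of capping coverage by visiting reads in random order and keeping a read only if every position it covers is still below the cap. This is an illustration only, not the method used by the tool linked above. The function name `cap_coverage`, the representation of reads as 0-based half-open `(start, end)` intervals on a single reference, and the `seed` parameter are all assumptions made for the example; a real script would read the intervals from a BAM and write the kept reads back out.

```python
import random


def cap_coverage(reads, ref_length, max_depth, seed=0):
    """Randomly keep reads so that no position exceeds max_depth.

    reads      -- list of (start, end) tuples, 0-based, half-open
                  (hypothetical input format for this sketch)
    ref_length -- length of the reference sequence
    max_depth  -- coverage ceiling to enforce
    seed       -- seed for the random read order (reproducibility)

    Returns (kept_indices, depth): the sorted indices of the kept reads
    and the per-position depth after capping.
    """
    depth = [0] * ref_length

    # Visit reads in random order so the choice of which reads survive
    # in a peak does not depend on their order in the file.
    order = list(range(len(reads)))
    random.Random(seed).shuffle(order)

    kept = []
    for i in order:
        start, end = reads[i]
        # Clamp to the reference and skip empty intervals.
        start = max(start, 0)
        end = min(end, ref_length)
        if start >= end:
            continue
        # Keep the read only if it fits under the cap everywhere it aligns.
        if max(depth[start:end]) < max_depth:
            for pos in range(start, end):
                depth[pos] += 1
            kept.append(i)

    return sorted(kept), depth


if __name__ == "__main__":
    # A 50x "peak" over positions 0-9 and a 5x region over positions 20-29.
    reads = [(0, 10)] * 50 + [(20, 30)] * 5
    kept, depth = cap_coverage(reads, ref_length=40, max_depth=20)
    print(len(kept), max(depth), depth[25])
```

One caveat of this approach: a read that spans both a peak and its flank is dropped as a whole, so coverage just outside a peak can end up somewhat lower than before. Regions already below the cap and not overlapping a peak are left untouched.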
Thank you - I look forward to trying this!