7.6 years ago
Phoe
Hello, I'm looking for recommendations for DNA sequence simulation tools, like "dwgsim", that can output a VCF file recording the sites where artificial variants were introduced.
Thank you!
Can you define "artificial variant"?
Basically, I would like the artificial variants to be SNPs. For example, if I want the reads to diverge from the input file (the reference sequence) by about 1%, with each read 80-150 bp long, then about 1% of the sites in each read would be SNPs. I hope the simulator can also provide a "true location report" of the SNPs it introduces.
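To make the request concrete, here is a minimal Python sketch of what such a simulator would do: introduce SNPs into a reference at a given rate, keep a truth list of where they went, and write that list as minimal VCF lines. It is not how dwgsim or any other tool works internally. The function names (`simulate_snps`, `to_vcf`, `sample_read`), the contig name `ref`, and the uniform per-base SNP model are all assumptions made for illustration.

```python
import random

BASES = "ACGT"

def simulate_snps(reference, rate=0.01, seed=0):
    """Substitute each A/C/G/T base with probability `rate`.

    Returns the mutated sequence and a truth list of
    (1-based position, ref base, alt base) tuples.
    """
    rng = random.Random(seed)
    mutated = list(reference)
    truth = []
    for pos, ref_base in enumerate(reference):
        if ref_base not in BASES:
            continue  # skip N and other ambiguity codes
        if rng.random() < rate:
            # Pick an alternative base that differs from the reference base.
            alt = rng.choice([b for b in BASES if b != ref_base])
            mutated[pos] = alt
            truth.append((pos + 1, ref_base, alt))
    return "".join(mutated), truth

def to_vcf(truth, chrom="ref"):
    """Format the truth list as a minimal VCF (hypothetical contig name)."""
    lines = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    for pos, ref_base, alt in truth:
        lines.append(f"{chrom}\t{pos}\t.\t{ref_base}\t{alt}\t.\t.\t.")
    return "\n".join(lines) + "\n"

def sample_read(mutated, rng, min_len=80, max_len=150):
    """Draw one error-free read of 80-150 bp from the mutated sequence.

    Returns the 1-based start position and the read sequence.
    """
    length = min(rng.randint(min_len, max_len), len(mutated))
    start = rng.randrange(0, len(mutated) - length + 1)
    return start + 1, mutated[start:start + length]

if __name__ == "__main__":
    reference = "ACGT" * 500  # toy 2,000 bp reference
    mutated, truth = simulate_snps(reference, rate=0.01, seed=1)
    print(to_vcf(truth), end="")
    print(sample_read(mutated, random.Random(2)))
```

Because SNPs are placed on the reference before reads are drawn, the 1% rate holds on average across reads rather than exactly within each read, and the sketch adds no sequencing errors on top.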
You could use BBMap's randomreads.sh tool. 100% of the variants it induces are artificial, so it's pretty easy to figure out which is which :) All of them come from simulated read error, and none of them come from simulated genomic mutations.
Thank you, Brian. :)