Is there any useful documentation for wgsim? A tutorial? Anything? I have been stuck on it for the last two weeks and I'm really not sure how to use it, but I need it for a project of mine. For example, I downloaded a genome from NCBI and then called wgsim like this:
./wgsim -e 0 -s 0 -N 1000 -1 30 -2 30 -r 0 -R 0 -X 0 -A 0 test_genome_one_row.fa read1.fa read2.fa
With this, I would expect every read to be identical to some part of the genome, since I set all of the error parameters to 0. But somehow I get reads with mutations (or something else, because they don't occur in the original genome). What is going on here? Can somebody please explain wgsim's arguments and how I can really control its behaviour? Thanks!
Yes, I'm checking that with a Python script which, for one genome, tells me that 450 of 1000 reads have mutations (they don't occur in the original string). Yes, I obtained it from the repo. I'm just confused. Thanks for the help!
You can easily verify the reads by mapping them against the same genome. The alignment will tell you where and how each read aligns, and the position usually matches the one in the read name exactly.
I know. I just wanted to save the time it takes to analyse alignments. I mean, if I tell wgsim that I want all errors to be zero, then a simple Python script can just check whether each read is a substring of the input string, which is what I'm doing right now. The program is basically of no use to me if I can't make it behave the way I need.
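One thing worth ruling out before blaming mutations: wgsim simulates paired-end reads, so some reads are reported as the reverse complement of the reference, and a plain forward-strand substring test would flag those as mismatches. Whether that accounts for the 450/1000 figure here is an assumption, nothing in this thread confirms it. Below is a minimal sketch of a substring check that looks at both strands. The function names (`reverse_complement`, `classify_read`, `fastq_sequences`, `summarize`) are made up for illustration, and the parser assumes the output is 4-line FASTQ records, whatever extension the files were given.

```python
from collections import Counter

# Translation table for complementing bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify_read(read, genome):
    """Classify a read as 'forward', 'reverse' or 'unmatched'.

    'genome' is expected to be upper case already; the read is
    upper-cased here so that case differences do not cause misses.
    """
    read = read.upper()
    if read in genome:
        return "forward"
    if reverse_complement(read) in genome:
        return "reverse"
    return "unmatched"


def fastq_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ record.

    Assumes strict 4-line records (header, sequence, '+', qualities).
    """
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def summarize(lines, genome):
    """Count how many reads match each strand or neither."""
    genome = genome.upper()
    return Counter(classify_read(seq, genome) for seq in fastq_sequences(lines))
```

To use it, read the reference into a single upper-case string (header line dropped, sequence lines joined) and pass an open file handle for each read file to `summarize`. If nearly all of the previously "mutated" reads land in the `reverse` bucket, strand orientation is the explanation; any that remain `unmatched` are the ones worth aligning and inspecting.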