Question

Wgsim Mutations In Output After Setting Everything To 0

1

12.3 years ago

darxsys ▴ 240

Is there any useful documentation for wgsim, such as a tutorial? I have been stuck on it for the last two weeks and am not sure how to use it, but I need it for a project of mine. For example, I downloaded a genome from NCBI and called wgsim like this:

./wgsim -e 0 -s 0 -N 1000 -1 30 -2 30 -r 0 -R 0 -X 0 -A 0 test_genome_one_row.fa read1.fa read2.fa

With this, I would expect every read to be identical to some part of the genome, since I set all the error parameters to 0. Yet I get reads with mutations (or something else, because they don't occur in the original genome). What is going on here? Can somebody please explain wgsim's arguments and how I can really control its behaviour? Thanks!
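For reference, here is a rough annotation of the flags in the command above, written as a small Python sketch. The descriptions paraphrase wgsim's usage message as best recalled, so treat them as assumptions and confirm them by running wgsim with no arguments. `WGSIM_FLAGS` and `build_wgsim_command` are made-up names for illustration only.

```python
import shlex

# Flag -> (value used in the question, meaning).
# Meanings are paraphrased from wgsim's usage text (assumption: verify
# against the usage message printed by your own build).
WGSIM_FLAGS = {
    "-e": ("0", "base (sequencing) error rate"),
    "-s": ("0", "standard deviation of the outer distance, not an error rate"),
    "-N": ("1000", "number of read pairs"),
    "-1": ("30", "length of the first read"),
    "-2": ("30", "length of the second read"),
    "-r": ("0", "mutation rate"),
    "-R": ("0", "fraction of mutations that are indels"),
    "-X": ("0", "probability that an indel is extended"),
    "-A": ("0", "discard a pair if its fraction of ambiguous bases exceeds this"),
}


def build_wgsim_command(reference, out1, out2, binary="./wgsim"):
    """Return the argument list for the wgsim call shown in the question."""
    cmd = [binary]
    for flag, (value, _meaning) in WGSIM_FLAGS.items():
        cmd += [flag, value]
    return cmd + [reference, out1, out2]


if __name__ == "__main__":
    # Print each flag with its meaning, then the assembled command line.
    for flag, (value, meaning) in WGSIM_FLAGS.items():
        print(f"{flag} {value:>5}  {meaning}")
    print(shlex.join(build_wgsim_command(
        "test_genome_one_row.fa", "read1.fa", "read2.fa")))
```

Of these, only `-e`, `-r`, `-R` and `-X` should change the bases in a read. The remaining flags control how many reads are produced, how long they are, and how far apart the two mates are.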

paired-end • 3.4k views

updated 12.3 years ago by Istvan Albert 102k • written 12.3 years ago by darxsys ▴ 240

score 1 · Answer 1 · 2013-04-05

1

12.3 years ago

Istvan Albert 102k

When you simulate with wgsim, the read name contains the genomic coordinates that were used to produce the read. Check those to find where each read came from.
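A minimal sketch of pulling those coordinates out of a read name. It assumes names follow a layout like `contig_start_end_e1:s1:i1_e2:s2:i2_counter/mate`, which is an assumption about wgsim's output and should be checked against your own files. The example name in the code is made up, and `parse_wgsim_name` is a hypothetical helper, not part of wgsim.

```python
from typing import NamedTuple


class WgsimOrigin(NamedTuple):
    contig: str
    start: int  # assumed to be the fragment start encoded in the name
    end: int    # assumed to be the fragment end encoded in the name
    mate: int   # 1 or 2, taken from the "/1" or "/2" suffix


def parse_wgsim_name(name: str) -> WgsimOrigin:
    """Parse a read name of the assumed wgsim layout into its origin."""
    body, _, mate = name.lstrip("@").rpartition("/")
    # Split from the right so underscores inside the contig name survive.
    contig, start, end, _counts1, _counts2, _counter = body.rsplit("_", 5)
    return WgsimOrigin(contig, int(start), int(end), int(mate))


if __name__ == "__main__":
    # Made-up read name, for illustration only.
    print(parse_wgsim_name("@NC_000913.3_1200_1700_0:0:0_0:0:0_1f/1"))
```

Splitting from the right is a deliberate choice, because accession-style contig names often contain underscores themselves.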

Note that the program is distributed with samtools but can also be obtained separately from its own repository:

https://github.com/lh3/wgsim

The latter contains more features and is more up to date.

0

Yes, I'm checking that with a Python script which, for one genome, tells me that 450 of 1000 reads contain mutations (they don't occur in the original string). And yes, I obtained it from the repo. I'm just confused. Thanks for the help!

12.3 years ago by darxsys ▴ 240

0

You can easily verify the reads by mapping them against the same genome. The alignment will tell you where and how each read aligns, and it usually matches the name exactly.

12.3 years ago by Istvan Albert 102k

0

I know. I just wanted to save the time of analysing alignments. I mean, if I tell wgsim that I want all errors to be zero, then a simple Python script can just verify whether each read is part of the input string or not, which is what I'm doing right now. This program is basically of no use to me if I can't make it behave the way I need.

12.3 years ago by darxsys ▴ 240
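One thing worth ruling out before blaming the error settings, offered as a possibility and not something established in this thread: paired-end simulators generally emit one mate of each pair from the reverse strand, so a plain substring test would flag those reads even when they are error-free. Below is a minimal sketch of a check that also tries the reverse complement. It assumes the output files are four-line FASTQ records (despite the `.fa` names in the question) and that the genome fits in memory as one string. All function names are illustrative.

```python
# Translation table for complementing DNA, including N and lowercase.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify_read(read: str, genome: str) -> str:
    """Report whether a read occurs on the forward strand, the reverse
    strand, or on neither strand of the genome."""
    if read in genome:
        return "forward"
    if reverse_complement(read) in genome:
        return "reverse"
    return "absent"


def iter_fastq_sequences(handle):
    """Yield the sequence line of each record (assumes 4-line FASTQ)."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


if __name__ == "__main__":
    # Tiny made-up genome and reads, for illustration only.
    genome = "ACGTTGCAAGGCTTAACCGG"
    for read in ("TTGCAAGG", "AGCCTTGC", "AAAAAAAA"):
        print(read, classify_read(read, genome))
```

If most of the unmatched reads turn out to be "reverse" and none are "absent", the simulator is behaving as requested and only the check needed adjusting.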