Read Generator With Clear Annotation Of Read Origin And Mutations
13.4 years ago
Travis ★ 2.8k

Hi all,

Is there an NGS read simulator that outputs each read along with its chromosome and coordinates of origin, plus the position and type of every mutation or error introduced?

I want to assess some aligners, so the more thoroughly the reads are annotated in terms of their composition, the better!

short next-gen sequencing simulation alignment • 3.2k views
13.4 years ago
Jts ★ 1.4k

John St John's SimSeq program outputs the sampled reads as a SAM file, which would provide you with the position and orientation of each read. I don't think it tracks the position of introduced errors, but it would not be too difficult to change the code to write this information as an MD tag.

https://github.com/jstjohn/SimSeq
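
As a rough illustration of what "position and orientation" means in a SAM record, here is a minimal Python sketch that pulls the origin of a read out of one alignment line. It relies only on the standard SAM columns (QNAME, FLAG, RNAME, POS) and the 0x10 reverse-strand flag bit; the function name `read_origin` and the example record are made up for illustration and are not taken from SimSeq.

```python
def read_origin(sam_line):
    """Return (read name, reference name, 1-based leftmost position, strand)
    for a single SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    qname = fields[0]        # column 1: read name
    flag = int(fields[1])    # column 2: bitwise FLAG
    rname = fields[2]        # column 3: reference sequence name
    pos = int(fields[3])     # column 4: 1-based leftmost mapping position
    # FLAG bit 0x10 is set when the read is reverse-complemented
    strand = "-" if flag & 0x10 else "+"
    return qname, rname, pos, strand


# Hypothetical record, not real SimSeq output
example = "read1\t16\tchr1\t1000\t255\t5M\t*\t0\t0\tACGTA\tIIIII"
print(read_origin(example))  # ('read1', 'chr1', 1000, '-')
```

Comparing these values with what an aligner reports for the same read name is one simple way to score the aligner.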


Well, if such software doesn't exist, SimSeq would be a good place to start. Alternatively, you could run samtools calmd on the SimSeq output with the original reference to fill in the MD tag.
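
Once the MD tag is filled in, the mismatch and deletion positions can be read back out of it. The sketch below is a minimal Python parser for an MD value, assuming the format described in the SAM optional-fields specification (match lengths, single mismatched reference bases, and `^`-prefixed deleted bases). The name `parse_md` and the example string are hypothetical. Note that MD does not record insertions or clipping (those live in the CIGAR string), and it cannot tell a simulated mutation from a simulated sequencing error.

```python
import re

# One token is a match length, a deletion (^ plus bases), or one mismatched base
MD_TOKEN = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def parse_md(md):
    """Return a list of (ref_offset, kind, ref_bases) events from an MD value.

    ref_offset is the 0-based offset on the reference, counted from the
    alignment start position.
    """
    events = []
    offset = 0
    for match_len, deletion, mismatch in MD_TOKEN.findall(md):
        if match_len:
            # Run of bases identical to the reference
            offset += int(match_len)
        elif deletion:
            # Reference bases missing from the read
            bases = deletion[1:]
            events.append((offset, "deletion", bases))
            offset += len(bases)
        else:
            # Single reference base that differs from the read
            events.append((offset, "mismatch", mismatch))
            offset += 1
    return events


print(parse_md("10A5^AC6"))
# [(10, 'mismatch', 'A'), (16, 'deletion', 'AC')]
```

Adding the SAM POS value to each offset gives the reference coordinate of the event.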


My real motive here is to find out whether software exists to perform the task at hand without alteration/rewriting.

13.4 years ago

The samtools package contains a short read simulator named wgsim:

Program: wgsim (short read simulator)
Version: 0.2.3
Contact: Heng Li <lh3@sanger.ac.uk>

Usage:   wgsim [options] <in.ref.fa> <out.read1.fq> <out.read2.fq>
(...)

It generates some short reads and a pileup file containing the mutations.


It does not appear to give the mapping location of the reads, though, or to associate each read with its mutations.


It does not appear to associate each read with its mutations, though.
