Hi,
I want to generate reads from a set of different haplotype sequences that occur at different frequencies within a sample. I was wondering whether any Biostar users have experience of doing this and which tools they used. I would prefer to use an existing tool, but there are quite a few out there (I have compiled some in the table below), and I am wondering which one would suit my purpose best.
Any advice is very welcome - thanks.
Joseph
| Name | Reference | Single-end | Paired-end | Insert size between reads | Quality score | Customise read length | Coverage bias | Phred score from existing FASTQ | Simulated sequencing errors | Insertion-deletion errors | Source | Platforms |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ArtificialFastqGenerator | doi: 10.1371/journal.pone.0049110 | - | yes | yes | yes | yes | yes, based on GC | yes | yes | - | https://sourceforge.net/projects/artfastqgen/ | Illumina |
| ART | doi: 10.1093/bioinformatics/btr708 | yes | yes | - | no | yes | - | yes | yes | yes | - | Roche 454, Illumina Solexa, Applied Biosystems SOLiD |
| WgSim | doi: 10.1093/bioinformatics/btp352 | - | - | - | dummy quality scores | - | - | - | uniform distribution error | yes, by simulating INDEL polymorphisms | - | - |
| Mason | Holtgrewe M (2010) Mason - a read simulator for second generation sequencing data. Technical report, FU Berlin. | yes | yes | - | yes | - | - | no | yes | random | - | 454, Illumina, Sanger |
| SimSeq | Available: https://github.com/jstjohn/SimSeq. Accessed 2012 October 10th. | yes | yes | yes | - | <100bp | - | - | - | - | - | Illumina |
| pIRS | doi: 10.1093/bioinformatics/bts187 | - | yes | - | empirical model based on read cycle | - | yes, based on GC | - | empirical model based on read cycle | yes, with additional tool | - | - |
| Stampy | doi: 10.1101/gr.111120.110 | - | - | - | - | - | - | - | - | - | - | - |
| MetaSim | doi: 10.1371/journal.pone.0003373 | yes | - | - | no | - | - | - | - | - | - | - |
| FlowSim | doi: 10.1093/bioinformatics/btq365 | yes | - | - | yes | - | - | - | - | - | yes | 454 |
| simNGS | http://www.ebi.ac.uk/goldman-srv/simNGS/ | - | - | - | - | - | - | - | - | - | - | - |
| Grinder | doi: 10.1093/nar/gks251 | - | - | - | - | - | - | - | - | - | - | - |
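
Whichever simulator is chosen, one tool-agnostic way to get a mixed sample is to simulate each haplotype separately and give each one a share of the reads proportional to its frequency, then combine the resulting FASTQ files. The sketch below only covers the bookkeeping step, i.e. how many reads to request per haplotype. It is not taken from any of the tools listed: the function names, the haplotype names (`hapA`, `hapB`, `hapC`) and the example numbers are all hypothetical, and it assumes the simulator lets you set a read count (or an equivalent per-haplotype coverage).

```python
import math


def total_reads_for_coverage(genome_length, read_length, coverage):
    """Number of reads needed to reach a target mean coverage (hypothetical helper)."""
    return math.ceil(genome_length * coverage / read_length)


def reads_per_haplotype(weights, total_reads):
    """Split total_reads across haplotypes in proportion to their weights.

    weights: dict of haplotype name -> relative frequency (proportions or
             percentages; they are normalised here, so they need not sum to 1).
    Uses largest-remainder rounding so the counts always sum to total_reads.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    # Exact (fractional) share for each haplotype, then round down.
    exact = {name: total_reads * w / total_weight for name, w in weights.items()}
    counts = {name: math.floor(x) for name, x in exact.items()}

    # Hand the reads lost to rounding to the largest fractional parts.
    leftover = total_reads - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample: three haplotypes at 60%, 30% and 10%.
    weights = {"hapA": 60, "hapB": 30, "hapC": 10}
    n_reads = total_reads_for_coverage(genome_length=10_000, read_length=100, coverage=50)
    print(reads_per_haplotype(weights, n_reads))
```

Each count would then be passed to a separate simulator run on the corresponding haplotype FASTA. For paired-end data, treat the counts as read pairs (or halve them) depending on how the chosen tool defines its read-number option.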
Don't you think that ArtificialFastqGenerator excels over ART? How long did it take you to run ArtificialFastqGenerator? What was the specification of the computer that ran it?