Hi.
I am looking for a program that produces simulated Illumina GAII reads from a FASTA file of the genome and a set of parameters, so that I can test a method I am developing. The features I need, in descending order of importance, are:
- Uses a fasta file as input (or something easily produced from a fasta file)
- Outputs random reads (or simulates, as far as possible, any known bias of the GAII)
- Is written in (Bio)Perl, Python, ISO C/C++ or another fully open-source platform
- Produces errors (like the GAII) at a known, ideally tunable, rate
- Allows me to specify depth of coverage and length of reads
- Produces qseq or similar as output
- Can produce paired-end reads
Methods papers often include some analysis of 'synthetic' or simulated data, but the authors usually don't publish the program used to produce it. I guess I could write a "quick and dirty" program to do it, but I'd rather not reinvent the wheel if something is already available.
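For concreteness, here is a minimal sketch of what such a quick and dirty program might look like. It is only an illustration under strong assumptions: read pairs are sampled uniformly along a single sequence, the insert size is fixed, errors are substitutions at one flat per-base rate (not a real GAII error profile, which depends on position in the read), and output is FASTQ with a constant placeholder quality string rather than qseq. All function and parameter names are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def parse_fasta(lines):
    """Return {name: sequence} from an iterable of FASTA lines."""
    seqs, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep the ID, drop the description
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line.upper())
    return {n: "".join(parts) for n, parts in seqs.items()}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def add_errors(seq, error_rate, rng):
    """Substitute each base independently with probability error_rate."""
    out = []
    for base in seq:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_pairs(genome, read_len=36, coverage=10.0, error_rate=0.01,
                   insert_size=200, seed=None):
    """Yield (read1, read2) pairs sampled uniformly from one sequence."""
    if not read_len <= insert_size <= len(genome):
        raise ValueError("need read_len <= insert_size <= genome length")
    rng = random.Random(seed)
    # Each pair contributes 2 * read_len bases toward the target coverage.
    n_pairs = int(coverage * len(genome) / (2 * read_len))
    for _ in range(n_pairs):
        start = rng.randrange(len(genome) - insert_size + 1)
        fragment = genome[start:start + insert_size]
        if rng.random() < 0.5:  # pick a strand at random
            fragment = revcomp(fragment)
        # Read 1 comes from one end of the fragment, read 2 from the
        # other end on the opposite strand.
        read1 = add_errors(fragment[:read_len], error_rate, rng)
        read2 = add_errors(revcomp(fragment)[:read_len], error_rate, rng)
        yield read1, read2


def fastq_record(name, seq, qual_char="I"):
    """Format one FASTQ record with a constant placeholder quality."""
    return "@%s\n%s\n+\n%s\n" % (name, seq, qual_char * len(seq))


if __name__ == "__main__":
    # Toy genome given inline; a real run would pass an open FASTA file.
    genome = parse_fasta([">chr_demo", "ACGT" * 250])["chr_demo"]
    for i, (r1, r2) in enumerate(simulate_pairs(genome, coverage=1.0, seed=1)):
        print(fastq_record("pair%d/1" % i, r1), end="")
        print(fastq_record("pair%d/2" % i, r2), end="")
```

Coverage and read length are set through `coverage` and `read_len`, and the error rate through `error_rate`. Anything closer to real GAII behaviour (quality scores that decay along the read, variable insert sizes, sequence-dependent bias) would need an empirical model, which is the part I would rather take from an existing tool.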
Thanks. That is just what I was looking for, and I already had it on my computer!