12 months ago
i.sudbery
20k
Does anyone have any good recommendations for an RNA-seq read simulator that will simulate raw RNA-seq reads from a transcript-level quantification, includes proper modelling of Illumina error distributions, and runs on the command line or in Python? (I know about Polyester, which is R-based, but this is for an undergraduate student who doesn't know R and doesn't have time to learn it.)
If necessary I will write a wrapper for Polyester for them to use, but I'd rather not have to resort to that.
Not Python, but randomreads.sh from BBMap may fit the bill.
Do you know where I might find documentation for this? I can't seem to find it on the main BBTools page.
Note that if you want RNA-seq reads from randomreads.sh you should use a transcriptome reference. Furthermore, add the "metagenome" flag, which, despite the name, is described in the documentation as:
Note that "metagenome=f" (false) is the default, so to enable it you would add "metagenome" or "metagenome=t", which are equivalent.
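Since the student works in Python, here is a minimal sketch of how such a run could be assembled from a script. It only builds and prints the command; nothing is executed. The `ref=`, `out=`, `reads=` and `length=` parameter names are as I recall them from the in-line help and should be checked against your installed version, and the file names are hypothetical placeholders.

```python
import shlex


def build_randomreads_cmd(ref, out, reads=1_000_000, length=150, metagenome=True):
    """Return the argument list for a randomreads.sh run (nothing is executed)."""
    return [
        "randomreads.sh",
        f"ref={ref}",        # transcriptome FASTA, as advised above
        f"out={out}",        # output FASTQ
        f"reads={reads}",    # number of reads to generate
        f"length={length}",  # read length
        # "metagenome=t" is equivalent to the bare "metagenome" flag
        f"metagenome={'t' if metagenome else 'f'}",
    ]


# Hypothetical file names, for illustration only.
cmd = build_randomreads_cmd("transcriptome.fa", "simulated.fq.gz")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once BBMap is on the PATH.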
If you run randomreads.sh without any options you will see extensive in-line help.
Unfortunately, I don't see a way to simulate different expression levels for different transcripts in this script.
Can ART not do this?
ART doesn't allow you to simulate different genes having different expression levels.
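To make concrete what "different expression levels" asks of a simulator, here is a minimal sketch that splits a total read budget across transcripts in proportion to TPM × transcript length, which is roughly how many fragments each transcript is expected to contribute. It is not how any of the tools mentioned here work internally. The transcript names and values are made up, and it uses raw length rather than effective length and does no error modelling.

```python
import random


def allocate_reads(tpm, lengths, total_reads, seed=0):
    """Split total_reads across transcripts in proportion to TPM * length."""
    rng = random.Random(seed)
    names = sorted(tpm)
    # Expected fragment share scales with abundance times transcript length.
    weights = [tpm[n] * lengths[n] for n in names]
    counts = dict.fromkeys(names, 0)
    for name in rng.choices(names, weights=weights, k=total_reads):
        counts[name] += 1
    return counts


# Made-up quantification, for illustration only.
tpm = {"tx_A": 900.0, "tx_B": 100.0, "tx_C": 0.0}
lengths = {"tx_A": 1000, "tx_B": 1000, "tx_C": 2000}
counts = allocate_reads(tpm, lengths, total_reads=10_000)
print(counts)
```

Per-transcript counts like these could then be handed to any tool that can generate a fixed number of reads from a single sequence.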
i.sudbery, GPT came up with these options:
http://alumni.cs.ucr.edu/~liw/rnaseqreadsimulator.html
https://github.com/itmat/CAMPAREE
Both allow you to assign values to transcripts.
rnaseqreadsimulator is Python 2.7 and appears to have been unmaintained for a long time. CAMPAREE looks more promising, but the documentation is lacking, and neither I nor the student has time to work out how it works.