4.6 years ago
kamanovae
Hi! I want to simulate the human genome. I found a suitable program, BBMap (mutate.sh). mutate.sh only outputs reads that contain mutations, but I need a FASTA output file that contains reads both with and without mutations. I want to try to maintain the coverage of the reference genome. How can I do this?
This is the command I currently run:
1.bbmap/mutate.sh in=reference/gh19.fasta out=reference/hg19_with_mut.fasta id=0.99 prefix=bbmap
I could use the BBMap program randomreads.sh instead, but it looks much more difficult to run, and I'm afraid of getting an undesirable result without realizing it.
One does not really simulate a genome but you simulate reads using the reference for that genome. randomreads.sh is not difficult to run.
Do you need to simulate a genome at this point? There are plenty available in databases. You can run mutate.sh on one of them to introduce the mutations, if you don't want to simulate a new dataset.
At this stage, I need to introduce mutations in the reference genome, and then I plan to use the NanoSim program to simulate nanopore reads. My final goal is nanopore reads with introduced mutations. Which program is better for the first step?
Then use mutate.sh. Take a look at the processing parameters to control the mutations.
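As a rough illustration of controlling the mutations through the processing parameters, the sketch below builds a mutate.sh call with explicit substitution and indel rates and a VCF log of the introduced changes. The rate values, the seed, the VCF path, and the assumption that mutate.sh is on the PATH are all illustrative and not from this thread. As far as the tool's help text goes, mutate.sh writes the whole mutated reference rather than reads, and id= overrides subrate= and indelrate=, so it is left out here. Check the flags against the help output of your BBMap version before relying on them.

```shell
#!/usr/bin/env bash
# Sketch: mutate a reference with BBMap's mutate.sh and log the changes.
# All paths and rate values below are illustrative assumptions.
set -euo pipefail

ref="${1:-reference/gh19.fasta}"           # input reference (path as in the question)
out="${2:-reference/hg19_with_mut.fasta}"  # mutated reference, later usable by NanoSim
vcf="${3:-reference/hg19_mut.vcf}"         # hypothetical path for the variant log

# BBTools expects key=value pairs with no spaces around '='.
mutate_args=(
    "in=${ref}"
    "out=${out}"
    "vcf=${vcf}"      # record which variants were introduced
    subrate=0.008     # substitution rate (assumed value)
    indelrate=0.002   # combined insertion/deletion rate (assumed value)
    maxindel=3        # longest indel to introduce (assumed value)
    seed=42           # fixed seed for a reproducible run
    prefix=bbmap      # rename output sequences with this prefix
)

# Always show the command, so it can be checked before a long run.
echo "mutate.sh ${mutate_args[*]}"

# Run only when the tool and the reference are actually available.
if command -v mutate.sh >/dev/null 2>&1 && [ -f "${ref}" ]; then
    mutate.sh "${mutate_args[@]}"
else
    echo "dry run only: mutate.sh or ${ref} not found" >&2
fi
```

The mutated FASTA can then serve as the reference for the read simulator, and the VCF gives a truth set for checking which introduced mutations are recovered from the simulated reads.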