Simulate SFF File
14.0 years ago
Lee Katz ★ 3.2k

Hi, I am giving a workshop on genome assembly and I would like the students to try genome assembly for themselves. However, it will not be feasible to have tens of students assembling a genome on the order of megabases: the assemblies will likely run on either a single server or desktop computers, and there will be a time constraint. Is there a way to simulate an SFF for something smaller, like a plasmid? Or to simulate an SFF based on a neighborhood of a few operons? Thank you.

assembly simulation • 3.9k views
14.0 years ago

Rather than simulating an SFF (assuming you mean 454's Standard Flowgram Format), you might be better off simulating sequences. On that topic, there are some answers here: how-to-produce-simulated-synthetic-sequences
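
For a small reference such as a plasmid, simulating sequences could look roughly like the sketch below. It is only an illustration, not a recommended simulator: the function names, the read-length distribution, and the uniform substitution error model are all assumptions made here. Real 454 errors are dominated by homopolymer indels, which this sketch does not model.

```python
import random

def simulate_reads(reference, n_reads, mean_len=400, sd_len=50,
                   error_rate=0.01, seed=0):
    """Sample reads uniformly from `reference` (hypothetical helper).

    Assumptions: read lengths are roughly normal, errors are random
    substitutions only, and all reads come from the forward strand.
    """
    rng = random.Random(seed)
    reads = []
    for i in range(n_reads):
        # Keep lengths between 50 bases and the reference length.
        length = min(len(reference), max(50, int(rng.gauss(mean_len, sd_len))))
        start = rng.randint(0, len(reference) - length)
        bases = []
        for base in reference[start:start + length]:
            if rng.random() < error_rate:
                # Replace with one of the three other bases.
                base = rng.choice([b for b in "ACGT" if b != base])
            bases.append(base)
        reads.append((f"read_{i}", "".join(bases)))
    return reads

def write_fasta(reads, path):
    """Write (name, sequence) pairs as a FASTA file."""
    with open(path, "w") as handle:
        for name, seq in reads:
            handle.write(f">{name}\n{seq}\n")
```

Calling `write_fasta(simulate_reads(plasmid_sequence, 5000), "reads.fna")`, where `plasmid_sequence` is a string you have loaded yourself, would give a FASTA file of reads without quality values.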


I found a link to a link on that BioStar page, thank you. It shows how to simulate a genome. http://sourceforge.net/apps/mediawiki/dnaa/index.php?title=Whole_Genome_Simulation


Installation required many packages which were not listed in the documentation. After I installed everything, it gave a slew of errors in C, which I cannot debug. I'm not sure if this is the way to go.


MetaSim works.

14.0 years ago
lexnederbragt ★ 1.3k

Have you tried Google? You will find at least this one:

Flowsim, http://blog.malde.org/index.php/flowsim/, paper here: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935434/

(http://google.com/search?q=454+sff+simulation)

14.0 years ago

Maybe you could use real data from a trace archive such as the SRA database (say, a virus, like this one)? You can download FASTQ files (not SFFs), but as far as I know Newbler can read FASTA files with or without quality information (although you may need to rescale the quality scores if you include them).
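
If you go the FASTQ route, splitting a file into FASTA plus numeric quality scores might look something like this. It is a minimal sketch that assumes four-line FASTQ records and Phred+33 encoding; `fastq_to_fasta_qual` is a hypothetical helper, and whether the scores need rescaling afterwards depends on where the data came from.

```python
def fastq_to_fasta_qual(fastq_lines, offset=33):
    """Convert FASTQ lines to (fasta_text, qual_text).

    Assumptions: each record is exactly four lines (no wrapped
    sequences) and qualities are encoded as Phred+offset characters.
    """
    lines = [ln.rstrip("\n") for ln in fastq_lines if ln.strip()]
    fasta, qual = [], []
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, quals = lines[i:i + 4]
        name = header[1:]  # drop the leading '@'
        # Decode each quality character to its integer score.
        scores = " ".join(str(ord(c) - offset) for c in quals)
        fasta.append(f">{name}\n{seq}\n")
        qual.append(f">{name}\n{scores}\n")
    return "".join(fasta), "".join(qual)
```

Passing an open file handle works, since the function only iterates over lines; the two returned strings can then be written to separate sequence and quality files.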

14.0 years ago
Bach ▴ 550

The new NCBI SRA format allows you to download their SRA archives and convert them to any of the more widely used vendor formats (SFF, FASTQ, Illumina) via their SRA Toolkit; see http://www.ncbi.nlm.nih.gov/books/NBK49294/ for the download and manual.

So, search for "virus" or "plasmid" in the SRA (perhaps something like http://www.ncbi.nlm.nih.gov/sra/SRX025865?report=full), download the corresponding SRA, convert it to SFF and you're done.

Note 1: the 1.0b10 toolkit has one "error" flagged by current gcc, which is quickly fixed.

Note 2: using plasmid or virus libraries as assembly examples may be counterproductive. These can be really nasty because, most of the time, what was sequenced is not one clean DNA but a mixture, and that can confuse assemblers quite a lot.


The NBK link didn't work - do you mean this one? http://www.ncbi.nlm.nih.gov/books/NBK47528/
