Our lab is now sequencing a plant genome, and I would like to get some data to practice assembly with SOAPdenovo or ALLPATHS-LG first. I don't know where I can download sequencing data for this. Any suggestions? Thanks very much!
You can get FASTQ files from the 1000 Genomes Project. It's human data, not plant, but for the purposes you mention data are data.
There is lots of data at the NCBI Sequence Read Archive (SRA). You can convert it to FASTQ format using their SRA Toolkit.
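To make the conversion step concrete, here is a minimal Python sketch that builds and runs a `fastq-dump` call from the SRA Toolkit. It assumes the toolkit is installed and on your PATH; the function names, the output directory `reads` and the accession in the usage note are placeholders chosen for illustration, not anything from the posts above.

```python
import shutil
import subprocess


def build_fastq_dump_command(accession, outdir="reads"):
    """Return the argument list for converting one SRA run to gzipped FASTQ."""
    return [
        "fastq-dump",
        "--split-files",  # write paired-end mates to separate files
        "--gzip",         # compress the FASTQ output
        "--outdir", outdir,
        accession,
    ]


def download_run(accession, outdir="reads"):
    """Run fastq-dump if it is installed; raise a clear error otherwise."""
    if shutil.which("fastq-dump") is None:
        raise RuntimeError("fastq-dump not found; install the SRA Toolkit first")
    # check=True makes a failed download raise instead of passing silently.
    subprocess.run(build_fastq_dump_command(accession, outdir), check=True)
```

Calling `download_run("SRR000001")` (a placeholder accession; substitute the run you actually want) should leave gzipped FASTQ files in `reads/`. For large runs, the toolkit's `prefetch` and `fasterq-dump` tools may be a better fit.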
If you want your assembly to be comparable to anything, have a look at the data from the genome assembly competitions Assemblathon 1, Assemblathon 2 and GAGE:
Assemblathon 1 data
Assemblathon 2 data
GAGE-read-data
Once you have finished your genome assembly, you can go back and compare it to other groups' assemblies to see whether you've set the parameters sensibly and how the assemblers you've used compare to others.
Sadly, there doesn't seem to be any plant data in these competitions. Plant genomes in particular are harder to assemble than your "standard" E. coli genome because of their high repeat content, whole-genome duplications and other complications.
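For the comparison step, one commonly used contiguity statistic is N50. It is not mentioned in the posts above and is only one of several metrics the competitions report, so treat this as a rough sketch. The helper names are made up for illustration, and the FASTA parsing assumes a plain, uncompressed contig file.

```python
def fasta_lengths(lines):
    """Return the sequence length of each record in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(contig_lengths):
    """Return the length L such that contigs of length >= L cover half the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly
```

With a contig file you could call `n50(fasta_lengths(open("contigs.fa")))`, where `contigs.fa` is a placeholder name. N50 rewards long contigs but says nothing about correctness, so it is best read alongside the error-based evaluations the competitions published.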