Does anyone have experience generating SRF-formatted SOLiD reads for publication?
I know there is solid2srf (http://solidsoftwaretools.com/gf/project/srf/), but I do not know if there is a 'best practice' for generating the files required by sequence archives. There does not seem to be much guidance on what is expected...
I am aiming for Array Express, if that helps. Has anyone else submitted sequences there?
There are really no best practices. The srf conversion is the easy part. The most annoying part is creating the metadata to match the specifications required by the recipient.
Thanks for answering. Like you, I found the conversion easy, but there is so little documentation that I had no idea what metadata was needed. It turned out that Array Express accepts gzip'ed .csfasta and .qual files.
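For anyone who wants to script that step, here is a minimal sketch of gzipping the .csfasta and .qual files with Python's standard library. The file names are hypothetical placeholders, and this only covers the compression, not whatever metadata the archive asks for.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(path):
    """Compress `path` to `path` + '.gz' and return the new path.

    The original file is left in place.
    """
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        # Stream in chunks so large read files never sit fully in memory.
        shutil.copyfileobj(fin, fout)
    return dst


if __name__ == "__main__":
    # Hypothetical file names; substitute your own run's files.
    for name in ("sample_F3.csfasta", "sample_F3_QV.qual"):
        if Path(name).exists():
            print(gzip_file(name))
```

Plain `gzip sample_F3.csfasta` on the command line does the same thing, except that it replaces the original file by default.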
Of course it depends on the repository you are submitting to, and this answer may not help if the repositories you have in mind do not accept these formats. Still, I think storing the data directly in BAM format would be wise: the files are smaller, and the reads can be recovered and reprocessed if needed (although some may have been dropped in the mapping step). If all raw results are to be saved, I would go for FASTQ format instead (you'll find several solid2fastq implementations shipped with different mapping tools), which also reduces the size considerably compared to csfasta + qual files. Again, sorry if I don't directly answer your question, but I wanted to leave a few ideas here for anyone landing on this post who is interested in storing SOLiD data.
The solid2fastq version I've found most useful is the BFAST C implementation, which is extremely fast. If you combine it with pbzip2 (parallel bzip2, which uses all the cores available on your computer rather than just one, as gzip does), you will shrink your results in no time.
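To make the csfasta + qual to FASTQ idea concrete, here is a rough Python sketch of the merge. It is not the BFAST solid2fastq implementation and its output layout may differ from what a given mapper expects. It assumes each read occupies a single line in both files, that both files list reads in the same order, and that qualities should be written as Phred+33 characters; negative quality values, which can appear for missing calls, are clamped to 0.

```python
def read_records(handle):
    """Yield (name, body) pairs from a csfasta or qual stream.

    Lines starting with '#' are treated as comments and skipped.
    Assumes one body line per record.
    """
    name = None
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            name = line[1:]
        else:
            yield name, line


def csfasta_qual_to_fastq(csfasta, qual, out):
    """Merge paired csfasta/qual streams into FASTQ records on `out`."""
    pairs = zip(read_records(csfasta), read_records(qual))
    for (seq_name, seq), (qual_name, quals) in pairs:
        if seq_name != qual_name:
            raise ValueError(f"read order mismatch: {seq_name} vs {qual_name}")
        # Clamp negative values to 0, then encode as Phred+33 (assumption).
        encoded = "".join(chr(max(0, int(q)) + 33) for q in quals.split())
        # The color-space sequence is written unchanged, primer base included.
        out.write(f"@{seq_name}\n{seq}\n+\n{encoded}\n")
```

The three arguments are ordinary text file objects, so `out` can be something like `bz2.open("reads.fastq.bz2", "wt")` from the standard library to compress while writing. That path is single-threaded, so for large runs piping the output through pbzip2 will be faster.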