5.8 years ago
Picasa
Hi,
Has anybody already submitted a genome to EBI? For a scaffold-level assembly they ask for a FASTA file and an AGP file.
See this link: https://ena-docs.readthedocs.io/en/latest/cli_02.html
I only have the FASTA file (the assembly). How do I generate the AGP file?
Thanks a lot.
In that case the assembly you submit must be at the contig level. The AGP file is the recipe for building the scaffolds from the contigs. It should have been created by the tool used for the scaffolding.
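To make the "recipe" idea concrete, here is a minimal Python sketch of how scaffolds can be rebuilt from contigs plus an AGP file. It assumes tab-separated AGP 2.0 lines where `W` rows name a contig with its coordinates and orientation, and `N`/`U` rows give a gap length. The function names and the toy contigs are made up for illustration; this is not what ENA runs internally.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def build_scaffolds(agp_lines, contigs):
    """Follow the AGP rows in order and concatenate contigs and gaps.

    agp_lines: iterable of tab-separated AGP lines
    contigs:   dict mapping contig name to sequence
    """
    scaffolds = {}
    for line in agp_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        scaffold_name, component_type = cols[0], cols[4]
        if component_type in ("N", "U"):
            # Gap row: column 6 holds the gap length
            piece = "N" * int(cols[5])
        else:
            # Sequence row: columns 6-9 are contig id, start, end, orientation
            start, end = int(cols[6]), int(cols[7])
            piece = contigs[cols[5]][start - 1:end]
            if cols[8] == "-":
                piece = reverse_complement(piece)
        scaffolds[scaffold_name] = scaffolds.get(scaffold_name, "") + piece
    return scaffolds


# Toy example (hypothetical names and sequences)
contigs = {"contig_1": "ACGTAC", "contig_2": "GGGTTT"}
agp = [
    "scaffold_1\t1\t6\t1\tW\tcontig_1\t1\t6\t+",
    "scaffold_1\t7\t16\t2\tN\t10\tscaffold\tyes\tpaired-ends",
    "scaffold_1\t17\t22\t3\tW\tcontig_2\t1\t6\t-",
]
scaffolds = build_scaffolds(agp, contigs)
print(scaffolds["scaffold_1"])
```

Each row is one step of the recipe: take a contig (possibly reverse-complemented) or insert a gap of a given length.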
Thanks Juke-34.
But isn't that impractical? I want users to be able to download the final genome (at scaffold level), not contigs + AGP.
Based on the contigs and the AGP file, ENA will create the scaffold assembly that will be released for download. I don't remember whether they also provide the AGP file or the contigs for download.
Hi Juke-34,
Sorry for reviving the thread. I have the same problem: I lost the associated files (.paths and .gfa) and only have a FASTA genome (assembled to scaffolds with SPAdes), which I have already processed (removed highly repetitive scaffolds, mitochondrial genomes, etc.) and used for annotation. I wasn't aware of the need for an AGP file until recently. Now I am thinking about using fasta2agp.pl (from Velvet) to create an AGP file for my genome. Perhaps I am wrong, but I understand that the script can generate an AGP file from a FASTA genome without needing the assembly graph and scaffold paths. Do you have experience with it?
Another option is, as you said, to submit it at contig level. But with this option I am afraid it will be shown on EBI as contigs, when it is actually assembled to scaffolds. :( Do you have a suggestion?
Thank you very much and looking forward to hearing from you.
Using fasta2agp.pl will make a fake AGP file describing the current state of the assembly (no change between contig and scaffold). This is cheating, but it should be accepted by ENA. There is no other solution except redoing everything from scratch. If you choose to make this 'fake' AGP, be sure to add plenty of detail to the metadata description so it is clear that the contigs you submit are not really contigs, and that this is why they are the same as the scaffolds...
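For illustration, here is a rough Python sketch of the general idea of deriving an AGP description from scaffolds you already have. It is not the fasta2agp.pl script and I have not checked it against ENA's validator. It assumes gaps are runs of at least `min_gap` N characters (10 is an arbitrary choice), that scaffolds do not start or end with such a run, and that `paired-ends` is an acceptable linkage evidence value for your data. The function and contig names are made up.

```python
import re


def scaffold_to_agp(name, seq, min_gap=10):
    """Describe one scaffold as AGP 2.0 rows, splitting at runs of N.

    Returns (rows, contigs): the AGP lines and a dict of the contig
    sequences they refer to. Assumes the scaffold does not start or
    end with a gap (AGP objects should not begin or end with one).
    """
    rows, contigs = [], {}
    part = 0

    def add_contig(start, end):
        # start/end are 0-based, half-open; AGP coordinates are 1-based, inclusive
        nonlocal part
        if end <= start:
            return
        part += 1
        contig_id = f"{name}_contig{len(contigs) + 1}"
        contigs[contig_id] = seq[start:end]
        fields = [name, start + 1, end, part, "W", contig_id, 1, end - start, "+"]
        rows.append("\t".join(map(str, fields)))

    pos = 0
    for gap in re.finditer(r"[Nn]{%d,}" % min_gap, seq):
        add_contig(pos, gap.start())
        part += 1
        gap_len = gap.end() - gap.start()
        fields = [name, gap.start() + 1, gap.end(), part, "N",
                  gap_len, "scaffold", "yes", "paired-ends"]
        rows.append("\t".join(map(str, fields)))
        pos = gap.end()
    add_contig(pos, len(seq))
    return rows, contigs


# Toy example (hypothetical scaffold)
rows, contigs = scaffold_to_agp("s1", "ACGT" + "N" * 10 + "GG")
for row in rows:
    print(row)
```

A scaffold with no N runs comes out as a single `W` row covering its whole length, which is the "no change between contig and scaffold" case described above.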
Thanks a lot for your reply! Or, on the other hand, I could submit it at contig level and state in the description that it is actually a scaffold-level assembly. :)
Do you know if the final EBI webpage for the genome differs between a contig and a scaffold submission? I mean, if there is no indication of the level in the genome description shown on the page, then it doesn't matter whether I submit it as contig or scaffold?
I guess the difference will show up when applying filters to request assemblies. There is probably a way to download either contigs or scaffolds. When someone asks for contigs in your case, what they get will not really be contigs...
The best would be to contact their Helpdesk and describe your case; they will tell you how to proceed.
Thanks! I will contact them. :)