Hello
As part of my bioinformatics internship, one of my objectives is to perform functional annotation on single-cell sequencing data, but I'm a little lost.
So far, I have assembled my reads (from different SAGs) into contigs with SPAdes (after quality control), used Prodigal to predict ORFs on these contigs, and run BLAST with these ORFs against the eggNOG database.
From the tabular output file generated by BLAST, my idea was to collect the eggNOG accession names, for example:
733_F2_>NODE_3161_length_96_cov_1_ID_6321_1 13616.ENSMODP00000006300 100.0 31 0 0 1 31 54 84 4.1e-09 65.1
and, for each of these accession names, retrieve the biological process, molecular function, etc., so that I can make some graphs in R by counting the number of occurrences in each category.
But I can't find a file that links the accession names to the Gene Ontology.
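To make the goal concrete, here is a minimal Python sketch of the counting step, assuming such a link file existed and had been loaded into a dictionary. The function names, the `accession_to_categories` mapping, and the category label in the toy mapping are all hypothetical placeholders, not real eggNOG content; only the BLAST line is taken from my output.

```python
from collections import Counter


def subject_accessions(blast_lines):
    """Yield (query, subject) pairs from 12-column BLAST tabular lines."""
    for line in blast_lines:
        fields = line.split()
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        yield fields[0], fields[1]


def count_categories(blast_lines, accession_to_categories):
    """Count hits per category using a HYPOTHETICAL accession -> categories mapping."""
    counts = Counter()
    for _query, subject in subject_accessions(blast_lines):
        # Accessions missing from the mapping are counted as "unmapped".
        for category in accession_to_categories.get(subject, ["unmapped"]):
            counts[category] += 1
    return counts


example = [
    "733_F2_>NODE_3161_length_96_cov_1_ID_6321_1 13616.ENSMODP00000006300 "
    "100.0 31 0 0 1 31 54 84 4.1e-09 65.1"
]
# Placeholder mapping: the category assigned here is invented for illustration only.
toy_mapping = {"13616.ENSMODP00000006300": ["biological_process"]}

counts = count_categories(example, toy_mapping)
for category, n in counts.items():
    print(category, n)
```

The resulting counts could then be written to a two-column file and plotted in R.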
Maybe my idea is a bad one? I'm a newbie in bioinformatics and I don't know how functional annotation is usually done, so I may not be on the right track at all.
What would you do in my position?
Thanks in advance, and sorry for my English.
You may consider using PANTHER or DAVID in order to find your GO terms for your accession names.
I will try, thank you.
Is it only available through a web interface, or can it be used from the command line?
You can also consider using tools like Blast2GO or InterProScan.
Something like TRAPID can also be helpful, I think.
A very crude approach would be to simply BLAST your sequences against UniProt/Swiss-Prot and 'transfer' the description of the best hit.
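As a rough illustration of that crude approach, the sketch below keeps the highest-bitscore hit per query from 12-column BLAST tabular output and looks up a description for it. It assumes the bitscore is in the last (12th) column, as in the output line shown in the question. The function names, the toy IDs, and the `descriptions` dictionary are made up for the example; in practice the descriptions would have to come from the database you searched.

```python
def best_hits(blast_lines):
    """Return {query: (subject, bitscore)} keeping the highest-bitscore hit per query."""
    best = {}
    for line in blast_lines:
        fields = line.split()
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def transfer_descriptions(best, descriptions):
    """Attach the best hit's description to each query (HYPOTHETICAL lookup table)."""
    return {
        query: descriptions.get(subject, "no description available")
        for query, (subject, _bitscore) in best.items()
    }


# Toy input: IDs, scores and descriptions are invented for illustration only.
toy_blast = [
    "orf1 hitA 90.0 50 5 0 1 50 1 50 1e-20 95.0",
    "orf1 hitB 99.0 50 0 0 1 50 1 50 1e-25 110.0",
    "orf2 hitC 80.0 40 8 0 1 40 1 40 1e-10 60.0",
]
toy_descriptions = {"hitB": "example description B"}

best = best_hits(toy_blast)
annotated = transfer_descriptions(best, toy_descriptions)
for query, description in annotated.items():
    print(query, description)
```

Keep in mind that a best hit is not necessarily a true ortholog, so annotations transferred this way should be treated as tentative.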