I am debating whether the following is good practice and could do with guidance/ consensus.
I have aligned mRNA-seq reads to a genome published approx. 3 years ago. From my understanding, the authors ran BlastP on the predicted proteins to annotate them with a function/name. A lot were hypothetical / no hit etc., which is not much use.
Given that the proteins were predicted and annotated over 3 years ago, is it advisable to re-annotate these sequences against the most up-to-date UniProtKB release? I would assume this is common practice, as people always want to be working with more / updated resources of information. Or should I leave it up to the original authors of the genome to do this? What is generally accepted?
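For reference, if I do re-run the search myself, this is roughly how I would pull one annotation per protein out of the results. It is only a minimal sketch, assuming the search was saved in BLAST's default 12-column tabular format (`-outfmt 6`); the function name `best_hits` and the `max_evalue` cutoff of 1e-5 are my own illustrative choices, not anything the genome authors used.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue=1e-5):
    """Return {query_id: subject_id} keeping the highest-bitscore hit per query.

    `max_evalue` is a hypothetical cutoff chosen for illustration only.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, row))
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current[1]:
            best[hit["qseqid"]] = (hit["sseqid"], bitscore)
    return {query: subject for query, (subject, _) in best.items()}
```

Proteins with no hit under the cutoff simply do not appear in the result, so they can still be reported as "no hit".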
I have the mRNA gene sequences as well as the original predicted proteins available in a .fasta file, which I can use to re-annotate against UniProtKB (Swiss-Prot and TrEMBL). Does it matter if I choose to use the mRNA over the predicted proteins, given that BlastX will search all 6 reading frames anyway? Does BlastP on the predicted proteins yield better results?
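To make clear what I mean by "all 6 reading frames", here is a minimal sketch of a six-frame translation using the standard genetic code. It is only an illustration of the idea, not how BlastX is implemented internally, and the function names (`six_frames`, `translate`, `reverse_complement`) are my own.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Return the six translations keyed '+1'..'+3' and '-1'..'-3'."""
    seq = seq.upper().replace("U", "T")
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames
```

With the predicted proteins, only the single frame chosen by the gene predictor is searched, whereas a translated search of the mRNA considers all six of these.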
Thanks for the insight / consensus
Has the genome been incorporated into NCBI or Ensembl? If yes, there should / could be a better annotation available than the authors' original annotation.
h.mon, yes it's available on NCBI. In what way would it be 'better'?
Also, how would I go about downloading that data?
Thanks for the insight.
What is the species? Go to https://www.ncbi.nlm.nih.gov/ , then type the name of the species in the search box, select the Genome database from the pull-down menu, and hit enter.