How to get GenBank to update wrong annotations?
4.2 years ago
andzsci ▴ 40

I regularly send error reports to the GenBank email address for annotation updates about plant sequences that are annotated as fungi and vice versa. About 3/4 of my emails are ignored, and the wrong sequences are still online, although it is obvious they are not correctly annotated. I think this is highly concerning. How can I get GenBank to listen to me? I put a lot of work into trying to fix all the errors I can find.

sequence gene GenBank NCBI • 2.1k views

Well, if I am persistent enough and re-send the same email a couple of times, eventually somebody takes action. (I wait at least 2 weeks between reminders.) Here are two examples of sequences I "removed" from GenBank:

https://www.ncbi.nlm.nih.gov/nuccore/KC840062 https://www.ncbi.nlm.nih.gov/nuccore/JF409911

here are some more: KC662173 : KC662175 : KT201445 : KJ999405 : KP294358 : EU593548 : EU593547


I'll join genomax on the kudos part :)


Out of curiosity, what kind of evidence do you provide (negative/non-specific BLAST searches)? One of the records that you link above says:

This record was removed at the submitter's request because the sequence may contain contamination.

You are obviously not the submitter of that record. Perhaps NCBI just uses a standard template to provide that message. Kudos for doing your part to purge incorrect information from GenBank.


At this point I was wondering how many of those pan-genome / comparative genome / genome announcement studies perform a contamination check before analyzing the data.


These ones were pretty obvious cases. I put a screenshot of the top ~30 BLAST results for each sequence in the email and if most of them are fungi and not plant sequences it's kinda obvious the uploader did not check the sequence before uploading.

I email the submitter first, because that is obviously the correct way to go. But sometimes the submitter has left academia/died/does not take action and then there is no other choice than to contact the database maintainer.

Here is a sequence I am working on right now: KT695200. The submitter has answered, so hopefully it will be gone soon.

But the whole process is so tiresome. I wish I could just click "report" in the sequence profile and it would trigger a manual investigation.
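
For anyone who wants to put numbers next to the screenshot, the check described above (look at the top ~30 BLAST hits and see which group most of them belong to) can be sketched in a few lines of Python. This is only a rough sketch under assumptions: the hit tuples, the accession placeholders, the group labels, and the function names are all made up for illustration, and it assumes you have already looked up a taxonomic group label (e.g. "Fungi" or "Viridiplantae") for each hit yourself. It is not an NCBI tool or an official workflow.

```python
from collections import Counter


def tally_top_hits(hits, top_n=30):
    """Count group labels among the top_n hits, ranked by bit score.

    hits: iterable of (subject_accession, bit_score, group_label) tuples.
    The group labels are assumed to have been looked up beforehand.
    """
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)[:top_n]
    return Counter(group for _, _, group in ranked)


def looks_misannotated(hits, annotated_group, top_n=30, threshold=0.5):
    """Return True if more than `threshold` of the top hits fall outside
    the group the record is annotated as."""
    counts = tally_top_hits(hits, top_n)
    total = sum(counts.values())
    if total == 0:
        return False  # no hits, so no evidence either way
    outside = total - counts.get(annotated_group, 0)
    return outside / total > threshold


# Made-up hits for a record annotated as a plant sequence.
# The accessions are placeholders, not real GenBank records.
example_hits = [
    ("hit_A", 950.0, "Fungi"),
    ("hit_B", 940.0, "Fungi"),
    ("hit_C", 910.0, "Viridiplantae"),
    ("hit_D", 900.0, "Fungi"),
]

counts = tally_top_hits(example_hits)
flagged = looks_misannotated(example_hits, "Viridiplantae")
print(dict(counts), flagged)
```

A tally like this is no replacement for the screenshot or a manual look at the alignments, but a short table of counts is easy to paste into the email to the submitter or to GenBank.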

4.2 years ago

I don't think that mailing them is the correct way to go (they must get hundreds of such mails a day).

I suggest having a look at "third party submission" (https://www.ncbi.nlm.nih.gov/genbank/tpa/ or https://www.ebi.ac.uk/ena/browser/about/policies); that might be a better approach for what you want to do. The only hurdle I see is that it seems you have to back this up with a publication.

Out of curiosity: what happens (or what answer do you get) with the 1/4 of your emails that are not ignored?


lieven.sterck, you are correct in noting: "The only hurdle I see is that it seems you have to back this up with a publication." Indeed, https://www.ncbi.nlm.nih.gov/genbank/tpa/ says: "What should NOT be submitted to TPA: ... Annotation from in vivo, in vitro, or in silico experimentation that will not be submitted for publication in a peer-reviewed journal." But this is precisely what andzsci has; see their comment above: "I put a screenshot of the top ~30 BLAST results for each sequence in the email and if most of them are fungi and not plant sequences it's kinda obvious the uploader did not check the sequence before uploading." So it seems that "third party submission" is probably not the way to go for the OP, unfortunately. But your answer is hopefully useful for other users, namely those who are submitting evidence for publication, so +1, and thank you!
