Hi, I have a basic question about RefSeq accession numbers. Non-redundant proteins use the WP_ prefix and nucleotide sequences use NZ_. I have thousands of protein sequences with different accession numbers. My goal is to find out whether certain proteins come from the same genome, i.e. whether some WP_ numbers share the same NZ_ number. Is it possible to do this?
Thanks
give us some examples...
For example, WP_069981055.1 is the electron transporter RnfB from Geosporobacter ferrireducens, but how can we know where it was translated from? I searched for Geosporobacter ferrireducens in the Gene database and it gave me two RefSeq accessions for genomes, NZ_CP017269.1 and NZ_CP017270.1. I don't know which one encodes this protein. I also don't know whether it's reasonable to use a non-redundant protein accession to search for a genome, because the same protein is probably annotated on many different RefSeq genomes. So I don't know how to solve my problem.
This particular example (WP_069981055.1) is annotated from a single genome, judging from the record. Take a look at the NCBI help page for RefSeq non-redundant protein categories, which describes how the entries appear in the full records (single species, multi-species, and multi-species (bacteria and archaea)). WP_069981055.1 is annotated to a single organism, Geosporobacter ferrireducens. As for which of the two, NZ_CP017269.1 or NZ_CP017270.1, encodes your protein: note that these are nucleotide sequence records, not genes, and from the protein record alone we can't tell which one it is. Based on the current annotation for Geosporobacter ferrireducens, it could be either one or both. To settle it you would have to validate empirically, for example with a knock-out experiment.
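For the broader goal in the question (checking whether several WP_ accessions share an NZ_ accession), here is a minimal sketch, assuming you already have a tab-separated table that lists, for each protein, the nucleotide record it is annotated on. NCBI's Identical Protein Groups resource is one place that reports this kind of mapping, but the column names `Protein` and `Nucleotide Accession`, the helper functions, and the accessions in the toy table are all assumptions made up for illustration. Fetching the table is not shown.

```python
import csv
import io
from collections import defaultdict


def group_proteins_by_nucleotide(table_text,
                                 protein_col="Protein",
                                 nucleotide_col="Nucleotide Accession"):
    """Map each nucleotide accession to the set of proteins annotated on it.

    The column names are assumptions; adjust them to match your table.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    groups = defaultdict(set)
    for row in reader:
        groups[row[nucleotide_col]].add(row[protein_col])
    return dict(groups)


def shared_records(groups, proteins):
    """Return the nucleotide accessions that carry every protein in `proteins`."""
    wanted = set(proteins)
    return sorted(nuc for nuc, prots in groups.items() if wanted <= prots)


# Toy table with made-up accessions, NOT real RefSeq records.
EXAMPLE_TABLE = (
    "Protein\tNucleotide Accession\n"
    "WP_000000001.1\tNZ_EXAMPLE01.1\n"
    "WP_000000002.1\tNZ_EXAMPLE01.1\n"
    "WP_000000002.1\tNZ_EXAMPLE02.1\n"
    "WP_000000003.1\tNZ_EXAMPLE02.1\n"
)

if __name__ == "__main__":
    groups = group_proteins_by_nucleotide(EXAMPLE_TABLE)
    # Which records carry both of these proteins?
    print(shared_records(groups, ["WP_000000001.1", "WP_000000002.1"]))
```

Because a WP_ protein can be annotated on many genomes, each nucleotide accession maps to a set of proteins, and the check asks which records contain all the proteins of interest. An empty list means that no single record in the table carries all of them.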