After taking over a project from someone, I got a list of identifiers for expression data annotated with Prokka, as shown below:
PNJECBGM_02289 gnl|Prokka|PNJECBGM_42
PNJECBGM_02290 gnl|Prokka|PNJECBGM_42
BKKAOALG_00637 gnl|Prokka|BKKAOALG_9
and something that shows the locus tag, CDS and products:
ID=BPFNMOJC_00555_gene;Name=hisZ_2;gene=hisZ_2;locus_tag=BPFNMOJC_00555
Does anyone have any suggestions on how I can find out the names of the prokaryotes that these protein/gene identifiers belong to? I tried UniProt, GenBank, esearch from NCBI and the Pfam database, and got nothing in return. I looked at the gbk files as well, but they are not helpful at all:
VERSION:
KEYWORDS:
.
SOURCE: Genus species
ORGANISM : Genus species
Unclassified.
COMMENT Annotated using prokka 1.14.6 from https://github.com/tseemann/prokka.
FEATURES Location/Qualifiers
source 1..24892
/organism="Genus species"
/mol_type="genomic DNA"
/strain="strain"
gene 52..885
/locus_tag="BPFNMOJC_00001"
mRNA 52..885
How can expression data be annotated with Prokka? You probably mean that the assembly was annotated with Prokka and the expression analysis was then done using that annotation? What types of annotation files do you have? You have this tagged with MAG, so is this metatranscriptomic data?

Yes, you are right, this is metatranscriptomic data :) I have gbk files for each MAG, but none of them says what the prokaryotes are called.
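
The identifiers above share a prefix with the contig names (PNJECBGM_02289 sits on gnl|Prokka|PNJECBGM_42), so one way to at least link each identifier back to the gbk file it came from is to group by that prefix. Below is a minimal Python sketch of this, assuming each MAG was annotated in its own Prokka run and therefore carries its own locus-tag prefix. The function names (`locus_prefix`, `group_by_prefix`, `gff_attributes`) are made up for illustration and are not part of Prokka.

```python
import re
from collections import defaultdict

# Assumption: locus tags have the form PREFIX_NUMBER, and every feature
# from one Prokka run (here assumed to be one MAG) shares the same PREFIX.
LOCUS_TAG = re.compile(r"^(.+)_(\d+)$")


def locus_prefix(locus_tag):
    """Return the prefix of a locus tag, or None if it does not match."""
    match = LOCUS_TAG.match(locus_tag)
    return match.group(1) if match else None


def group_by_prefix(lines):
    """Group the first column of each line (the locus tag) by its prefix."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        prefix = locus_prefix(fields[0])
        if prefix is not None:
            groups[prefix].append(fields[0])
    return dict(groups)


def gff_attributes(column9):
    """Split a GFF attribute string (key=value;key=value) into a dict."""
    pairs = (item.split("=", 1) for item in column9.strip().split(";") if "=" in item)
    return dict(pairs)


# The example lines from the question.
identifiers = [
    "PNJECBGM_02289 gnl|Prokka|PNJECBGM_42",
    "PNJECBGM_02290 gnl|Prokka|PNJECBGM_42",
    "BKKAOALG_00637 gnl|Prokka|BKKAOALG_9",
]
by_mag = group_by_prefix(identifiers)

attrs = gff_attributes(
    "ID=BPFNMOJC_00555_gene;Name=hisZ_2;gene=hisZ_2;locus_tag=BPFNMOJC_00555"
)
mag_prefix = locus_prefix(attrs["locus_tag"])
```

Each prefix can then be matched against the `/locus_tag` values in the gbk files to find the MAG an identifier belongs to. This only tells you which annotation file a gene came from, not the organism: "Genus species" and "strain" in the gbk header are Prokka's default placeholders when no genus, species or strain is supplied, so the taxonomy has to come from elsewhere.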