Gene Cluster Annotation
4.7 years ago
genome_man ▴ 10

Hey everyone,

I have several bacterial genomes that I want to annotate with specific gene clusters, for which I have the respective .gbk files. Are there any programs or software packages that can do this? Is it possible with Prokka?

Thanks :)

genome annotation gene prokka mauve • 1.5k views

I think you can use bowtie2 or other mapping tools to map the gene clusters to your reference genomes.

If you're trying to annotate the genes in these reference genomes, you can use Prokka or FragGeneScan, but they only predict genes; they won't group them into clusters. You can cluster the predicted genes using CD-HIT or other tools, or you can directly compare the predicted genes with the gene clusters that you already have.

https://metagenomics-workshop.readthedocs.io/en/2014-11-uppsala/functional-annotation/prokka.html
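
The direct comparison mentioned above could be sketched roughly like this in plain Python. It only finds exact matches of a cluster's nucleotide sequence on either strand of a genome, which may be enough when the clusters were extracted from these same genomes. The function names and the toy sequences are made up for illustration; for anything inexact, an aligner (BLAST, bowtie2, etc.) would be the better choice.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def locate_cluster(genome, cluster):
    """Find exact occurrences of `cluster` in `genome` on both strands.

    Returns a sorted list of (start, end, strand) tuples, with 1-based
    inclusive coordinates given on the forward strand of the genome.
    """
    genome = genome.upper()
    cluster = cluster.upper()
    hits = []
    for strand, query in (("+", cluster), ("-", reverse_complement(cluster))):
        pos = genome.find(query)
        while pos != -1:
            hits.append((pos + 1, pos + len(query), strand))
            pos = genome.find(query, pos + 1)
    return sorted(hits)


# Toy data (hypothetical): one forward copy and one reverse-strand copy.
cluster = "ATGAAACGTTAG"
genome = "TTGACC" + cluster + "GGCC" + reverse_complement(cluster) + "AA"

hits = locate_cluster(genome, cluster)
for start, end, strand in hits:
    print(f"cluster found at {start}..{end} on strand {strand}")
```

In practice you would read the genome and cluster sequences from your FASTA/.gbk files (for example with Biopython) instead of hard-coding them.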


Hey @Fatima!

Thanks for following up. I recently saw a post similar to this one. I used the MCL algorithm to obtain my clusters, and I have a separate .gbk file for each gene cluster. I basically just want to annotate these back onto my reference genomes (which I have already run an alignment on).

What I have:

1) Original Genome Files (Can be called my reference genomes)

2) Gene Clusters (.gbk files obtained from reference genomes with MCL)

What I want:

Annotate where these gene clusters are on my genomes (I want to see the location of each gene cluster on my genomes). I guess I could possibly use Prokka for this, but I'm not sure how to run it from the command line (I am working on a remote server with conda).

Thanks :)


Hi :)

I haven't used Prokka, but I think you can install it using:

conda install -c conda-forge -c bioconda -c defaults prokka

https://github.com/tseemann/prokka/blob/master/README.md
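
Once it is installed, a run could look roughly like the sketch below, which only builds and prints the command line. According to the Prokka README, --outdir and --prefix set the output location, and --proteins takes a FASTA or GBK file of trusted proteins to use as the first-priority annotation source. The file names and the helper function are hypothetical. Note that this should label genes that match your cluster proteins, but it will not mark cluster boundaries by itself.

```python
import shlex


def build_prokka_command(genome_fasta, clusters_gbk, outdir, prefix):
    """Assemble a Prokka command line as a list of arguments.

    All file names passed in here are placeholders for your own files.
    """
    return [
        "prokka",
        "--outdir", outdir,          # directory Prokka writes results to
        "--prefix", prefix,          # base name for the output files
        "--proteins", clusters_gbk,  # trusted proteins searched first
        genome_fasta,                # genome (contigs) to annotate
    ]


cmd = build_prokka_command(
    genome_fasta="genome1.fasta",    # hypothetical file name
    clusters_gbk="clusters.gbk",     # hypothetical file name
    outdir="prokka_genome1",
    prefix="genome1",
)

# Print the command so it can be checked and pasted into a shell.
# To execute it from Python instead: subprocess.run(cmd, check=True)
print(shlex.join(cmd))
```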

However, I have used FragGeneScan, so if you decide to use it I can help. See this post for more information:

C: Finding gene matches using sequence reads?

Input: Reference Genomes

Output: Predicted gene coordinates in .out and .gff format. Gene sequences in ffn format. Protein sequences in faa format.

Sample gene coordinates (.gff):

https://omics.informatics.indiana.edu/FragGeneScan/result/genome/NC_000913.gff
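
If it helps, here is a minimal sketch of how the predicted gene coordinates in a GFF file could be read and filtered by region in plain Python. It assumes the standard nine tab-separated GFF columns; the sample records, the "genome1" sequence name, and the function names are invented for illustration and are not real FragGeneScan output.

```python
def parse_gff(text):
    """Parse GFF text into a list of dicts with basic gene coordinates."""
    genes = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a complete GFF record
        genes.append({
            "seqid": cols[0],
            "type": cols[2],
            "start": int(cols[3]),   # 1-based, inclusive
            "end": int(cols[4]),
            "strand": cols[6],
            "attributes": cols[8],
        })
    return genes


def genes_in_region(genes, start, end):
    """Return genes lying entirely inside the region start..end."""
    return [g for g in genes if g["start"] >= start and g["end"] <= end]


# Made-up example records (not real FragGeneScan output).
sample_gff = "\n".join([
    "##gff-version 3",
    "\t".join(["genome1", "FGS", "CDS", "120", "980", ".", "+", "0", "ID=gene_1"]),
    "\t".join(["genome1", "FGS", "CDS", "1010", "2500", ".", "-", "0", "ID=gene_2"]),
    "\t".join(["genome1", "FGS", "CDS", "5200", "6100", ".", "+", "0", "ID=gene_3"]),
])

genes = parse_gff(sample_gff)
inside = genes_in_region(genes, 100, 3000)
print([g["attributes"] for g in inside])
```

With the coordinates of a gene cluster in hand, genes_in_region would tell you which predicted genes fall inside it.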
