Pseudogenes in a bacterial genome
6.3 years ago
agata88 ▴ 870

Hi all!

I have a problem defining pseudogenes in a bacterial genome. I defined a pseudogene as an additional copy of a gene in the genome.

Because my genome is bacterial, there are no introns, so I treated every repeated annotation of the same gene as an extra copy, i.e. a pseudogene. I have 2000 unique genes, and 454 of them are repeated at least once. Going this way I found around 1000 pseudogenes. Compared to other related species this number is huge, which is why I am suspicious of my results.

So my questions are:

* Which of the defined pseudogenes actually represent functional genes? How can I find them?

* This may be a stupid question, but: is it possible to have two identically annotated genes separated by a few nucleotides in a bacterial genome (one next to the other, with a gap)? If yes, is it one gene, or a gene and its pseudogene? Example below:

gene A_1-AGTCTATGTA-gene A_2

Many thanks for any suggestion.

Best, Agata

I would disagree with your definition of a pseudogene. I think a pseudogene is a gene copy that has been duplicated and also deactivated through mutation. It may be lost in some future generation, or conserved for structural reasons, but it should not generate proteins. Generating a protein would promote it to a full gene. So look for deleted start codons, damaged regulatory elements, and evolutionary conservation.

Otherwise, you're looking at duplications that may well be functional. Perhaps that bacterium benefits from doubled expression of that protein, so it has two copies in succession. Those are not pseudogenes at all.
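
The structural checks suggested above can be roughed out in a few lines. This is only a minimal sketch: `cds_defects` is a hypothetical helper, it assumes the bacterial genetic code (NCBI translation table 11) with only the most common start codons, and it says nothing about regulatory elements or conservation, which need separate analysis.

```python
# Hypothetical helper, not part of Prokka or any other tool.
# Assumption: bacterial genetic code (NCBI translation table 11).
START_CODONS = {"ATG", "GTG", "TTG"}  # most common table-11 starts only
STOP_CODONS = {"TAA", "TAG", "TGA"}


def cds_defects(seq):
    """Return a list of reasons why `seq` does not look like an intact CDS."""
    seq = seq.upper()
    defects = []
    if len(seq) % 3 != 0:
        defects.append("length not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[0] not in START_CODONS:
        defects.append("missing start codon")
    if not codons or codons[-1] not in STOP_CODONS:
        defects.append("missing stop codon")
    if any(c in STOP_CODONS for c in codons[1:-1]):
        defects.append("internal stop codon")
    return defects
```

An empty list only means that no obvious defect was found in the coding sequence itself; it does not prove the copy is expressed or functional.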

IMO any ORF that is never transcribed to mRNA can be described as a pseudogene. It doesn't have to exist as multiple copies or anything.

So, if one gene occurs 10 times in the genome, on different contigs, can all of the copies be functional genes?

Absolutely.

Yes, but if real, I would guess it is a transposase or something similar. Did you try to annotate the duplicated genes?

Yes, I annotated with Prokka.

Is this a genome that you have assembled yourself? Could these be assembly artifacts?

I don't think these are assembly artifacts. For de novo assembly I used SPAdes, and for artifact removal I used blastn against a genus-specific nt database to select the contigs of interest.

Is the genome a closed single circle? If not, then you don't have a complete genome/assembly, and it is still subject to further refinement.

6.3 years ago
h.mon 35k

Edit: is the genome in this question the same one you referred to as contaminated in this post: A: Prokka bacteria genome annotation ? If yes, then you need to re-evaluate your contamination removal; everything suggests it didn't do a proper job.

Regarding your specific questions:

Which one of defined pseudogenes represent gene and have functionality? How can I find them?

As karl.stamm stated above, you have to look at the gene structure to investigate the question.

Is it possible to have two same annotated genes divided by nucleotides in bacteria genome (one next to another with break)? If yes, is it one gene or gene and its pseudogene?

Yes, it is possible. To define if both are functional or not, you have to resort to the answer to your first question.
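
To list such neighbouring pairs for closer inspection, something like the following could be used. It is a sketch under stated assumptions: the feature tuples are a hypothetical format you would fill from your own annotation table, and the `_1`, `_2` suffixes are assumed to mark repeated gene names as in the `gene A_1` / `gene A_2` example from the question. A flagged pair might be a single gene split by a frameshift or premature stop, a sequencing or assembly error, or a true tandem duplication; the script cannot tell which.

```python
import re


def base_name(gene):
    """Strip a trailing numeric suffix such as '_1' (assumed copy suffix)."""
    return re.sub(r"_\d+$", "", gene)


def adjacent_same_name(features, max_gap=50):
    """Find neighbouring features that share a base gene name.

    features: iterable of (contig, start, end, strand, gene) tuples with
    1-based inclusive coordinates (hypothetical input format).
    Returns a list of (gene_a, gene_b, gap_in_nt).
    """
    ordered = sorted(features, key=lambda f: (f[0], f[1]))
    pairs = []
    for a, b in zip(ordered, ordered[1:]):
        same_place = a[0] == b[0] and a[3] == b[3]  # same contig and strand
        gap = b[1] - a[2] - 1                       # nucleotides between them
        if same_place and base_name(a[4]) == base_name(b[4]) and 0 <= gap <= max_gap:
            pairs.append((a[4], b[4], gap))
    return pairs
```

The `max_gap` default of 50 nt is arbitrary; each reported pair still needs the structural checks from the first question.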

Now for your genome in particular: how did you detect the duplicated genes? How did you assemble and annotate the genomes? Do you have good sequencing coverage and a good assembly? Did you sequence a bacterial isolate?

Because, indeed, the number of duplicated genes you found is large, and it makes me suspect an analysis artifact rather than truly duplicated genes.
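
One quick sanity check on coverage, since SPAdes was used for the assembly: SPAdes writes the k-mer coverage of each contig into its FASTA header (names like `NODE_1_length_250000_cov_35.2`). The sketch below flags contigs whose coverage is far from the median. The function name and the 3-fold threshold are arbitrary assumptions, and the median is not weighted by contig length. Odd coverage is only a hint: contaminants, plasmids, and collapsed repeats can all stand out this way.

```python
import re
from statistics import median

# SPAdes contig header pattern, e.g. ">NODE_1_length_250000_cov_35.2"
HEADER = re.compile(r"NODE_\d+_length_\d+_cov_([\d.]+)")


def coverage_outliers(headers, fold=3.0):
    """Return names of contigs whose coverage differs from the median
    coverage by more than `fold`-fold in either direction."""
    parsed = []
    for h in headers:
        m = HEADER.search(h)
        if m:
            name = h.lstrip(">").split()[0]
            parsed.append((name, float(m.group(1))))
    if not parsed:
        return []
    med = median(cov for _, cov in parsed)
    return [name for name, cov in parsed if cov > med * fold or cov < med / fold]
```

The headers can be collected from the contigs FASTA by keeping the lines that start with `>`.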

Yes, I still have contamination in the data... ehh... looks like it is not a trivial task. Thank you all for your help. I need to reanalyze it.

4.8 years ago
Fatima ▴ 1000

You might try your method on https://www.ncbi.nlm.nih.gov/nuccore/AL450380.1, then download its GFF3 file, count the /pseudo or pseudogene annotations, and compare.
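
Counting the annotated pseudogenes in a GFF3 file could look roughly like this. It is a sketch, and the conventions are assumptions: it counts features of type `pseudogene` plus `gene` features carrying a `pseudo=true` attribute, and ignores child features such as CDS so the same locus is not counted twice. Check a few lines of the actual file first, because the way pseudogenes are marked varies between annotation pipelines.

```python
def count_pseudo(gff3_lines):
    """Count gene-level features marked as pseudogenes in GFF3 text lines."""
    n = 0
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more features
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        ftype, attrs = cols[2], cols[8]
        attr = dict(kv.split("=", 1) for kv in attrs.split(";") if "=" in kv)
        if ftype == "pseudogene" or (ftype == "gene" and attr.get("pseudo") == "true"):
            n += 1
    return n


# Usage, with a hypothetical file name:
# with open("AL450380.1.gff3") as handle:
#     print(count_pseudo(handle))
```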
