Question

Comparing three genomes?

0

10.0 years ago

kxd419 ▴ 10

Hello,

I have sequenced a bacterial genome. I want to draw a Venn diagram comparing it to two other, already sequenced genomes. Something like this: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2953697/figure/f2/

However, I am unsure how to go about it. Should I use my genome as a BLAST database and then BLAST the two known genomes against it? If so, do I then take the genes present and absent in both known genomes and BLAST them against each other?

Kind regards,
KXD

gene genome blast • 3.8k views

updated 2.8 years ago by Ram 45k • written 10.0 years ago by kxd419 ▴ 10

score 2 · Answer 1 · 2015-04-29

2

10.0 years ago

5heikki 11k

One option would be to query the proteins of the three genomes against e.g. Pfam (with HMMER) and then extract the number of shared features between the proteomes from the results. Also, the paper's text or Materials and Methods section may say what the authors actually did there.
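A minimal sketch of the counting step, assuming each proteome was searched with `hmmscan --tblout` and that the second whitespace-separated column of that table holds the Pfam accession. The function names and the toy table lines (including the accession version suffixes and scores) are made up for illustration.

```python
def pfam_accessions(tblout_lines):
    """Collect the Pfam accessions found in one hmmscan --tblout table."""
    accessions = set()
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        # Assumed layout: target name, target accession, query name, ...
        accessions.add(fields[1].split(".")[0])  # drop the version suffix
    return accessions


def venn3_counts(a, b, c):
    """Return the sizes of the seven regions of a three-set Venn diagram."""
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A&B only": len((a & b) - c),
        "A&C only": len((a & c) - b),
        "B&C only": len((b & c) - a),
        "A&B&C": len(a & b & c),
    }


# Toy tables, one per genome (hypothetical hits, truncated columns).
genome_a = [
    "# target name  accession  query name",
    "ABC_tran   PF00005.30 protA1 - 1e-30 100.0 0.1",
    "HATPase_c  PF02518.29 protA2 - 1e-20 70.0 0.0",
]
genome_b = [
    "ABC_tran      PF00005.30 protB1 - 1e-28 95.0 0.1",
    "Response_reg  PF00072.27 protB2 - 1e-25 85.0 0.0",
]
genome_c = [
    "ABC_tran      PF00005.30 protC1 - 1e-31 101.0 0.1",
    "HATPase_c     PF02518.29 protC2 - 1e-19 68.0 0.0",
    "Response_reg  PF00072.27 protC3 - 1e-24 83.0 0.0",
    "MFS_1         PF07690.19 protC4 - 1e-15 55.0 0.2",
]

counts = venn3_counts(
    pfam_accessions(genome_a),
    pfam_accessions(genome_b),
    pfam_accessions(genome_c),
)
for region, n in counts.items():
    print(region, n)
```

Note that this counts shared domain families, not shared genes, so the numbers are not directly comparable to a gene-level Venn diagram.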

written 10.0 years ago by 5heikki 11k

Ram · Answer 2 · 2015-04-29

0

10.0 years ago

HG ★ 1.2k

Simple way:

1. Annotate the genomes
2. Cluster the genes (CD-HIT, OrthoMCL, ...)
3. Find the genes shared among all genomes
4. Draw a Venn diagram (maybe using http://bioinfogp.cnb.csic.es/tools/venny/)

updated 2.8 years ago by Ram 45k • written 10.0 years ago by HG ★ 1.2k

0

Hi HG,

Thanks for your reply.

The genome is annotated; however, all three genomes have different gene names.

Can you explain step two in more detail?

updated 2.8 years ago by Ram 45k • written 10.0 years ago by kxd419 ▴ 10

0

Extract all the genes from each file > run an all-vs-all comparison (with your desired cutoff value, using CD-HIT) > you will get a list of unique sequences and shared sequences > count them and plot

http://weizhongli-lab.org/cd-hit/
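A rough sketch of the counting step for three genomes, assuming the genes from all three were pooled into one FASTA file with a genome label prefixed to every sequence ID (e.g. `gA|gene1`; the labels and the `|` separator are my assumption, not something CD-HIT requires) and then clustered with CD-HIT. It reads the `.clstr` output and counts, for each combination of genomes, how many clusters contain members from exactly those genomes. The toy cluster lines are invented.

```python
import re
from collections import Counter

# Member lines in a .clstr file look like: "0<TAB>300aa, >gA|gene1... *"
MEMBER = re.compile(r">(.+?)\.\.\.")


def genomes_per_cluster(clstr_lines, sep="|"):
    """Return, for each cluster, the set of genome labels it contains."""
    clusters = []
    current = None
    for line in clstr_lines:
        line = line.strip()
        if line.startswith(">Cluster"):
            current = set()          # start a new cluster
            clusters.append(current)
        elif line and current is not None:
            match = MEMBER.search(line)
            if match:
                # Genome label is the part of the ID before the separator.
                current.add(match.group(1).split(sep)[0])
    return clusters


def count_venn_regions(clusters):
    """Count clusters by the exact combination of genomes they contain."""
    return Counter(frozenset(c) for c in clusters if c)


# Toy .clstr content (hypothetical).
toy = [
    ">Cluster 0",
    "0\t300aa, >gA|gene1... *",
    "1\t298aa, >gB|gene7... at 95.00%",
    "2\t300aa, >gC|gene3... at 97.00%",
    ">Cluster 1",
    "0\t150aa, >gA|gene2... *",
    "1\t150aa, >gA|gene9... at 99.00%",
    ">Cluster 2",
    "0\t210aa, >gB|gene4... *",
    "1\t209aa, >gC|gene8... at 92.00%",
]

regions = count_venn_regions(genomes_per_cluster(toy))
for genomes, n in regions.items():
    print(sorted(genomes), n)
```

Because clusters are counted rather than sequences, two near-identical copies in one genome (Cluster 1 above) add a single entry to that genome's region. The seven possible label combinations map directly onto the seven regions of a three-way Venn diagram.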

updated 2.8 years ago by Ram 45k • written 10.0 years ago by HG ★ 1.2k

0

I don't see this for a set of three transcriptomes; it only presents an option for comparing two nucleotide databases. Can you explain how you do this if you have three databases? Thank you.

written 8.2 years ago by Illinu ▴ 110

0

We are talking about a "bacterial genome" here, not transcriptomes!

written 8.2 years ago by HG ★ 1.2k

0

And what is the difference for you? It still consists of FASTA files with sequences, right? The question is how you do it for three sets of 'genes' (if you want) instead of two. The problem is that what you propose doesn't work when the gene names are not the same, or when a gene family is expanded in one genotype versus another. Also, you need to use reciprocal best BLAST hits, not just all-vs-all.
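For what it's worth, here is a minimal sketch of reciprocal best hits extended to three sets, assuming six BLAST result tables in the standard 12-column tabular format (`-outfmt 6`: query ID in column 1, subject ID in column 2, bit score in column 12), one per direction for each pair of genomes. The function names, the `row` helper and the toy hits are hypothetical. "Best" here simply means highest bit score, which is only one possible choice.

```python
def best_hits(blast_lines):
    """Map each query ID to the subject ID with the highest bit score."""
    best = {}
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or empty lines
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(x_vs_y, y_vs_x):
    """Pairs (x, y) where x's best hit is y and y's best hit is x."""
    forward = best_hits(x_vs_y)
    reverse = best_hits(y_vs_x)
    return {(x, y) for x, y in forward.items() if reverse.get(y) == x}


def core_triplets(ab, ac, bc):
    """Triplets (a, b, c) that are reciprocal best hits in all three pairs."""
    b_of, c_of = dict(ab), dict(ac)
    return {
        (a, b_of[a], c_of[a])
        for a in b_of.keys() & c_of.keys()
        if (b_of[a], c_of[a]) in bc
    }


def row(query, subject, bitscore):
    """Build one toy outfmt-6 line; only IDs and bit score matter here."""
    filler = ["90.0", "100", "0", "0", "1", "100", "1", "100", "1e-50"]
    return "\t".join([query, subject] + filler + [str(bitscore)])


# Toy hits between genomes A, B and C (invented).
a_vs_b = [row("a1", "b1", 200), row("a1", "b2", 80), row("a2", "b2", 150)]
b_vs_a = [row("b1", "a1", 200), row("b2", "a2", 150)]
a_vs_c = [row("a1", "c1", 180), row("a2", "c9", 50)]
c_vs_a = [row("c1", "a1", 180), row("c9", "a1", 60)]
b_vs_c = [row("b1", "c1", 190), row("b2", "c9", 40)]
c_vs_b = [row("c1", "b1", 190), row("c9", "b2", 40)]

ab = reciprocal_best_hits(a_vs_b, b_vs_a)
ac = reciprocal_best_hits(a_vs_c, c_vs_a)
bc = reciprocal_best_hits(b_vs_c, c_vs_b)
core = core_triplets(ab, ac, bc)

print("shared by all three:", sorted(core))
print("A-B pairs outside the core:",
      sorted(p for p in ab if p[0] not in {t[0] for t in core}))
```

Since the matching works on sequence IDs from the BLAST tables, differing gene names in the annotations do not matter. Paralogs are not handled: each gene ends up with at most one partner per genome.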

written 8.2 years ago by Illinu ▴ 110