Question

How to see sequences in CAFE analysis gene families?

7.6 years ago

nancydong20 ▴ 130

Hello everyone!

I have started to learn CAFE analysis. From what I can see, the output of CAFE labels the gene families with numbers only. There appears to be no file telling me which sequences are clustered in each gene family, which I would need in order to go back and analyse the identities of the significantly expanded/contracted families. I'm assuming that MCL is responsible for clustering the gene families, and that this is why the CAFE tutorial I am following (http://evomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/2016/06/cafe_tutorial-1.pdf) does not cover how to see which sequences belong to which family. Is that correct?

Thank you very much!

CAFE Gene families • 2.7k views

updated 7.6 years ago by Ben Fulton ▴ 150 • written 7.6 years ago by nancydong20 ▴ 130

score 4 · Accepted Answer · 2017-05-05

Yes, you have to go back to MCL (or whatever clustering program you used) to see which genes are in which family. There is a slightly newer tutorial with supporting files that you can access from the CAFE web site. If you look at the MCL dump file there, you should see the information you need. If you're using your own data and you don't have that file, you'll have to re-run BLAST and MCL to find which sequences belong to which family.

Note that you can still tell which families are expanding and contracting without knowing the identity of the sequences inside. In other words, the families themselves can be identified without the dump file, but their individual members cannot.
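
To look up the members of a family from the MCL dump file, something like the following might work. It is only a minimal sketch and rests on two assumptions that you should check against your own pipeline: that the dump file has one cluster per line with whitespace-separated sequence names, and that the numeric family IDs in your CAFE input were assigned in the same order as the lines of the dump, starting at 1. The function name `read_mcl_dump`, the file name `dump.mcl` and the sequence names are made up for illustration.

```python
def read_mcl_dump(lines, first_id=1):
    """Map family ID -> list of sequence names.

    Assumes one MCL cluster per line and that family IDs follow the
    line order of the dump, starting at first_id (an assumption; check
    how your own CAFE input file was numbered).
    """
    families = {}
    for family_id, line in enumerate(lines, start=first_id):
        members = line.split()
        if not members:
            # Skip empty lines but keep IDs aligned with line numbers.
            continue
        families[family_id] = members
    return families


# Small in-memory example with hypothetical sequence names.
example_dump = [
    "speciesA|gene1\tspeciesB|gene7\tspeciesC|gene3",
    "speciesA|gene2\tspeciesA|gene5",
]
families = read_mcl_dump(example_dump)
print(families[2])

# With a real file (hypothetical name):
# with open("dump.mcl") as handle:
#     families = read_mcl_dump(handle)
```

Once you have the dictionary, you can look up each family that CAFE reports as significantly expanded or contracted and pull out its sequences for annotation.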