Question

Downloading sequences and their annotations for generating a sequence similarity network in Cytoscape

8.6 years ago

ehzed ▴ 40

Hello,

I need to generate a sequence similarity network from proteins with a similar domain architecture (e.g., all sequences with four DNA-binding domains in the N-terminal region). While I can find all the sequences with similar domain architectures in InterPro or Pfam, I am having trouble linking them to their annotations in a systematic, automated way. With thousands of sequences, I know I need to write some sort of script to do this. But first I need some idea of where to get gene annotations from, how to associate them with each sequence, and what format to use for storing the annotations alongside the sequences (tab-delimited?). I've also read a couple of websites and a paper that say everything needs to be in XGMML format to be imported into Cytoscape, but I have found very little documentation on how to generate XGMML. Could anyone give me some general directions (databases to download sequences from, how to organize annotations, etc.)? Thank you!

These are two of the references I've looked at so far: http://enzymefunction.org/resources/tutorials/efi-and-cytoscape3 http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0004345
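
To make the target concrete, here is a minimal sketch of the kind of script in question, using only the Python standard library. It assumes a tab-delimited annotation table keyed by an `id` column and a list of pairwise similarity scores (for example, parsed from all-vs-all BLAST output). The sequence IDs, column names, scores, and the helper names `read_annotations` and `build_xgmml` are all hypothetical and only illustrate the layout: one `<node>` per sequence with its annotations as `<att>` children, and one `<edge>` per similar pair.

```python
import csv
import io
import xml.etree.ElementTree as ET

XGMML_NS = "http://www.cs.rpi.edu/XGMML"  # namespace used by XGMML files


def read_annotations(handle):
    """Read a tab-delimited table with an 'id' column into {id: {column: value}}."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["id"]: {key: value for key, value in row.items() if key != "id"}
        for row in reader
    }


def build_xgmml(annotations, edges, label="sequence similarity network"):
    """Return an XGMML string: one node per sequence, one edge per scored pair."""
    graph = ET.Element("graph", {"label": label, "xmlns": XGMML_NS, "directed": "0"})
    for seq_id, attrs in annotations.items():
        node = ET.SubElement(graph, "node", {"id": seq_id, "label": seq_id})
        for name, value in attrs.items():
            # Every annotation column becomes a string attribute on the node.
            ET.SubElement(node, "att", {"name": name, "type": "string", "value": value})
    for source, target, score in edges:
        # Skip pairs that mention a sequence with no annotation row.
        if source not in annotations or target not in annotations:
            continue
        edge = ET.SubElement(
            graph, "edge",
            {"source": source, "target": target, "label": f"{source} - {target}"},
        )
        ET.SubElement(edge, "att", {"name": "score", "type": "real", "value": str(score)})
    return ET.tostring(graph, encoding="unicode")


# Made-up example data (hypothetical IDs, annotations, and scores).
ANNOTATION_TSV = (
    "id\torganism\tdescription\n"
    "seqA\tOrganism one\tputative transcription factor\n"
    "seqB\tOrganism two\tDNA-binding protein\n"
    "seqC\tOrganism three\tuncharacterized protein\n"
)
annotations = read_annotations(io.StringIO(ANNOTATION_TSV))
edges = [("seqA", "seqB", 62.5), ("seqB", "seqC", 48.0), ("seqA", "seqX", 90.0)]

xml_text = build_xgmml(annotations, edges)
print(xml_text)
```

Writing `xml_text` to a file with an `.xgmml` extension should give something Cytoscape can import as a network, with the annotation columns showing up as node attributes. Whether this matches what the tutorials above produce is an open question.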

alignment sequence • 1.5k views

updated 8.6 years ago by Biostar 20 • written 8.6 years ago by ehzed ▴ 40