10.8 years ago
Steve Barratt
(My first Q on biostar, forgive any transgressions please!)
I would like to BLAST a protein (amino acid) sequence and filter my results to keep only the top hit from each organism. I'm keen to use R or Python to accomplish this if they're sensible choices.
Thanks
What's your output? Does it contain the names of the organisms?
According to this page: http://blastedbio.blogspot.co.uk/2012/05/blast-tabular-missing-descriptions.html
you can get tabular output from blast+ that includes "Subject Taxonomy ID(s), Subject Scientific Name(s) and Subject Common Name(s)".
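Once the tabular output carries a taxonomy column, keeping one hit per organism is a small grouping step. Here is a rough Python sketch of that idea, not a tested pipeline. It assumes the BLAST run used `-outfmt "6 qseqid sseqid pident evalue bitscore staxids sscinames"` (so the column order below is an assumption that must match your own format string), and it takes "top" to mean highest bit score. The function name and the three example rows are made up for illustration.

```python
import csv
import io

# Assumed column order; it must match the -outfmt string given to blast+.
COLUMNS = ["qseqid", "sseqid", "pident", "evalue", "bitscore", "staxids", "sscinames"]


def top_hit_per_organism(handle):
    """Return {taxid: row}, keeping the highest-bitscore hit per subject taxonomy ID."""
    best = {}
    reader = csv.DictReader(
        handle, fieldnames=COLUMNS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        # Skip comment lines (present if -outfmt 7 was used instead of 6).
        if row["qseqid"].startswith("#"):
            continue
        taxid = row["staxids"]
        score = float(row["bitscore"])
        if taxid not in best or score > float(best[taxid]["bitscore"]):
            best[taxid] = row
    return best


# Made-up rows, only to show the shape of the input.
example = (
    "q1\tsp|A\t90.0\t1e-40\t150\t9606\tHomo sapiens\n"
    "q1\tsp|B\t98.0\t1e-60\t220\t9606\tHomo sapiens\n"
    "q1\tsp|C\t85.0\t1e-30\t120\t10090\tMus musculus\n"
)

hits = top_hit_per_organism(io.StringIO(example))
for taxid, row in hits.items():
    print(taxid, row["sscinames"], row["sseqid"], row["bitscore"])
```

For a real run you would pass an open file handle instead of the `io.StringIO` example. Note that a subject can list several taxonomy IDs separated by `;`, and this sketch treats such a combined string as its own group.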
I am sort of asking two questions really: firstly, does something exist to accomplish what I'm after 'off the shelf'? And if not, were I to 'make something', what should my general (or exact!) approach be?
Hi Steve, this may not be as complete or clear a reply as you were looking for, but I recently wrote a short, informal web doc demonstrating two ways to run a BLAST search through R. To my mind, the more flexible and useful is the second, which basically calls the blast+ command line from R. While my example demonstrates a nucleotide BLAST, it would not be difficult for you to alter some arguments to do what you want.
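The same "call blast+ from a script" idea carries over to Python. The sketch below is not the code from the doc mentioned above; it only shows how a `blastp` command for a protein query might be assembled and run with `subprocess`. It assumes blast+ is installed and on the PATH, and the helper names, file names and database name are placeholders. The `staxids` and `sscinames` columns are only filled in when the database carries taxonomy information (and `sscinames` also needs the BLAST taxonomy database files to be available).

```python
import subprocess

# Tabular format with taxonomy columns; adjust to the fields you need.
OUTFMT = "6 qseqid sseqid pident evalue bitscore staxids sscinames"


def build_blastp_command(query_fasta, database, out_path):
    """Assemble a blastp command line as an argument list (nothing is run here)."""
    return [
        "blastp",
        "-query", query_fasta,
        "-db", database,
        "-outfmt", OUTFMT,
        "-out", out_path,
    ]


def run_blastp(query_fasta, database, out_path):
    """Run blastp and raise CalledProcessError if it exits with an error."""
    cmd = build_blastp_command(query_fasta, database, out_path)
    subprocess.run(cmd, check=True)
    return out_path


# Placeholder file and database names, for illustration only.
cmd = build_blastp_command("query.fasta", "my_protein_db", "hits.tsv")
print(" ".join(cmd))
```

Passing the command as a list keeps the whole `-outfmt` string as a single argument, so no shell quoting is needed. Calling `run_blastp(...)` with real paths would write the tabular hits to the output file, ready for filtering.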
Thanks KKeenan02. I just skimmed your doc and it's really useful. Bookmarked!