Core proteome using local Blast

5.2 years ago

zarodkip • 0

Hello all,

I am trying to establish the core proteome of A. baumannii, i.e. the proteins that all of the strains have in common. I have multiple .fasta proteome files.

What would be the most appropriate way of going about this? Would this do the trick?

blastp -query query.fasta -db db -out output.txt -outfmt "6 qseqid qlen sseqid salltitles pident mismatch gapopen qstart qend qcovs sstart send evalue bitscore" -evalue 0.00001 -max_target_seqs 5 -num_threads 4

Also, is there a way to BLAST one proteome against all of my other proteomes at once, rather than one proteome against one proteome at a time?

I should add that I am super new to local BLAST and to any kind of coding.

blast blastp proteome core local • 1.2k views

Updated 5.2 years ago by h.mon 35k • written 5.2 years ago by zarodkip • 0

What format is your data in? Multi-fasta protein sequence files, one per strain? If these are very similar strains (and your dataset is reasonably complete in each case), then you may be able to use CD-HIT to cluster the proteins into a non-redundant set. The clusters that contain a sequence from every strain would then approximate the core proteome.
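As a rough illustration of that last step, here is a minimal Python sketch that reads a CD-HIT `.clstr` file and keeps only the clusters with at least one member from every strain. It assumes, purely for the example, that each sequence ID starts with a strain name followed by `|` (for instance `strainA|prot1`) and that the IDs were not truncated in the cluster file. The function names and the example data are hypothetical, not part of CD-HIT.

```python
from collections import defaultdict


def parse_clstr(lines):
    """Return {cluster_number: [sequence_id, ...]} from CD-HIT .clstr lines."""
    clusters = defaultdict(list)
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = line.split()[1]
        elif current is not None and ">" in line:
            # Member lines look like: "0<TAB>250aa, >strainA|prot1... *"
            seq_id = line.split(">", 1)[1].split("...", 1)[0]
            clusters[current].append(seq_id)
    return dict(clusters)


def core_clusters(clusters, strains, sep="|"):
    """Keep clusters holding at least one sequence from every strain.

    Assumes the strain name is the part of the ID before `sep`.
    """
    wanted = set(strains)
    core = {}
    for cluster_id, members in clusters.items():
        present = {member.split(sep, 1)[0] for member in members}
        if wanted <= present:
            core[cluster_id] = members
    return core


# Made-up example data in .clstr layout (three strains, two clusters).
example = """>Cluster 0
0\t250aa, >strainA|prot1... *
1\t250aa, >strainB|prot7... at 98.00%
2\t249aa, >strainC|prot3... at 97.50%
>Cluster 1
0\t120aa, >strainA|prot2... *
1\t120aa, >strainB|prot9... at 99.00%
""".splitlines()

clusters = parse_clstr(example)
core = core_clusters(clusters, ["strainA", "strainB", "strainC"])
print(sorted(core))
```

On a real file you would pass `open("all_proteomes.clstr")` (a hypothetical file name) to `parse_clstr` instead of the example lines. Whether a cluster counts as "core" depends heavily on the identity threshold chosen for CD-HIT, so treat the result as a first approximation.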

5.2 years ago by GenoMax 152k

Yes, they are multi-fasta sequence files one per strain. I'll look into CD-HIT. Thank you.

5.2 years ago by zarodkip • 0

I recommend hmmscan (or was it one of the other hmm-something programs?) from HMMER, run against Pfam. The results are far easier to interpret this way.
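To make the idea concrete, here is a small Python sketch that takes the tabular output of `hmmscan --tblout` for each strain and intersects the Pfam families found. It assumes one table per strain, uses an arbitrary E-value cutoff of 1e-5, and works on made-up example rows; the function names are hypothetical. Note that this gives the domain families shared by all strains, which is related to, but not the same as, a list of shared proteins.

```python
def families_in_tblout(lines, max_evalue=1e-5):
    """Collect Pfam accessions from hmmscan --tblout lines under an E-value cutoff."""
    families = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the commented header/footer
        fields = line.split()
        # Column 2 is the model accession, column 5 the full-sequence E-value.
        accession, evalue = fields[1], float(fields[4])
        if evalue <= max_evalue:
            families.add(accession)
    return families


def core_families(per_strain):
    """Return the families present in every strain's set."""
    sets = list(per_strain.values())
    return set.intersection(*sets) if sets else set()


# Made-up rows in --tblout layout, one list per strain.
strain_a = [
    "# target name accession query name ...",
    "ABC_tran PF00005.30 protA1 - 1.2e-30 105.1 0.0 1.5e-30 104.8 0.0 1.1 1 0 0 1 1 1 1 ABC transporter",
    "Response_reg PF00072.27 protA2 - 3.0e-20 70.2 0.1 4.1e-20 69.8 0.1 1.2 1 0 0 1 1 1 1 Response regulator receiver domain",
]
strain_b = [
    "ABC_tran PF00005.30 protB5 - 2.0e-28 98.0 0.0 2.6e-28 97.6 0.0 1.1 1 0 0 1 1 1 1 ABC transporter",
    "Response_reg PF00072.27 protB8 - 0.5 5.0 0.0 0.7 4.5 0.0 1.0 1 0 0 1 1 0 0 Response regulator receiver domain",
]

per_strain = {
    "strainA": families_in_tblout(strain_a),
    "strainB": families_in_tblout(strain_b),
}
shared = core_families(per_strain)
print(sorted(shared))
```

In the example, the second hit in strain B fails the cutoff, so only one family ends up shared between the two strains.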

5.2 years ago by 5heikki 11k
