I don't think there's a simple answer to this, and I'd encourage you to think about what you actually want to know about the protein. It's all well and good to say that the pI of your capsid protein is x, but is that useful? Do you care about minute fluctuations in pI, which you would almost certainly lose by taking a consensus?
A consensus is useful in certain situations but not others.
pI:
PROPKA can give you a pI, though note that it works from a structure (a PDB file) rather than a bare sequence. With that many sequences, you might want to run just a consensus through it, but I'd say this only makes sense if your 2000 sequences are already pretty similar.
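If you want a quick feel for how much pI actually varies across your sequences before deciding on a consensus, a rough sequence-only estimate is easy to sketch in plain Python. This is not what PROPKA does (it ignores structure entirely), and the pKa values below are approximate, assumed numbers; published scales differ slightly. The function names are mine.

```python
# Approximate pKa values (ASSUMED for illustration; published scales differ).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERM = 9.0
PKA_C_TERM = 2.0


def net_charge(seq, ph):
    """Henderson-Hasselbalch net charge of an unfolded chain at a given pH."""
    seq = seq.upper()
    # Basic groups carry +1 when protonated: fraction = 1 / (1 + 10^(pH - pKa))
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    for aa, pka in PKA_POSITIVE.items():
        charge += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    # Acidic groups carry -1 when deprotonated: fraction = 1 / (1 + 10^(pKa - pH))
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa, pka in PKA_NEGATIVE.items():
        charge -= seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return charge


def estimate_pi(seq, tol=1e-4):
    """Bisect on pH (0 to 14) until the net charge crosses zero."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid  # still positive, so the pI is at a higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical sequences, just to show the per-sequence spread
sequences = ["MKKLRDE", "MKKLRDD", "MEKLRDE"]
pis = [estimate_pi(s) for s in sequences]
print("min pI %.2f, max pI %.2f" % (min(pis), max(pis)))
```

If the spread across your 2000 sequences is tiny, a single consensus value is probably fine; if not, the distribution itself is likely the more interesting result.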
GRAVY etc:
I believe CodonW will output a lot of parameters, including GRAVY and AA composition, but I'm not super familiar with the program.
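For what it's worth, GRAVY and amino acid composition are simple enough to compute yourself if you'd rather not depend on a particular program. The sketch below uses the Kyte-Doolittle hydropathy scale and averages it over the sequence; the function names are mine, and I'm assuming gaps and non-standard residues should simply be skipped.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standard_residues(seq):
    """Keep only the 20 standard residues (drops gaps, X, etc.)."""
    return [aa for aa in seq.upper() if aa in KYTE_DOOLITTLE]


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    residues = standard_residues(seq)
    if not residues:
        raise ValueError("no standard residues in sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def aa_composition(seq):
    """Fraction of each standard residue present in the sequence."""
    residues = standard_residues(seq)
    if not residues:
        raise ValueError("no standard residues in sequence")
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}


# Hypothetical sequence for illustration
print(gravy("MKVLAAGIV"))
print(aa_composition("MKVLAAGIV"))
```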
Consensus sequences:
Consensus sequences can be obtained with BioPython (I've actually been writing this myself recently). It's what they call a 'dumb' consensus though; it doesn't have many configurable parameters. See http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc301
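import os
from Bio import AlignIO
from Bio.Align import AlignInfo
# '~' is not expanded automatically when the file is opened, so expand it explicitly
path = os.path.expanduser('~/path/to/alignment.aln')
alignment = AlignIO.read(path, 'format') # where format is a supported type (see BioPython docs)
summary_align = AlignInfo.SummaryInfo(alignment)
# per-column consensus; positions without a clear majority become ambiguous
consensus = summary_align.dumb_consensus()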
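If you want to see roughly what a threshold-style consensus does, or need more control over it, here's a plain-Python sketch. To be clear, this is my own stand-in, not Biopython's implementation: the function name, the 0.7 default threshold, the 'X' ambiguity character, and the tie handling are all assumptions chosen for illustration.

```python
from collections import Counter


def threshold_consensus(aligned_seqs, threshold=0.7, ambiguous="X", gap_chars="-."):
    """Per-column majority consensus over equal-length aligned sequences."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_seqs):
        # Count residues in this column, ignoring gap characters
        counts = Counter(res for res in column if res not in gap_chars)
        if not counts:
            consensus.append(ambiguous)  # column is all gaps
            continue
        ranked = counts.most_common(2)
        top_res, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        # Accept the top residue only if it is unique and frequent enough
        if not tied and top_count / sum(counts.values()) >= threshold:
            consensus.append(top_res)
        else:
            consensus.append(ambiguous)
    return "".join(consensus)


# Hypothetical toy alignment
toy = ["MKV", "MKV", "MRV", "MKL"]
print(threshold_consensus(toy))
print(threshold_consensus(toy, threshold=0.8))
```

Raising the threshold turns more variable columns into the ambiguity character, which is a quick way to see how much information the consensus is throwing away.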
Yes! I usually use this when analyzing physical properties of proteins...