Hi,
I am interested in modeling a structure for my protein of interest, but no good templates were found in the database (<30% sequence identity). I tried Modeller, but it found no templates either.
Thus, I changed the e-value threshold of a BLAST search against the PDB (not sure if this is appropriate) to obtain more structure hits. I took the sequences of the resulting templates in FASTA format and pasted them into a file for Modeller called pdb_95.fasta.
The query sequence was pasted into a file called query.fasta. I altered the Python script within the Modeller software (search_templates.py) to accept FASTA format instead of PIR format and to write the results in FASTA format as well, as shown below:
from modeller import *
log.verbose()
env = environ()
#-- Read in the sequence database
sdb = sequence_db(env)
sdb.read(seq_database_file='pdb_95.fasta', seq_database_format='FASTA',
         chains_list='ALL', minmax_db_seq_len=(30, 4000), clean_sequences=True)
#-- Write the sequence database in binary form
sdb.write(seq_database_file='pdb_95.bin', seq_database_format='BINARY',
          chains_list='ALL')
#-- Now, read in the binary database
sdb.read(seq_database_file='pdb_95.bin', seq_database_format='BINARY',
         chains_list='ALL')
#-- Read in the target sequence/alignment
aln = alignment(env)
aln.append(file='query.fasta', alignment_format='FASTA', align_codes='ALL')
#-- Convert the input sequence/alignment into
# profile format
prf = aln.to_profile()
#-- Scan sequence database to pick up homologous sequences
prf.build(sdb, matrix_offset=-450, rr_file='${LIB}/blosum62.sim.mat',
          gap_penalties_1d=(-500, -50), n_prof_iterations=1,
          check_profile=False, max_aln_evalue=0.1)
#-- Write out the profile in text format
prf.write(file='build_profile.prf', profile_format='TEXT')
#-- Convert the profile back to alignment format
aln = prf.to_alignment()
#-- Write out the alignment file
aln.write(file='build_profile.fasta', alignment_format='FASTA')
When I checked build_profile.prf (generated by MODELLER), I got back only my own sequence as the result. I thought I was supposed to get back some template sequences found by Modeller?
Did I overlook something? Can someone please advise, since it's my first time using MODELLER?
Your help is very much appreciated. Thank you in advance!
Hi, thank you for your suggestions. I generated a model using the I-TASSER protein structure prediction server. Is there a way to check the accuracy of the model, and can this model be used for protein docking studies?
I-TASSER gives you a confidence score (the C-score); you should look at that for some assurance. As for accuracy for docking, it depends: is the interface well modelled? If the model is ab initio, the likelihood of it being plain wrong is quite high, and a docking run based on it would be worthless. If, for instance, the interface is well aligned with something else that you deem accurate (say, a protein with the same function as yours), then it's good to go.
Try using HHpred as a good interface for MODELLER. I don't know what the error might have been, but it might just be a problem with the format of your database. Can't you set these threshold values in the MODELLER search itself? That would be much better.
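If the database format is the suspect, one cheap check is to parse pdb_95.fasta yourself before MODELLER reads it. The snippet below is only a rough sketch in plain Python (it does not call MODELLER): it splits a FASTA file into records and reports which sequences fall outside a length window. The 30 to 4000 window is an assumption copied from minmax_db_seq_len in the script above, and the helper names read_fasta and check_database, as well as the two example records, are made up for illustration. Exactly how MODELLER applies that window may differ, so treat the output as a hint rather than a verdict.

```python
# Sanity-check a FASTA database before handing it to MODELLER.
# Assumption: the (30, 4000) window mirrors minmax_db_seq_len in the script above.

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def check_database(lines, min_len=30, max_len=4000):
    """Return (kept, dropped) header lists based on sequence length."""
    kept, dropped = [], []
    for header, seq in read_fasta(lines):
        if min_len <= len(seq) <= max_len:
            kept.append(header)
        else:
            dropped.append(header)
    return kept, dropped


# Hypothetical records, only to show the behaviour.
example = [
    ">1abcA hypothetical template",
    "M" * 120,
    ">2xyzB hypothetical short fragment",
    "MKV",
]
kept, dropped = check_database(example)
print("kept:", kept)
print("dropped:", dropped)
```

To run it on the real file, pass an open file handle instead of the example list, e.g. `with open('pdb_95.fasta') as fh: kept, dropped = check_database(fh)`. If kept comes back empty or much shorter than expected, the database file is worth a closer look before blaming the search parameters.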
@joao: Thanks for your suggestion. I am trying HHpred now and it is really helpful. =)