Hello, I'm fairly new to this and would appreciate any help or pointers to good tutorials. Please let me know if what I'm asking is even possible:
Q1) I'm trying to adapt the CRISPRicity pipeline of Shmakov et al. for a different purpose and am having trouble implementing some of the methods:
a) "The translated prokaryotic database was searched with PSI-BLAST (95) using the previously described CRISPR-Cas protein profiles." (Shmakov et al. 2018)
b) "The most complete available set of PSSMs is currently available through the CDD database [25]... and the list of these PSSMs and the correspondence between the CDD PSSMs and the respective Pfam and TIGR families also can be found in [5] and at ftp://ftp.ncbi.nih.gov/pub/wolf/_suppl/CRISPRclass/crisprPro.html." (Makarova et al. 2015)
I am using a different translated database, and I am thinking that, for each sequence I have, I should run PSI-BLAST with the precomputed PSSMs. But how do I concatenate multiple PSSMs and BLAST against the resulting single PSSM? Is that even possible?
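To make the idea concrete, here is a minimal sketch of the alternative to concatenating: keeping each PSSM as its own file and searching with them one at a time. It only builds the `psiblast` command lines and does not run anything. The directory layout, the `.smp` file suffix, the database name, and the E-value cutoff are all assumptions of mine, not something taken from the papers; `-in_pssm`, `-db`, `-evalue`, `-outfmt`, and `-out` are standard BLAST+ options.

```python
from pathlib import Path
from typing import List


def psiblast_from_pssm_cmd(pssm: Path, db: str, out_dir: Path,
                           evalue: float = 1e-4) -> List[str]:
    """Build one psiblast command that searches `db` with a single PSSM.

    -in_pssm replaces -query: the profile itself acts as the query.
    The E-value default is a placeholder, not a recommended cutoff.
    """
    hits = out_dir / f"{pssm.stem}.hits.tsv"
    return [
        "psiblast",
        "-in_pssm", str(pssm),
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", "6",          # tabular output, one hit per line
        "-out", str(hits),
    ]


def commands_for_all_profiles(pssm_dir: Path, db: str, out_dir: Path,
                              suffix: str = ".smp") -> List[List[str]]:
    """One command per PSSM file found in `pssm_dir` (suffix is an assumption)."""
    profiles = sorted(pssm_dir.glob(f"*{suffix}"))
    return [psiblast_from_pssm_cmd(p, db, out_dir) for p in profiles]
```

Each returned list could then be passed to `subprocess.run(cmd, check=True)`, so every profile gets its own hit table instead of everything being merged into one PSSM.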
Q2) Where would I find these PSSMs?
I ran an example PSI-BLAST to see what a PSSM output looks like, and it was saved as an .asn file. I can't find anything like that on this site (see attached photos). Does this mean that I should look up each JCVI/Pfam profile of interest, download its alignment from Pfam, repeat this one by one for all the CRISPR-Cas profiles I'm interested in, and then generate a new PSSM from an MSA of the alignments I collected? And from there, run PSI-BLAST on my sequence of interest with the newly generated PSSM?
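In case it helps to see what I mean by generating a PSSM from an alignment, here is a rough sketch that builds (but does not run) a `psiblast` command using `-in_msa` and `-out_pssm`, both of which are real BLAST+ options. The file names and database name are hypothetical, and I am assuming the downloaded alignment has already been converted to a format `psiblast` accepts. Depending on the BLAST+ version, an extra option may be needed for the PSSM to be written after a single search round, so `psiblast -help` is worth checking.

```python
import shlex
from pathlib import Path
from typing import List


def pssm_from_msa_cmd(msa: Path, db: str, pssm_out: Path) -> List[str]:
    """Build a psiblast command that starts from an MSA and saves a PSSM.

    -in_msa is used instead of -query (the two cannot be combined), and
    -out_pssm names the checkpoint file that holds the resulting PSSM.
    """
    return [
        "psiblast",
        "-in_msa", str(msa),
        "-db", db,
        "-out_pssm", str(pssm_out),
    ]


def as_shell_line(cmd: List[str]) -> str:
    """Render the argument list as a single copy-pasteable shell line."""
    return shlex.join(cmd)
```

The saved checkpoint file is the kind of file that a later search could take through `-in_pssm`, one alignment (and so one PSSM) per profile family.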
Thanks for your help. If there is a paper or text that already explains something like this, please send me a recommendation!
If anyone has thoughts, they would be much appreciated.