I am on Windows and would like help writing a few Bash scripts with which I can run some blastp queries. The first one is as follows: for all 6-amino-acid-long peptides, made by combining all possibilities of the 20 human AAs, I need to know whether there is any human protein that does NOT align well with one of those peptides. That means first all possible 6-AA-long peptides (made from the 20 human AAs) are enumerated. Then those peptides, in FASTA format, are submitted to NCBI's nr database (with Homo sapiens as the organism) and a blastp is run. The E-value under the blastp 'algorithm parameters' should be set as high as 20000 to be able to see all the possible alignments/misalignments. My assumption is that for a 6-AA peptide, it is very unlikely to find a human protein that matches at FEWER than 4 amino acids (the position of the AAs doesn't matter in my research). I would highly appreciate it if you could direct me on how to run this script on Windows. This is actually part of a HUGE academic research project.
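To give a sense of scale, the enumeration step described above (every 6-residue peptide over the 20 standard amino acids) can be sketched in a few lines of Python. This is only an illustrative sketch, not something posted in the thread: the names `AMINO_ACIDS`, `iter_peptides`, and `write_fasta` are made up for the example, and using the peptide itself as the FASTA header is an assumption. There are 20^6 = 64,000,000 such peptides, so the generator yields them lazily instead of holding them all in memory.

```python
import io
from itertools import islice, product

# One-letter codes of the 20 standard amino acids (assumed alphabet for the example).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def iter_peptides(length=6):
    """Lazily yield every peptide of the given length over AMINO_ACIDS."""
    for combo in product(AMINO_ACIDS, repeat=length):
        yield "".join(combo)


def write_fasta(peptides, handle):
    """Write peptides as FASTA records, using the peptide itself as the header."""
    for pep in peptides:
        handle.write(f">{pep}\n{pep}\n")


# Total number of 6-mers: 20 choices at each of 6 positions.
print(f"total 6-mers: {len(AMINO_ACIDS) ** 6:,}")

# Preview only the first three records instead of writing all 64 million.
preview = io.StringIO()
write_fasta(islice(iter_peptides(6), 3), preview)
print(preview.getvalue(), end="")
```

Written out in full with this header scheme, the FASTA file would be on the order of 1 GB, so in practice it would probably have to be split into chunks before any search.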
What do you need help with: writing the scripts, or writing them for Windows? We could point you to resources for the latter, but the former is something you'll need to read the manuals for and try to write on your own; we do not provide ready-to-use code.
Whatever help you can provide is appreciated. I do need to learn how to write the script, and I don't have that much time, but I'll try. Yes, I use Windows 11 64-bit and need to know how to download the NCBI nr database, how to navigate on the command line to the right directory for blastp, and then hopefully there's a help command to get me where I need to go (i.e., segmenting peptides and running queries).
Since you are referring to using Windows 11, just a word of caution: the pre-made NCBI nr database indexes are hundreds of GB (over 600). You will not be able to search against this on a desktop.
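For the "running queries" part of the question, a local search would go through the `blastp` program from NCBI BLAST+. The sketch below only assembles such a command line; it does not run anything. Several things here are assumptions, not something stated in the thread: that a much smaller human-only protein database has been built locally (the name `human_proteins` and the file names are hypothetical), and that the `blastp-short` task, which BLAST+ provides for short queries, is the right choice for 6-residue peptides. The E-value of 20000 is the one requested in the original post.

```python
import shlex


def blastp_short_command(query_fasta, db_name, out_tsv, evalue=20000):
    """Build (but do not run) a blastp command line for short peptide queries."""
    return [
        "blastp",
        "-task", "blastp-short",   # BLAST+ task tuned for short queries
        "-query", query_fasta,     # FASTA file with the peptides
        "-db", db_name,            # hypothetical local human-only database
        "-evalue", str(evalue),    # permissive threshold from the question
        "-outfmt", "6",            # tabular output, one hit per line
        "-out", out_tsv,
    ]


# Hypothetical file names for one chunk of the peptide set.
cmd = blastp_short_command("peptides_chunk_0001.fasta", "human_proteins", "chunk_0001.tsv")
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` once BLAST+ is installed and on the PATH; `blastp -help` lists the available options.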