Hi all,
Sorry for the long question, but I am facing this issue because of my hardware limitations (I am using a 32-bit Windows 7 machine with 4 GB of RAM).
I have an arbitrary number of arbitrarily named .fa files in a folder named 'seq', each containing a single FASTA protein sequence, for example:
NP_4500.1.fa
NP_4568.1.fa
NP_45981.3.fa
XM_we679.fa
36498746.fa
In another folder named 'db', I have a database split into 200 segments (because of my computational limitations), arranged as:
hg.part-001.db
hg.part-002.db
hg.part-003.db
..
..
hg.part-200.db
Now I want to run usearch for each sequence against every database segment and generate one result file per segment. For one .fa file (NP_4500.1.fa) that would be:
usearch -ublast ./seq/NP_4500.1.fa -db ./db/hg.part-001.db -evalue 1e-10 -accel 0.5 -blast6out NP_4500.1_part-001.out
usearch -ublast ./seq/NP_4500.1.fa -db ./db/hg.part-002.db -evalue 1e-10 -accel 0.5 -blast6out NP_4500.1_part-002.out
usearch -ublast ./seq/NP_4500.1.fa -db ./db/hg.part-003.db -evalue 1e-10 -accel 0.5 -blast6out NP_4500.1_part-003.out
...
...
usearch -ublast ./seq/NP_4500.1.fa -db ./db/hg.part-200.db -evalue 1e-10 -accel 0.5 -blast6out NP_4500.1_part-200.out
After that, I want to merge the results into a single file, as:
join NP_4500.1_part-001.out NP_4500.1_part-002.out .. NP_4500.1_part-200.out > NP_4500.1.out
And similarly for the next sequence:
NP_4568.1.fa
...
Now, I can run a cmd script over the FASTA files, as:
for %%F in ("*.fa") do usearch -ublast ./seq/%%F .......
But my question is: how can I combine this command with a loop over the database segments, and merge the .out files into the result for a single sequence before proceeding to the next one?
I can use a cmd, Perl or Python script. Thanks for your consideration.
Apart from the original problem, which should be solvable with a batch script, I would consider simplifying your life. You can spare yourself a lot of hassle by upgrading to a better computer. A few aspects make your setup much more difficult than it has to be:
Thanks for the reply. I'll upgrade my machine soon, but for now I need to split the db, because -makedb in 32-bit usearch can't handle my UniRef database (20 GB). And I am avoiding NCBI BLAST simply because it is too slow for my requirements (compared with ublast).
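Since Python was mentioned as an option, here is a minimal sketch of the nested loop described in the question: for each .fa file, search every database segment, then merge the per-segment tables before moving on to the next sequence. It assumes usearch is on the PATH, that the segments match hg.part-*.db, and that merging means plain concatenation (the -blast6out tables are tab-separated with no header line). The function names build_command, merge_parts and run_all and the 'out' folder are made up for illustration; the usearch options are copied unchanged from the question.

```python
import subprocess
from pathlib import Path


def build_command(query, db_part, out_file):
    """Return the usearch call from the question as an argument list."""
    return [
        "usearch", "-ublast", str(query),
        "-db", str(db_part),
        "-evalue", "1e-10",
        "-accel", "0.5",
        "-blast6out", str(out_file),
    ]


def merge_parts(part_files, merged_file):
    """Concatenate the per-segment hit tables into one file, in the given order."""
    with open(merged_file, "wb") as merged:
        for part in part_files:
            merged.write(Path(part).read_bytes())


def run_all(seq_dir, db_dir, out_dir, runner=subprocess.run):
    """Search each .fa file against every segment, merging before the next query."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Zero-padded names (001 .. 200) sort correctly as plain strings.
    db_parts = sorted(Path(db_dir).glob("hg.part-*.db"))
    for query in sorted(Path(seq_dir).glob("*.fa")):
        stem = query.stem  # "NP_4500.1.fa" -> "NP_4500.1"
        part_files = []
        for db_part in db_parts:
            tag = db_part.stem.split(".", 1)[1]  # "hg.part-001.db" -> "part-001"
            out_file = out_dir / f"{stem}_{tag}.out"
            # check=True stops the run if usearch exits with an error.
            runner(build_command(query, db_part, out_file), check=True)
            part_files.append(out_file)
        # All segments are done for this query: write e.g. NP_4500.1.out.
        merge_parts(part_files, out_dir / f"{stem}.out")
```

Calling run_all("seq", "db", "out") from the folder that contains 'seq' and 'db' should leave both the per-segment files and one merged .out file per sequence in 'out'. The runner parameter exists only so the loop can be tried with a stand-in function before launching 200 real searches per sequence.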