I have ~ 400 text files of BLASTp results.
P_1L_GRL111.out (contents)
NP_387917.1 ADZ06570.1 44.29 289 153 4 2 289 1 282 7e-77 236
I have a file that reads in files ALL my P_1 files, sorts the e values, and then calculates a threshold. I would like to convert this into a function. I have made some attempts to define a function that can read in all my files, but as of now, the files only reads ALL P_1 files, or ALL P_2 files. I also feel like the function is not sorting the values correctly.
def threshold_proteinsVStarget(protein_Files):
# List of Lactobacillus databases created from BLAST commands
Lactobacillus_DB = [ 'L_GRL1112', 'L_214','L_CTV-05','L_JV-V01','L_ST1','L_MV-1A', 'L_202-4','L_224-1','L_JV-V03', 'L_MV-22','L_DSM_13335', 'LactinV_03V1b', 'SPIN_1401G', 'UPII_143_D', 'L_1153','L_269-3', 'L_JV-V16','L_49540']
list_e_values = []
for prot in protein_Files:
for db in Lactobacillus_DB:
line = open (prot + db + '.out').readline()
print line
line2 = line.strip('\n')
fields = line2.split('\t')
e_val = float(fields[10])
list_e_values.append(e_val)
return list_e_values
file1 = ['P_3']
print threshold_proteinsVStarget(file1)
This is very useful.... I was trying to work with the glob module, but I initially thought it would be best to combine ALL my files into one text file, then sort, then return the e-value...but that was not working properly.
Using python to merge your files is unnecessary. Just cat all your files into one, then run your code on it, then.
cat *.out > allblast.out
Then you can use my above code, without the for loop with glob, and just open and read the file, append each evalue to a list, and sort it.
Thanks st.ph.n! My last question, if I write another function to check e value, how can I append the first two columns with the e value.
My last step:
if eval is less than threshold: # append fields [0], [1],[10] print NP_387917.1 ADZ06570.1 7e-77
See my updated answer. Replace X with the value you want. This will check the evalue as it loops through each line, instead of appending to list. If you need a list, use a tuple or dictionary:
The updated answer seemed to do the trick! How could I transform the updated code into a function?
if all you're going to do is return a list from the function, you don't need to use a function. Just append everything to a list as int he comment. If you do need function, it should be trivial for you to do so. If this answer solves your problem, please accept the answer.