If I type this in my shell:
blat /myseq/hg19.fasta nonref.fa -out=blast8 blat.blast8
I get valid output. However, if I try to automate this in Python 3.5:
proc=Popen(["blat",human_ref,"nonref.fa","-out=blast8","blat.blast8"],stdout=PIPE,universal_newlines=True)
It will run and load the entire genome but it won't read the query file. This is the output I captured:
>>> Loaded 3101804739 letters in 84 sequences >>> Searched 0 bases in 0 sequences
I have tried shell=True and also passing the command as a single string, but neither works.
Update:
For some reason, removing the stdout=PIPE and universal_newlines arguments makes it work. While looking through old Stack Overflow posts on this issue, I noticed that someone had used subprocess.call, which is a wrapper around Popen. I tried call and it worked, even printing stdout automatically without my having to capture it. I did the same thing with Popen and it worked as well. It seems the culprit was the stdout=PIPE argument. I don't really understand why, but I'm just glad it works at this point.
Final Update
Thanks for the help, John. Seems like I was an idiot and didn't close the file handle before telling blat to read the file. My final testing of the code shows that you can have stdout=PIPE and it will work as well. Thanks again for trying!
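A minimal sketch of the fix described in the final update, assuming the query file is written by the same script just before blat is launched. The helper names write_fasta and run_and_capture are hypothetical and not part of the original code; the point is only that the with block flushes and closes the file before the child process starts reading it.

```python
import subprocess
from subprocess import PIPE


def write_fasta(path, records):
    """Write (name, sequence) pairs to path in FASTA format."""
    with open(path, "w") as handle:
        for name, seq in records:
            handle.write(">{}\n{}\n".format(name, seq))
    # Leaving the with block flushes and closes the handle, so another
    # process opening the file now sees the complete contents.


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return its stdout as a str."""
    proc = subprocess.Popen(cmd, stdout=PIPE, universal_newlines=True)
    out, _ = proc.communicate()
    return out


# Intended use with the command from the question (requires blat on PATH):
#   write_fasta("nonref.fa", records)
#   out = run_and_capture(["blat", human_ref, "nonref.fa",
#                          "-out=blast8", "blat.blast8"])
```

If the file were still open when Popen is called, a small write could still be sitting in Python's buffer, and the child process would probably see an empty or partial file, which would match the "Searched 0 bases in 0 sequences" symptom.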
Just a total guess, since I've never run blat from Python, but I suspect it detects that it is not running in a terminal and expects nonref.fa on stdin.
First I would try with shell=True and a static string, so just: proc = subprocess.Popen('blat /myseq/hg19.fasta nonref.fa -out=blast8 blat.blast8', shell=True) and nothing else. It will print to stdout/stderr, which is fine for testing.
Then try it again with stdout=subprocess.PIPE. If that changes things, the issue is fixable, but I won't type it out unless that really is the reason.
With the static string, the command runs and I see the correct output for the given input. When I add stdout=PIPE with universal_newlines=True, it gives me the same 0 bases in 0 sequences.
Sorry to be a pain, but can you remove universal_newlines=True so we can tell whether it's just stdout=PIPE? I think that's the culprit, but I don't want to make a mistake.
I had to add that so I could capture the output as a string: even when I added b'' it still gave me bytes-to-str conversion errors. I was also trying to follow the recommendation to avoid shell=True. Let me try your suggestions and get back to you. Thank you!
Edit: I did not realize universal_newlines was for the input! I thought it was for receiving the output as a string.
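For reference, a small sketch of the two ways to get captured output as a str. In the subprocess module, universal_newlines=True opens stdin, stdout and stderr in text mode, so it applies to output as well as input. The CHILD command here is a hypothetical stand-in for blat so the example runs anywhere.

```python
import subprocess
import sys
from subprocess import PIPE

# Hypothetical stand-in for the real command: a child that prints one line.
CHILD = [sys.executable, "-c", "print('ok')"]


def capture_bytes(cmd):
    """Capture stdout as bytes and decode it explicitly."""
    out, _ = subprocess.Popen(cmd, stdout=PIPE).communicate()
    return out.decode()


def capture_text(cmd):
    """Let subprocess decode stdout to str via universal_newlines=True."""
    out, _ = subprocess.Popen(cmd, stdout=PIPE,
                              universal_newlines=True).communicate()
    return out
```

Either approach avoids the bytes-to-str errors mentioned above; decoding explicitly just makes the conversion visible in the code.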