Blastx With Xml Format
12.5 years ago

Hi all, I'm currently using standalone BLAST, version ncbi-blast-2.2.25+. I tried to produce XML-format BLAST output for my downstream analysis, but ran into problems generating it.

The command I use is:

"blastx -query shorttest -db ~/blastdb/nr -evalue 1e-3 -max_target_seqs 20 -num_threads 24 -outfmt 5 > t.xml"

The command keeps running but nothing is written to t.xml. If I use another output format, such as the default or -outfmt 6, it prints the correct result.

Thanks very much!

EDIT:

Yes, it's transcriptome data. I am trying to annotate the whole transcriptome with Blast2GO, and the first step is to run blastx against the nr database.

I understand that blastx may buffer its output rather than stream it, but I can wait 24 hours and get nothing, whereas with another option such as "-outfmt 6" I get results within minutes, so buffering is probably not the issue.

I also tried a small query against the nr database, and the same query against a small local database; both give me results. So I'm not sure whether it is the combination of a large query, a large database, and the XML format that causes the problem. Thanks very much!

EDIT2:

Thanks! My suspicion about the XML format is that it waits until everything has finished before printing the results, since the document has opening and closing tags. My files are too big to finish in one day, which is the time quota for a single job on our server, and as a result nothing is printed.

Any suggestions will be appreciated.

EDIT3:

Thanks, everyone, for the suggestions. I finally decided on this strategy: first use "-outfmt 11" to generate a BLAST archive (ASN.1) file, then use blast_formatter, which is provided by NCBI, to convert the result into XML format. I tested this on a small sample and it works fine; I hope it works for my whole data set.
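For anyone who wants to script the two-step strategy from EDIT3, here is a minimal sketch in Python. It only builds and runs the two command lines; the search options are copied from the command in the question, while the file names t.asn and t.xml and the function names are placeholders of mine, not anything BLAST prescribes. As far as I know, blast_formatter also needs access to the database used in the original search, so treat this as a starting point rather than a tested pipeline.

```python
import subprocess


def archive_search_cmd(query, db, archive_out,
                       evalue="1e-3", max_target_seqs=20, num_threads=24):
    """Build the blastx command that writes a BLAST archive (-outfmt 11)."""
    return [
        "blastx",
        "-query", query,
        "-db", db,
        "-evalue", str(evalue),
        "-max_target_seqs", str(max_target_seqs),
        "-num_threads", str(num_threads),
        "-outfmt", "11",          # BLAST archive (ASN.1)
        "-out", archive_out,
    ]


def format_cmd(archive, out_path, outfmt="5"):
    """Build the blast_formatter command that converts an archive (5 = XML)."""
    return [
        "blast_formatter",
        "-archive", archive,
        "-outfmt", str(outfmt),
        "-out", out_path,
    ]


def run_two_step(query, db, archive_out, xml_out):
    """Run the search, then the conversion; raises if either step fails."""
    subprocess.run(archive_search_cmd(query, db, archive_out), check=True)
    subprocess.run(format_cmd(archive_out, xml_out), check=True)
```

A call would look like run_two_step("shorttest", os.path.expanduser("~/blastdb/nr"), "t.asn", "t.xml") after importing os. The "~" has to be expanded explicitly because no shell is involved. The same archive can be converted again with a different outfmt value without repeating the search.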


Have you tried running it with "-out t.xml" instead of "> t.xml"?


Philipp is right, you should use -out instead of redirection. For the detailed command options, run "blastx -help".


First things first: is your data made up of NGS reads? If so, see my previous comment.

If not, I don't think your BLAST search will finish in 24 hours, so independently of the output format, you should talk with your system administrator about this job time limit.

Finally, concerning XML and the other output formats: I don't know how XML output is written to disk, but I am 100% positive that the other outputs are kept in RAM and then written in batches. BLAST neither streams output constantly nor writes it all in one go. This means you should double-check that your query is complete in both XML and the other outputs, since having some data in the output file does not mean the query has completed.
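One way to do that double-check for an XML report is to try parsing it: a file that was cut off before the closing tags is not well-formed XML. The sketch below assumes the usual BLAST XML layout, with one <Iteration> element per query; count_iterations is a name I made up. It reads the file incrementally, so it should cope with large reports.

```python
import xml.etree.ElementTree as ET


def count_iterations(source):
    """Return (number of complete <Iteration> elements, whether the file parsed fully).

    `source` may be a file name or a binary file object.
    """
    count = 0
    try:
        for _event, elem in ET.iterparse(source, events=("end",)):
            if elem.tag == "Iteration":
                count += 1
                elem.clear()  # drop the parsed hits to keep memory use low
    except ET.ParseError:
        # Empty or truncated file: report what was readable so far.
        return count, False
    return count, True
```

If count_iterations("t.xml") returns a count equal to the number of query sequences and True, the report is at least structurally complete. A False means the file is empty or was truncated, for example by a job time limit.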

12.5 years ago

Thanks, I tried -out. Actually, I found that it works either way, with ">" or "-out". The problem is that it only works for small queries: I can generate XML output for 10 or 50 sequences, but when I try it on 100,000 sequences, nothing is printed to the file.


I am pretty sure BLAST doesn't constantly stream out the results. I've run BLAST on large files before and have noticed that it can take a few hours before I actually see anything in the output.


Are you sure you waited until BLAST finished with the whole query? I have already run very large queries without any problem, it just takes a lot of time... depending on the size of your query sequences, possibly days.

Are these 100,000 sequences reads from NGS sequencing? If so, there are much more appropriate tools to map these sequences such as bwa, bowtie, ... If you could tell us more on what you are trying to achieve, we might be able to help you find an appropriate tool.
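On the question of whether BLAST got through the whole query: with tabular output (-outfmt 6) you can get a rough idea by looking at the last query ID in the output and finding its position in the query FASTA file. This is only a sketch with made-up function names. It assumes the first column of the tabular output matches the first word of each FASTA header and that results are written in query order. Queries without hits never show up in tabular output, so the number is a lower bound, and a partially written last line can confuse it.

```python
def fasta_ids(fasta_lines):
    """Return query IDs in file order (first word after '>')."""
    ids = []
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.append(tokens[0])
    return ids


def last_reported_query(tabular_lines):
    """Return the query ID of the last non-empty, non-comment line, or None."""
    last = None
    for line in tabular_lines:
        if line.strip() and not line.startswith("#"):
            last = line.split("\t")[0].strip()
    return last


def progress(fasta_lines, tabular_lines):
    """Return (position of the last reported query, total number of queries)."""
    ids = fasta_ids(fasta_lines)
    last = last_reported_query(tabular_lines)
    if last is None or last not in ids:
        return 0, len(ids)
    return ids.index(last) + 1, len(ids)
```

Both arguments can be open file handles, e.g. progress(open("shorttest"), open("t.tsv")), where t.tsv is a placeholder name for the tabular output. A result far below the total after many hours suggests the job will not fit into a 24-hour limit whatever the output format.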
