How can I find the number of contigs that do not match any sequence in the nucleotide database in a BLAST search? Sorry for the incomplete question. The output file is tab-delimited, i.e. a .txt file. I could not find any answers by searching online.
This is another example of a question on Biostars that does not contain enough information to get an answer the first time around. At a minimum, you should say what format your blast output is in (since there are so many). Have you made any effort, e.g. a simple web search, to see if a solution is already available?
I am going to close this question until you add this information to your original post (use the edit option on original post). We will open the question back up once you do that.
That is simple. Depending on the format of your blast output, extract the IDs of the query sequences that have at least one hit against the blast nucleotide db. Then extract the headers from the contig fasta file and use an inverted grep (grep -v) to find the sequences that do not match anything.
If file1 is the file with the blast headers and file2 is the file with all contig headers then you can run
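A minimal sketch of such an inverted grep, assuming file1 and file2 each hold one bare ID per line. The contig IDs and the temporary directory are made up for illustration and are not from the original post.

```shell
# Toy inputs (hypothetical IDs), one ID per line.
tmpdir=$(mktemp -d)
printf 'contig1\ncontig3\n' > "$tmpdir/file1"                      # IDs with at least one blast hit
printf 'contig1\ncontig2\ncontig3\ncontig10\n' > "$tmpdir/file2"   # all contig IDs

# -v inverts the match, -x matches whole lines only, -F treats patterns as
# fixed strings, -f reads the patterns from file1.
# Result: the IDs in file2 that never appear in file1.
no_hits=$(grep -vxFf "$tmpdir/file1" "$tmpdir/file2")
echo "$no_hits"
```

Without -x, the pattern contig1 would also match contig10 and silently drop it from the result, so whole-line matching matters when IDs share a prefix.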
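Also assuming the OP is on Linux and is only interested in whether a sequence has a hit or not (the query ID is col. 1 of a '-outfmt 6'): pull the unique query IDs out with cut -f 1 blastoutput | sort -u > contighits.txt. Then pull out the contig headers, keeping only the ID before the first whitespace, since that is what blast reports as the query ID: grep '^>' contigs.fa | sed 's/^>//; s/[[:space:]].*//' > contigids.txt. Then you can use @Sej Modha's grep line to get the differences: grep -vxFf contighits.txt contigids.txt. Note the order: the hit list is the pattern file and the full contig list is the file being searched. The -x and -F options force exact whole-line matches, so contig1 does not also match contig10. Append | wc -l to get the number of contigs without a hit.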
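The whole procedure can be sketched end to end. This version swaps the final grep for comm, which compares two sorted lists, and it assumes tabular '-outfmt 6' output with the query ID in column 1. The file names, contig IDs and blast rows are toy data invented for the example.

```shell
tmpdir=$(mktemp -d)

# Toy fasta file: three contigs, two of them with a description after the ID.
printf '>contig1 len=100\nACGT\n>contig2 len=80\nGGCC\n>contig3\nTTAA\n' > "$tmpdir/contigs.fa"

# Toy outfmt 6 table (12 tab-separated columns): contig1 has two hits, contig3 one.
printf 'contig1\tsubjA\t99.0\t100\t1\t0\t1\t100\t5\t104\t1e-50\t180\n'  > "$tmpdir/blast.tsv"
printf 'contig1\tsubjB\t95.0\t100\t5\t0\t1\t100\t9\t108\t1e-40\t150\n' >> "$tmpdir/blast.tsv"
printf 'contig3\tsubjC\t90.0\t50\t5\t0\t1\t50\t1\t50\t1e-10\t60\n'     >> "$tmpdir/blast.tsv"

# Unique query IDs that produced at least one hit.
cut -f 1 "$tmpdir/blast.tsv" | sort -u > "$tmpdir/hits.txt"

# All contig IDs: header lines only, '>' removed, description stripped.
grep '^>' "$tmpdir/contigs.fa" | sed 's/^>//; s/[[:space:]].*//' | sort -u > "$tmpdir/all.txt"

# comm -23 keeps the lines that occur only in the first (sorted) file.
comm -23 "$tmpdir/all.txt" "$tmpdir/hits.txt" > "$tmpdir/nohits.txt"

# Number of contigs without any hit.
n_no_hits=$(wc -l < "$tmpdir/nohits.txt" | tr -d ' ')
echo "$n_no_hits"
```

Both lists are passed through the same sort, which is what comm needs to compare them reliably.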
Hello sukesh1411!
Please provide complete description/additional information.
For this reason we have closed your question.
If you disagree please tell us why in a reply below, we'll be happy to talk about it.
Cheers!
Many blast output formats are tab-delimited text. Can you post an example snippet (or tell us what -outfmt number you used)?