4.8 years ago
elesbb91
When I use the tabular output format, the same hit is listed multiple times, each time with a different % identity and other attributes.
When I use the standard output, the results match what I see in the web-based version of the blastn tool.
How can I create a tabular form of results without getting all these duplicate hits?
Are you referring to different HSP being reported as separate entries?
You probably haven't used any thresholds/criteria, and that's why you are getting so many hits. You can add these options:
And
You can replace 3 with any number lower than 300?
see other options in
No, -num_alignments will still show partial alignments in tabular format.
Even if there's a global alignment for that query that has a better identity score?
Yes, that will show the full alignment in sections. For example, if I slightly modify a ferroxidase sequence:
(in case it is not obvious, I'm adding an "AC" repeat line)
then I run Blast:
Oh, okay! It had worked for me in the past. I used a dataset of genes and a comprehensive set of genes from bacteria, and using -num_alignments 1 and -evalue 1e-300 I only got one hit for each gene!
Take a look at my comment, I just don't understand how changing the output format will change the hits too
So, this is all great news and stuff. But why are those other hits not included when using standard output (not adding the output parameter)? If I use -outfmt 7 I get more hits. Like, a heckin load more. It seems odd because the ONLY thing that is changing is the outfmt. My only guess: if outfmt changes, does it cause other parameters' defaults to change?
Here is the output:
STANDARD:
TAB OUTPUT:
For example, XM_017006782.1 is listed twice in tab format but NOT in standard format.
It is listed once in the summary, but both HSPs (high-scoring segment pairs) are displayed when you scroll down to the pairwise alignments in HTML format. Since alignments are not shown in format 7, you get two entries, one for each of the two HSPs.
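To make the point above concrete, here is a minimal Python sketch that collapses tabular output to one row per query/subject pair by keeping the HSP with the highest bit score. It assumes the default 12-column layout of -outfmt 6/7; the function name `best_hsp_per_pair` and the example rows are made up for illustration and are not real BLAST results.

```python
# Default 12 columns of BLAST+ tabular output (-outfmt 6 / 7).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hsp_per_pair(tabular_text):
    """Keep only the highest-bitscore HSP for each (query, subject) pair."""
    best = {}
    for line in tabular_text.splitlines():
        # -outfmt 7 adds comment lines starting with '#'; skip those and blanks.
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(COLUMNS, line.split("\t")))
        key = (row["qseqid"], row["sseqid"])
        if key not in best or float(row["bitscore"]) > float(best[key]["bitscore"]):
            best[key] = row
    return list(best.values())


# Hypothetical rows: two HSPs against subjA, one HSP against subjB.
example = (
    "# made-up example rows, not real results\n"
    "query1\tsubjA\t99.50\t7010\t35\t0\t1\t7010\t1\t7010\t0.0\t12000\n"
    "query1\tsubjA\t98.00\t999\t20\t0\t7008\t8006\t7100\t8098\t0.0\t1700\n"
    "query1\tsubjB\t90.00\t500\t50\t0\t10\t509\t1\t500\t1e-150\t600\n"
)

for row in best_hsp_per_pair(example):
    print(row["qseqid"], row["sseqid"], row["qstart"], row["qend"], row["bitscore"])
```

Note that this throws away the shorter HSPs entirely, which may or may not be what you want: they are real alignments to other regions of the same subject, not errors.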
In standard output, I do not see both alignments in either the summary or the pairwise alignments. This goes for the HTML version too. Even with the link you provided, I am only seeing one hit for XM_017006782.1, in the alignments view as well. I seriously am so lost xD
NVM I see them. So they are just matches that were found in different areas of the query?
You either need to scroll down a ways or use the Next Match link that you see in the screenshots below. They are called high-scoring segment pairs (HSPs).
I think in standard format and in online BLAST, it outputs both alignments but only shows the max score and total score in the summary. For example, in online blastn, if you align XM_017006770.1 and XM_017006782.1 you'll get a max score and a total score, and two different alignments: one from 1 to 7010 and one from 7008 to 8006 (with the same coordinates and score as your table). I believe you should have two alignments in your standard format too.
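As a rough illustration of how the single summary line relates to the per-HSP rows, the sketch below groups HSPs by query/subject pair and reports the max score (best single HSP) and total score (sum over all HSPs for that subject). The IDs and scores are placeholders invented for the example, and `summarize_scores` is a hypothetical helper, not part of BLAST.

```python
from collections import defaultdict


def summarize_scores(hsps):
    """Group (query, subject, score) HSP records and summarize each pair."""
    grouped = defaultdict(list)
    for query, subject, score in hsps:
        grouped[(query, subject)].append(score)
    return {
        pair: {
            "max_score": max(scores),    # best single HSP
            "total_score": sum(scores),  # all HSPs for this subject added up
            "n_hsps": len(scores),
        }
        for pair, scores in grouped.items()
    }


# Placeholder data: two HSPs for the same query/subject pair (scores are made up).
hsps = [
    ("query1", "subjA", 12000.0),
    ("query1", "subjA", 1700.0),
]

summary = summarize_scores(hsps)
print(summary[("query1", "subjA")])
```

When the total score is larger than the max score, that is a hint that the subject has more than one HSP, which is exactly the case where the tabular format prints more than one row.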
I'm trying to say that the hits are not different; only the format is different.
Yes, I do not have two alignments in my standard format. I only see the two in the tabular format. This is the whole problem lol. I have no idea why. I also do not see both alignments in the online blast either, even if I download the results.
So I figured it out. Thank you guys. The second hits were partial alignments of the same strand as JC posted. So I understand now the differences. Thank you everyone!
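For anyone who wants only one row per subject in tabular output, BLAST+ also has a -max_hsps option that limits how many HSPs are kept per subject sequence for each query. The sketch below just assembles such a command in Python; the file and database names are placeholders, and `build_blastn_command` is a hypothetical helper. Running the command requires BLAST+ to be installed.

```python
import shlex


def build_blastn_command(query_fasta, database, out_path, max_hsps=1):
    """Assemble a blastn call that writes tabular output with limited HSPs."""
    return [
        "blastn",
        "-query", query_fasta,
        "-db", database,
        "-outfmt", "6",               # tabular, no comment lines
        "-max_hsps", str(max_hsps),   # keep at most this many HSPs per subject
        "-out", out_path,
    ]


# Placeholder file and database names.
cmd = build_blastn_command("query.fasta", "my_db", "hits.tsv")
print(shlex.join(cmd))

# To actually run it (needs BLAST+ on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

With -max_hsps 1 the partial alignments to other regions of the same subject are no longer reported, so only use it if you do not need them.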