Question

How to interpret blastn local output

4.6 years ago

chantalrenauminguez • 0

Hi! I'm new to bioinformatics. I just used blastn to compare two large FASTA files (each more than 3.5 Mbp). I want to see what the differences are and where they are, because both come from sequencing the same sample with different approaches.

So I ran this blastn command:

blastn -query assembly.fasta -subject renamed.fasta -outfmt 6 -out results.txt

It worked and produced this output (I only show a few rows, but there are more than 500):

contig_1    ctg.s1.F.arrow  99.999  3389370 6   17  1   3389361 3389351 1   0.0 6.259e+06

contig_1    ctg.s1.F.arrow  100.000 599564  0   1   3389362 3988924 3988920 3389357 0.0 1.107e+06

contig_1    ctg.s1.F.arrow  99.912  5672    2   3   992424  998093  3383686 3389356 0.0 10443

contig_1    ctg.s1.F.arrow  99.506  5669    8   10  1   5666    2391274 2396925 0.0 10296

contig_1    ctg.s1.F.arrow  99.904  5182    5   0   1783880 1789061 452377  447196  0.0 9542

contig_1    ctg.s1.F.arrow  99.904  5184    1   3   2936985 2942166 1605481 1600300 0.0 9542

contig_1    ctg.s1.F.arrow  100.000 3343    0   0   1150782 1154124 1531471 1534813 0.0 6174

contig_1    ctg.s1.F.arrow  100.000 3343    0   0   1854548 1857890 2235238 2238580 0.0 6174

contig_1    ctg.s1.F.arrow  100.000 2674    0   0   2933585 2936258 87318   89991   0.0 4939

If I am not mistaken, the columns mean:

 1.  qseqid    query (e.g., unknown gene) sequence id
 2.  sseqid    subject (e.g., reference genome) sequence id
 3.  pident    percentage of identical matches
 4.  length    alignment length (sequence overlap)
 5.  mismatch  number of mismatches
 6.  gapopen   number of gap openings
 7.  qstart    start of alignment in query
 8.  qend      end of alignment in query
 9.  sstart    start of alignment in subject
 10. send      end of alignment in subject
 11. evalue    expect value
 12. bitscore  bit score
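For reference, the column layout above can be read programmatically. This is only a minimal sketch, assuming the file is tab-separated with the default `-outfmt 6` column order; the helper name `parse_hits` and the added `strand` field are made up for illustration and are not part of BLAST. The sample rows are copied from the output above. A row where sstart is larger than send is an alignment to the reverse strand of the subject.

```python
import csv
import io

# Default column order of `-outfmt 6`, as listed above.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_FIELDS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}
FLOAT_FIELDS = {"pident", "evalue", "bitscore"}


def parse_hits(handle):
    """Yield one dict per alignment row of tab-separated outfmt 6 text.

    `parse_hits` is a hypothetical helper, not a BLAST or library function.
    """
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip empty lines
        hit = dict(zip(COLUMNS, fields))
        for name in INT_FIELDS:
            hit[name] = int(hit[name])
        for name in FLOAT_FIELDS:
            hit[name] = float(hit[name])
        # sstart > send: the query aligned to the subject's reverse strand.
        hit["strand"] = "minus" if hit["sstart"] > hit["send"] else "plus"
        yield hit


# Two rows copied from the output above, with tabs as separators.
SAMPLE = (
    "contig_1\tctg.s1.F.arrow\t99.999\t3389370\t6\t17\t1\t3389361\t3389351\t1\t0.0\t6.259e+06\n"
    "contig_1\tctg.s1.F.arrow\t99.506\t5669\t8\t10\t1\t5666\t2391274\t2396925\t0.0\t10296\n"
)

hits = list(parse_hits(io.StringIO(SAMPLE)))
for hit in hits:
    # Query interval, strand, and the per-alignment difference counts.
    print(hit["qstart"], hit["qend"], hit["strand"], hit["mismatch"], hit["gapopen"])
```

To read the real file instead of the sample, the same function could be given `open("results.txt")` in place of the `io.StringIO` object.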

But how do I interpret this? Where are the differences? Why are so many rows needed? Is BLAST comparing an ever closer region each time to pin down where the differences are?

Thanks a lot for any help with this matter :)

next-gen sequence sequencing • 1.2k views

ADD COMMENT • link updated 4.6 years ago by lieven.sterck 15k • written 4.6 years ago by chantalrenauminguez • 0

score 0 · Answer 1 · 2021-01-12

4.6 years ago

lieven.sterck 15k

If you are new to all this, it will pay off to read up on this specific topic a bit.

Here (https://www.ncbi.nlm.nih.gov/books/NBK1734/ ) is a good start.

If you then have further or more detailed questions, we'll be happy to help you out.
