Hi Everyone,
As it turns out, I don't understand how alignment works. I am hoping someone can help me understand it better. I have a custom genome assembly that I aligned to hg38. Zooming into a particular region, I see 3 alignments:
I used BLAST to search my genome assembly FASTA for one particular sequence from this region, "TCAATTTTGAAATCTGCAAAGTATTCATTA", where the coverage is 3. Since there are 3 alignments, I expected 3 matches, but the search returned only 2.
I am confused.
Can anyone please help me understand why it only returns 2 matches instead of 3?
Thank you in advance for your help!
BLAST result:
Database: scaffolds_FINAL.fasta
99 sequences; 2,916,671,763 total letters
Query= > test2
Length=30
                                                      Score     E
Sequences producing significant alignments:          (Bits)  Value
scaffold_8                                            56.5    2e-07
>scaffold_8
Length=141881116
Score = 56.5 bits (30), Expect = 2e-07
Identities = 30/30 (100%), Gaps = 0/30 (0%)
Strand=Plus/Plus
Query  1        TCAATTTTGAAATCTGCAAAGTATTCATTA  30
                ||||||||||||||||||||||||||||||
Sbjct  7937684  TCAATTTTGAAATCTGCAAAGTATTCATTA  7937713
Score = 56.5 bits (30), Expect = 2e-07
Identities = 30/30 (100%), Gaps = 0/30 (0%)
Strand=Plus/Plus
Query  1        TCAATTTTGAAATCTGCAAAGTATTCATTA  30
                ||||||||||||||||||||||||||||||
Sbjct  8007698  TCAATTTTGAAATCTGCAAAGTATTCATTA  8007727
Lambda      K        H
   1.33    0.621     1.12
Gapped
Lambda      K        H
   1.28    0.460    0.850
Effective search space used: 17500016322
Database: scaffolds_FINAL.fasta
Posted date: Nov 29, 2022 1:21 PM
Number of letters in database: 2,916,671,763
Number of sequences in database: 99
Matrix: blastn matrix 1 -2
Gap Penalties: Existence: 0, Extension: 2.5
First few lines of the 1st alignment:
Read name = scaffold_8
Read length = 19,508bp
Flags = 2048
----------------------
Mapping = Supplementary @ MAPQ 60
Reference span = chr11:128,797,610-128,817,807 (+) = 20,198bp
Cigar = 7922472H5518M1D5597M1D4095M693D979M5I3314M133939136H
Clipping = Left 7,922,472 hard; Right 133,939,136 hard
----------------------
SupplementaryAlignments
chr11:28,991,342-34,686,692 (-) = 5,695,350bp @MAPQ 60 NM16125
chr11:122,017,398-126,863,398 (-) = 4,846,000bp @MAPQ 60 NM23032
2nd alignment:
Read name = scaffold_8
Read length = 70,350bp
Flags = 2048
----------------------
Mapping = Supplementary @ MAPQ 60
Reference span = chr11:128,797,772-128,868,092 (+) = 70,321bp
Cigar = 7991953H30938M1I266M1I222M1D6885M10I13988M1I6162M16I7841M1I4018M133818813H
Clipping = Left 7,991,953 hard; Right 133,818,813 hard
----------------------
SupplementaryAlignments
chr11:28,991,342-34,686,692 (-) = 5,695,350bp @MAPQ 60 NM16125
chr11:122,017,398-126,863,398 (-) = 4,846,000bp @MAPQ 60 NM23032
3rd alignment:
Read name = scaffold_8
Read length = 53,770bp
Flags = 2048
----------------------
Mapping = Supplementary @ MAPQ 60
Reference span = chr11:128,813,516-128,867,266 (+) = 53,751bp
Cigar = 7937683H978M5I3600M3D1673M4D6130M11I2806M1I488M1D1197M1I5688M11I2540M3I9687M1D13790M1I1973M6D484M1I2702M133889663H
Clipping = Left 7,937,683 hard; Right 133,889,663 hard
----------------------
SupplementaryAlignments
chr11:28,991,342-34,686,692 (-) = 5,695,350bp @MAPQ 60 NM16125
chr11:122,017,398-126,863,398 (-) = 4,846,000bp @MAPQ 60 NM23032
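
One way to cross-check the numbers above is to walk each CIGAR string and ask where the two BLAST hit positions on scaffold_8 land on chr11. The following is only a minimal sketch, not a definitive analysis: the function name `query_to_ref` and the dictionary names are made up for illustration, the reference start positions, CIGAR strings, and hit coordinates are copied from the output above, and it assumes forward-strand (+) alignments, so that the left hard clip counts bases from the start of the scaffold.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def query_to_ref(cigar, ref_start, query_pos):
    """Map a 1-based position on the original (unclipped) query to a 1-based
    reference position. Returns None if the base is clipped, inserted, or
    outside this alignment. Assumes a forward-strand (+) alignment."""
    q = 0          # query bases consumed so far, clipped bases included
    r = ref_start  # reference position of the next aligned base
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":      # consumes both query and reference
            if q < query_pos <= q + n:
                return r + (query_pos - q - 1)
            q += n
            r += n
        elif op in "ISH":    # consumes query only
            if q < query_pos <= q + n:
                return None
            q += n
        elif op in "DN":     # consumes reference only
            r += n
    return None

# Reference start and CIGAR of each alignment, copied from the post.
ALIGNMENTS = {
    "1st": (128797610,
            "7922472H5518M1D5597M1D4095M693D979M5I3314M133939136H"),
    "2nd": (128797772,
            "7991953H30938M1I266M1I222M1D6885M10I13988M1I6162M16I7841M1I"
            "4018M133818813H"),
    "3rd": (128813516,
            "7937683H978M5I3600M3D1673M4D6130M11I2806M1I488M1D1197M1I5688M"
            "11I2540M3I9687M1D13790M1I1973M6D484M1I2702M133889663H"),
}

# Start coordinates of the two BLAST hits on scaffold_8 (Sbjct column).
BLAST_HITS = [7937684, 8007698]

# For every alignment, where does each BLAST hit land on chr11 (if at all)?
placements = {
    name: {hit: query_to_ref(cigar, start, hit) for hit in BLAST_HITS}
    for name, (start, cigar) in ALIGNMENTS.items()
}

for name, by_hit in placements.items():
    for hit, ref_pos in by_hit.items():
        print(name, "alignment, scaffold_8:%d ->" % hit, ref_pos)
```

With the values posted above, scaffold position 7,937,684 falls inside both the 1st and the 3rd alignment, and 8,007,698 falls inside the 2nd; all three project to chr11:128,813,516. If that reading is right, the 1st and 3rd alignments overlap on the scaffold and cover the same copy of the sequence, which would be consistent with 3 alignment records at this locus but only 2 BLAST matches in the assembly.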