Find breakpoints using long reads
2.1 years ago

Hello everyone! I want to determine the precise positions of breakpoints in sp1 (the assembled species). I have long nanopore reads (.fastq) from sp2 (an unassembled species). sp1 and sp2 are closely related. I know the approximate coordinates of the breakpoints (coord2 - coord1 ≈ 1 Mb, coord4 - coord3 ≈ 1 Mb); see the image.

I tried the following strategy: I cut out the left and right regions as separate .fasta files and aligned the long nanopore reads to each of them separately. I expected only a few long reads to appear in both alignments, and I assumed those reads would contain the breakpoints. However, the two alignments turned out to share about 40k reads.
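To make the intersection step concrete, here is a minimal Python sketch of it. It assumes the alignments are available in PAF format (as written by minimap2, for example; the aligner is not stated above, so this is an assumption) and adds one possible refinement: counting a read only if it has a reasonably confident and long alignment. The function names and the thresholds `min_mapq` and `min_aln_len` are hypothetical and would need tuning; whether such filtering explains the 40k shared reads is a guess, not something verified here.

```python
def confident_reads(paf_lines, min_mapq=20, min_aln_len=1000):
    """Return the names of reads with at least one confident alignment.

    paf_lines: iterable of PAF text lines (for example an open file).
    min_mapq, min_aln_len: illustrative thresholds, not recommended values.
    """
    names = set()
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        aln_len = int(fields[10])  # PAF column 11: alignment block length
        mapq = int(fields[11])     # PAF column 12: mapping quality
        if mapq >= min_mapq and aln_len >= min_aln_len:
            names.add(fields[0])   # PAF column 1: read name
    return names


def shared_reads(left_paf, right_paf, **thresholds):
    """Reads that align confidently to both the left and the right region."""
    return (confident_reads(left_paf, **thresholds)
            & confident_reads(right_paf, **thresholds))
```

With two open PAF files this would be called as `shared_reads(open("left.paf"), open("right.paf"))`, where the file names are placeholders.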

Maybe someone has a better idea (or a better tool), or could improve mine. Thanks in advance!

[image]

alignment nanopore samtools • 679 views
2.1 years ago
shelkmike ★ 1.4k

If I understand correctly, what you want to do is called "structural variant calling". You can align the reads of sp2 to the whole genome of sp1 and then run Sniffles (https://github.com/fritzsedlazeck/Sniffles) on the alignment. Sniffles will give you a list of structural differences between the two genomes.

An even simpler strategy is to align the reads and then visualise the alignment in a program like Tablet (https://ics.hutton.ac.uk/tablet/). You should be able to see the breakpoints by eye.
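As a scriptable complement to the two suggestions above, here is a rough Python sketch of the split-read idea: a read that crosses a breakpoint tends to align as two or more segments that are adjacent in the read but far apart (or on different strands or contigs) in the reference. This is only an illustration, not how Sniffles works internally. It assumes a whole-genome alignment in PAF format (for example from minimap2), and the function names and the thresholds `min_mapq` and `min_jump` are hypothetical.

```python
from collections import defaultdict


def parse_paf(paf_lines, min_mapq=20):
    """Group PAF alignment segments by read name, dropping low-MAPQ ones."""
    by_read = defaultdict(list)
    for line in paf_lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 12 or int(f[11]) < min_mapq:
            continue
        by_read[f[0]].append({
            "qstart": int(f[2]), "qend": int(f[3]), "strand": f[4],
            "tname": f[5], "tstart": int(f[7]), "tend": int(f[8]),
        })
    return by_read


def junctions(paf_lines, min_mapq=20, min_jump=10_000):
    """Return candidate breakpoints as (read, contig_a, pos_a, contig_b, pos_b).

    For each pair of segments that are consecutive along the read, pos_a is
    the reference coordinate where the first segment stops and pos_b the one
    where the next segment starts, taking the strand into account.
    """
    found = []
    for read, segs in parse_paf(paf_lines, min_mapq).items():
        segs.sort(key=lambda s: s["qstart"])  # order segments along the read
        for a, b in zip(segs, segs[1:]):
            # On the minus strand the end of the query maps to tstart.
            pos_a = a["tend"] if a["strand"] == "+" else a["tstart"]
            pos_b = b["tstart"] if b["strand"] == "+" else b["tend"]
            if (a["tname"] != b["tname"]
                    or a["strand"] != b["strand"]
                    or abs(pos_a - pos_b) >= min_jump):
                found.append((read, a["tname"], pos_a, b["tname"], pos_b))
    return found
```

Positions reported by many independent reads within a few bases of each other are the ones worth checking in the viewer; isolated junctions are more likely alignment artefacts or chimeric reads.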
