Coverage drop at assembly ends
5 days ago
alenew.am ▴ 10

Hello everyone. To test some alignment tools, I loaded a complete phage genome (a single contig) into Galaxy and used the ART ILLUMINA tool to simulate paired-end reads from it (20x coverage, 150 bp paired-end reads, DNA fragment size 600). I then aligned the two resulting files (forward and reverse) back to the genome they were generated from, using several tools (bwa-mem, minimap2, bowtie2). The results are practically identical (BAM files of the alignments visualized with Qualimap); I attach two of them. What puzzles me is that all three alignments show a strong drop in coverage at the assembly ends, even though the reads were generated directly from the assembly. In theory they should therefore be free of the bias seen with real data, where a transposon-based library of a linear phage genome produces a coverage drop at the ends. What is the reason for this phenomenon, and how can I deal with it? Thanks for the replies!

[Two attached images; not preserved]

bowtie2 bw-mem alignment minimap2 • 361 views
5 days ago
Buffo ★ 2.4k

I'd be surprised if the coverage were uniform at both ends with paired-end reads. Why would you expect uniform coverage at the ends if the reads are created at random? What would you expect to happen with the mapping at the ends? I'd say an increase in non-concordant paired-end mappings and soft-clipped alignments at the ends is perfectly normal, and there is nothing special or unexpected here.
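
To put rough numbers on this, here is a minimal Python sketch of the expected coverage profile near the ends of a linear reference. It is not ART's actual sampling model. It assumes a fixed fragment length (ART draws it from a distribution), fragment starts chosen uniformly, every fragment lying entirely inside the reference, and non-overlapping mates (fragment length at least twice the read length). The function names and the 40,000 bp genome length are made up for illustration.

```python
def covering_starts(pos, genome_len, frag_len, read_len):
    """Count fragment start positions whose forward or reverse read covers pos.

    Assumption: a fragment starting at s spans [s, s + frag_len - 1] and must
    fit entirely inside a linear genome, so s ranges over 0..genome_len - frag_len.
    """
    last_start = genome_len - frag_len

    def clipped(lo, hi):
        # Number of valid start positions inside the window [lo, hi]
        lo, hi = max(lo, 0), min(hi, last_start)
        return max(0, hi - lo + 1)

    # Read 1 covers [s, s + read_len - 1]
    fwd = clipped(pos - read_len + 1, pos)
    # Read 2 covers [s + frag_len - read_len, s + frag_len - 1]
    rev = clipped(pos - frag_len + 1, pos - frag_len + read_len)
    return fwd + rev


def relative_coverage(pos, genome_len, frag_len, read_len):
    """Expected coverage at pos as a fraction of the mid-genome plateau."""
    return covering_starts(pos, genome_len, frag_len, read_len) / (2 * read_len)


if __name__ == "__main__":
    # Hypothetical 40 kb linear genome, 600 bp fragments, 150 bp reads
    for pos in (0, 149, 300, 599, 20000):
        print(pos, round(relative_coverage(pos, 40000, 600, 150), 3))
```

Under these assumptions the very first and last bases can only be covered by a single fragment placement, and the first 150 bp are only ever reached by one of the two mates, so the expected coverage ramps up over roughly one fragment length at each end instead of starting at the plateau.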

Thank you for the explanation!

5 days ago
GenoMax 150k

Creating an artificial "chromosome" that contains the sequence from the "ends" of the genome (plus a couple of kb of sequence on each side) would allow the read simulator to create reads that span the ends when you initially generate the reads. This only makes sense if your genome is circular (e.g. a bacterial genome). If it is not, this is what you will get.
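
As a rough illustration of that idea, a junction contig can be built in plain Python by joining the tail of the sequence to its head. This is only a sketch: `junction_contig` and the `flank` parameter are hypothetical names, the sequence is assumed to be already loaded as a single string, and it only applies when the genome really is circular.

```python
def junction_contig(seq, flank):
    """Return the last `flank` bases joined to the first `flank` bases.

    Assumption: `seq` is a circular genome written as a linear string, so its
    last base is physically adjacent to its first base.
    """
    if flank <= 0 or flank > len(seq):
        raise ValueError("flank must be between 1 and the sequence length")
    return seq[-flank:] + seq[:flank]


if __name__ == "__main__":
    # Toy 8 bp "genome" with a 3 bp flank on each side of the junction
    print(junction_contig("ACGTTGCA", 3))
```

With a real genome the flank would be a couple of kb, as suggested above, so that fragments of the chosen size can span the junction.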

Thank you! In the case of a circular genome I will do as you said!
