Question

Low coverage for bases near the end of target reference sequence

6.3 years ago

kspata ▴ 90

Hi All,

I have viral samples sequenced on a NextSeq (PE 150) with over 90 million reads. After trimming adapters and low-quality bases, I aligned the reads to a reference sequence of about 2750 bp using bowtie2 local alignment with default parameters. I noticed that the bases at the end of the target sequence (2735-2750 bp) have very low coverage (0X, or less than 10X), while the average per-base coverage is 16700X.

Why am I getting low coverage near the end of the target sequence? How can I improve coverage for this region?

Thanks in advance!!!

alignment bowtie2 sequencing • 1.4k views

ADD COMMENT • link 6.3 years ago by kspata ▴ 90

Is this an RNA virus? Did you start sample prep from RNA?

ADD REPLY • link 6.3 years ago by h.mon 35k

I believe it is a DNA virus, as sample prep was done using a complete DNA purification kit.

ADD REPLY • link 6.3 years ago by kspata ▴ 90

~~Is the drop in coverage sudden or smooth?~~

Never mind that; if it is just the last 15 bases, it is a sudden drop. In addition to GenoMax's suggestions, it could also be an artifact. Are there similar viruses with sequenced genomes? Did you BLAST your whole virus sequence on NCBI?

ADD REPLY • link 6.3 years ago by h.mon 35k

Two things to consider:

This may be a problem with the aligner not being able to map reads (which may be much longer) to the last 15 bp of the reference.
You may have discarded reads covering the last 15 bases during your trimming/cleaning process.

ADD REPLY • link 6.3 years ago by GenoMax 147k
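The first point can be made concrete with a rough calculation. The sketch below assumes bowtie2's documented defaults for `--local` mode as I understand them (minimum score function `G,20,8`, i.e. 20 + 8 * ln(read length), and a match bonus of 2); please check these against the manual for your bowtie2 version. The function names are hypothetical, and the exact rounding bowtie2 applies internally is not modelled here.

```python
import math

def local_min_score(read_len, const=20.0, coeff=8.0):
    """Score threshold for a G,<const>,<coeff> function: const + coeff * ln(read_len).

    The defaults (20, 8) are an assumption based on bowtie2's documented
    --local settings, not something stated in this thread.
    """
    return const + coeff * math.log(read_len)

def min_matched_bases(read_len, match_bonus=2):
    """Fewest perfectly matching bases whose summed bonus reaches the threshold.

    match_bonus=2 is likewise assumed to be the --local default.
    """
    return math.ceil(local_min_score(read_len) / match_bonus)

# Print read length, approximate threshold, and matched bases needed.
for read_len in (50, 100, 150):
    print(read_len, round(local_min_score(read_len), 1), min_matched_bases(read_len))
```

Under these assumptions a 150 bp read needs roughly 30 matching bases to be reported at all. A read that hangs off the end of the reference with only about 15 bases on it (for example at the junction of a circular genome, or where the sample extends past the reference) can score at most about 30, well below the threshold of about 60, so it would not contribute coverage to the final positions.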

Thank you for the response. I did two-level trimming (trim_galore and sickle), which resulted in the loss of reads aligning at the end of the target sequence. I re-trimmed the raw reads with trim_galore only and got less than 10X coverage at the ends of the target sequences. The coverage analysis still shows 0X coverage at base positions 2749-2750.

How can I align partial reads only to the ends of the target reference sequence to get more than 0X coverage? Will this approach work, and how can I use bowtie2 to do this?

ADD REPLY • link 6.3 years ago by kspata ▴ 90

Is the genome of your virus circular or linear?

How are you looking at the coverage?

ADD REPLY • link 6.3 years ago by h.mon 35k
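Both questions affect how depth at the last few bases should be read. As a minimal sketch (not the method used in this thread), the code below computes per-base depth from a list of alignment intervals and optionally wraps intervals that run past the end of the reference back to position 1, as one might do for a circular genome aligned against a reference extended with a copy of its own start. The function names and the toy intervals are hypothetical; real intervals would come from your alignment file.

```python
from itertools import accumulate

def per_base_depth(intervals, ref_len, circular=False):
    """Return depths for positions 1..ref_len as a list (index 0 is position 1).

    intervals: 1-based inclusive (start, end) pairs with 1 <= start <= ref_len.
    An end beyond ref_len is clipped at ref_len when circular is False and
    wrapped around to position 1 when circular is True.
    """
    diff = [0] * (ref_len + 2)  # difference array; index 0 is unused

    def add(start, end):
        diff[start] += 1
        diff[end + 1] -= 1

    for start, end in intervals:
        if end <= ref_len:
            add(start, end)
        else:
            add(start, ref_len)          # part lying on the reference
            if circular:
                add(1, end - ref_len)    # overhang wrapped to the start

    # Prefix sums turn the difference array into per-position depth.
    return list(accumulate(diff))[1:ref_len + 1]

def low_positions(depth, min_depth):
    """1-based positions whose depth is below min_depth."""
    return [pos for pos, d in enumerate(depth, start=1) if d < min_depth]

# Toy data on a 20 bp reference; the last interval overhangs the end by 3 bp.
toy = [(1, 10), (5, 14), (16, 23)]
linear = per_base_depth(toy, 20)
wrapped = per_base_depth(toy, 20, circular=True)
print(low_positions(linear, 1))
print(linear[:4], wrapped[:4])
```

If the genome is circular, the wrapped version shows where junction-spanning reads would add depth. In this simplified model the wrap credits only the start of the reference, because the part of each read that lies on the reference is counted in both modes. With a real aligner, a read with too few bases on the reference may not be reported at all, so the final positions can lose depth as well.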

kspata: Have you considered the possibility that your viral strain has a small deletion at those two base pairs? If so, what you are seeing is real for your strain.

ADD REPLY • link 6.3 years ago by GenoMax 147k