6.3 years ago
kspata
Hi All,
I have viral samples sequenced on a NextSeq (PE 150) with over 90 million reads. After trimming adapters and low-quality bases, I aligned the reads to a reference sequence of about 2750 bp using bowtie2 local alignment with default parameters. The bases at the end of the target sequence (2735-2750 bp) have very low coverage (0X, or less than 10X), while the average per-base coverage is 16700X.
Why am I getting low coverage near the end of target sequences? How can I improve coverage for this region?
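One contributor to lower coverage at the ends can be illustrated with a simplified model. The sketch below assumes a linear reference and reads that must lie entirely within it; local alignment with soft-clipping relaxes that, so treat the numbers as a rough picture only. The function name and the uniform-placement assumption are illustrative and not taken from the thread.

```python
def placements_covering(pos, ref_len, read_len):
    """Count read start positions whose read covers `pos` (1-based),
    assuming every read must fit entirely inside a linear reference."""
    first_start = max(1, pos - read_len + 1)        # earliest start that still reaches pos
    last_start = min(pos, ref_len - read_len + 1)   # latest start that fits in the reference
    return max(0, last_start - first_start + 1)


ref_len, read_len = 2750, 150  # values from the question
for pos in (1375, 2735, 2749, 2750):
    print(pos, placements_covering(pos, ref_len, read_len))
```

Under this model a mid-genome base can be covered from 150 different start positions, while the last base can be covered from only one, so some taper toward the ends is expected. On its own this would not account for 0X at an average depth of 16700X, which is why the trimming, genome topology, and strain-variation questions below still matter.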
Thanks in advance!!!
Is this an RNA virus? Did you start sample prep from RNA?
I believe it is a DNA virus, as sample prep was done using a complete DNA purification kit.
Is the drop in coverage sudden or smooth? Never mind that: if it is just the last 15 bases, it is a sudden drop. In addition to genomax's suggestions, it could also be an artifact. Are there similar viruses with a sequenced genome? Did you BLAST your whole virus sequence on NCBI?
Two things to consider:
Thank you for the response. I had done two-level trimming (trim_galore and sickle), which resulted in the loss of reads aligning at the end of the target sequence. I re-trimmed the raw reads with trim_galore only and got less than 10X coverage at the ends of the target sequences. The coverage analysis still shows 0X coverage at base positions 2749-2750.
How can I align partial reads only to the ends of target reference sequence to get more than 0X coverage? Will this approach work and how can I use bowtie2 to do this?
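As a starting point for the bowtie2 part of this question, here is a minimal sketch that assembles a more sensitive local-mode command. Local mode already allows read ends that overhang the reference to be soft-clipped, so this only swaps the default preset for `--very-sensitive-local`; whether that recovers the last two bases is not guaranteed. The index and file names are hypothetical placeholders.

```python
import shlex


def bowtie2_local_command(index, reads_1, reads_2, out_sam, threads=4):
    """Build a bowtie2 paired-end command using the most sensitive local preset."""
    return [
        "bowtie2",
        "--very-sensitive-local",  # local alignment, most sensitive built-in preset
        "-p", str(threads),        # number of alignment threads
        "-x", index,               # basename of the bowtie2 index
        "-1", reads_1,             # mate 1 reads
        "-2", reads_2,             # mate 2 reads
        "-S", out_sam,             # SAM output file
    ]


# Hypothetical file names, for illustration only.
cmd = bowtie2_local_command(
    "viral_ref", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample.sam"
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)`.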
Is the genome of your virus circular or linear?
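The topology matters because, if the genome is circular, reads that span the origin cannot align contiguously to a linearized reference. One common workaround is sketched below, assuming a circular genome: append the first bases of the reference to its end before indexing, then fold positions past the original length back onto the start. The function names and the toy sequence are hypothetical.

```python
def extend_circular_reference(seq, overlap):
    """Append the first `overlap` bases to the end of a circular sequence
    so that reads spanning the origin can align contiguously."""
    if not 0 < overlap <= len(seq):
        raise ValueError("overlap must be between 1 and the sequence length")
    return seq + seq[:overlap]


def fold_position(pos, genome_len):
    """Map a 1-based position on the extended reference back to the
    original circular coordinates."""
    return (pos - 1) % genome_len + 1


toy_genome = "ACGTACGGTT"  # toy sequence, not real data
extended = extend_circular_reference(toy_genome, 4)
print(extended)
print(fold_position(11, len(toy_genome)))
```

For 150 bp reads, an overlap of about one read length is a reasonable choice. Coverage in the appended region then has to be added to the coverage of the first bases when summarizing depth.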
How are you looking at the coverage?
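How coverage is computed can change the picture, so it helps to be explicit. A minimal sketch, assuming per-base depth comes from `samtools depth -a` (tab-separated reference name, 1-based position, and depth, with `-a` including zero-depth positions): the helper below lists positions under a threshold. The function name and the example depth values are invented for illustration and are not results from this thread.

```python
def low_coverage_positions(depth_lines, threshold=10):
    """Return (reference, position, depth) for every position whose depth
    is below `threshold`, given lines in `samtools depth` format."""
    low = []
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chrom, pos, depth = line.split("\t")[:3]
        if int(depth) < threshold:
            low.append((chrom, int(pos), int(depth)))
    return low


# Invented toy input in samtools depth format.
example = [
    "virus\t2734\t15210",
    "virus\t2735\t9",
    "virus\t2749\t0",
    "virus\t2750\t0",
]
print(low_coverage_positions(example))
```

With real data the lines would come from a file written by `samtools depth -a sample.sorted.bam > depth.txt` (file names hypothetical). Comparing that output with a genome browser view of the soft-clipped reads at the ends can help tell a true 0X from a reporting artifact.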
kspata: Have you considered the possibility that your viral strain has a small deletion at those two base pairs? In that case, what you are seeing is real for your strain.