21 months ago
oghzzang
My data were obtained from very old FFPE samples, so the insert size is 120 bp, but we performed 151 bp paired-end sequencing. As a result, the two reads of each pair overlap.
In this situation, is it necessary to clip the overlapping regions with a tool like clipoverlap before running Mutect2?
Also, if I want to identify integration sites (e.g. chr1:1111-chr19:8888), should the overlapping regions be clipped for that analysis as well?
Thanks!
If your reads are longer than the insert, you will pick up adapter content once the read extends past the "biological" insert. That should show up in FastQC as a warning, and you can simply remove it with something like fastp or cutadapt. From there on there is, IMO, nothing specific to be done. Just align as usual, then proceed with downstream analysis.
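To put numbers on this for the case in the question (151 bp reads, 120 bp inserts), here is a minimal arithmetic sketch. The function name `read_through` and its return keys are made up for illustration, and it assumes every fragment has exactly the stated insert length, which real FFPE libraries will not.

```python
def read_through(read_len: int, insert_len: int) -> dict:
    """Adapter read-through and mate overlap for one idealized fragment.

    Hypothetical helper: assumes both mates have the same length and the
    fragment is exactly insert_len bases long.
    """
    # Bases each read sequences past the end of the insert (adapter content).
    adapter_bases = max(0, read_len - insert_len)
    # Read length that remains once the adapter bases are trimmed off.
    usable_len = min(read_len, insert_len)
    # Bases of the fragment that are covered by both mates after trimming.
    overlap = max(0, 2 * usable_len - insert_len)
    return {
        "adapter_bases": adapter_bases,
        "usable_len": usable_len,
        "overlap": overlap,
    }


stats = read_through(read_len=151, insert_len=120)
print(stats)
# {'adapter_bases': 31, 'usable_len': 120, 'overlap': 120}
```

Under these assumptions each read carries 31 adapter bases, and after trimming both mates cover the whole 120 bp fragment, so every base is read twice from the same molecule.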
Thank you for your reply. I used fastp to remove adapter content. After removing the adapters, I mapped the data with BWA and observed the following (reads shown in red with a black border):
The depth may appear to be doubled. Don't I need to clip the overlapping regions?
The answer probably depends on what you are trying to do. If you know that your reads overlap extensively, you could simply use one of the reads, or merge the reads and use the merged representation. Mutect may recognize that the two reads come from the same fragment, though, and account for that.
Thank you for your reply. I want to find point mutations with Mutect2 and structural variations with delly2. I ran Mutect2 (v.4.1.9), but it did not seem to recognize and handle the overlapping paired reads. Here is an example from my sample.
If it had handled the overlapping pairs, it should have called the mutation with 2 altered alleles.
Many thanks.
Since both reads sequence the same fragment, it should only be counted once. You do not have two independent library fragments covering that base.
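The difference between counting reads and counting fragments can be sketched in a few lines. This is a toy illustration on hand-made intervals, not how any particular variant caller works; the function `depth_at`, the tuple layout, and the fragment names are all assumptions made for the example.

```python
def depth_at(position, alignments):
    """Return (read_depth, fragment_depth) at a 0-based position.

    alignments: iterable of (fragment_name, start, end) with half-open
    coordinates, one entry per aligned mate. Hypothetical layout.
    """
    read_depth = 0
    fragments = set()
    for name, start, end in alignments:
        if start <= position < end:
            read_depth += 1      # every overlapping mate counts here
            fragments.add(name)  # each fragment counts at most once here
    return read_depth, len(fragments)


# Three 120 bp fragments, each fully covered by both of its mates.
alignments = [
    ("frag1", 1000, 1120), ("frag1", 1000, 1120),
    ("frag2", 1050, 1170), ("frag2", 1050, 1170),
    ("frag3", 1110, 1230), ("frag3", 1110, 1230),
]

print(depth_at(1100, alignments))
# (4, 2)
```

At position 1100 four reads overlap, but they come from only two fragments, so the independent evidence at that base is 2, not 4.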
You're right. All the fragments in my data are too short, but I did 2 x 151 bp sequencing. I am looking for a way to recover from this.