I am developing a new NGS amplicon-based test.
I need to mark the primers and I am using Samtools ampliconclip tool. It is working ok but I have an issue when using the -strand option.
The following image shows amplicons overolapping one exon of the CEBPA gene
Yellow squares are my primers soft-clipped properly. The two red squares are the problem. I do not want to soft-clip those two and I am not sure why this is happening because I am using the option --strand. See the full code and bed file below.
apptainer exec "$samtoolspicardbwa" samtools ampliconclip \
--both-ends \
--soft-clip \
--fail \
--strand \
-b testing_overlapping.bed \
RACP1-4poolv7anddirect-A_mapped_sorted_IRealigned.bam > with_strand.bam
chr19 33301308 33301331 CEBPA_05_F_P5 . +
chr19 33301557 33301577 CEBPA_04_F_P5 . +
chr19 33301579 33301599 CEBPA_05_R_P5 . -
chr19 33301648 33301668 CEBPA_03_F_P5 . +
chr19 33301785 33301805 CEBPA_04_R_P5 . -
chr19 33301899 33301919 CEBPA_02_F_P5 . +
chr19 33301927 33301946 CEBPA_03_R_P5 . -
chr19 33302153 33302173 CEBPA_01_F_P5 . +
chr19 33302175 33302195 CEBPA_02_R_P5 . -
chr19 33302420 33302440 CEBPA_01_R_P5 . -
My question is: Is this output okay or I am doing something wrong or I am misunderstanding something? I thought that the option --strand would only allow soft clip when the strand of the reads matches the strand of the soft clipped region in my bed file. Is this correct?? This would avoid the unwanted soft-clipping that I have in those two reads (marked with the red square).
I have added an image from IGV marking the region I do not want to soft-clipped (the 1786-1798 -)
If I remove the --strand option the result is exactly the same.
EDIT:
I have filtered out forward and reverse reads in different files, then I have soft clipped the reads and I have merged the two new files in one and I have got the desired output
Can this be done in one step with Samtools Soft-clipped?
Thank you so much. I will try this and I will come back with results.