Extract soft-clipped reads from BWA-generated SAM file
8.3 years ago
ThePresident ▴ 180

Hi,

I aligned Illumina-generated, paired-end reads to a reference sequence using bwa mem. Now I would like to extract the soft-clipped reads in order to determine the boundaries of some potential structural variants. I used the samblaster tool with the following command:

bwa mem index R1.fastq R2.fastq | samblaster -u clipped.sam | samtools view -Sb - > clipped.bam | samtools sort - clipped_sorted.bam

However, I get a lot of supposedly soft-clipped reads. I wasn't expecting such a large proportion of soft clipping, unless this is common with bwa mem?

Here is the output of my clipped SAM file:

@D00780:28:HFJLLBCXX:1:1103:1175:2055_1
GCCATCTTTTCACTACTTGCTCCATATTTTTTGTCTGATTCGGTTGTGTTACTTGAAATGGCATTTGAGTAGTGAATACTTGGGTAGTCGATTCCTAGACCATTTAGGCTGTCTCTTATACACATCTCCGAGCCCACGAGACTCCTGAGCA
+
DD@DDDHHIIIHIIEHHIIIHFHIIIIGHIIGHIIIIGHHHIHHIHHHHIIIIIFGHHIHHIIHHEIHHIIIIIIIIHGGIHIFHHDHIIHIHIIIIHIGIDGHHIHHHEHHIHHIGHHHHHIIIIIIIDDHHHHHIHIHHGHIIIHHFHC
@D00780:28:HFJLLBCXX:1:1103:1175:2055_2
ACGCGTAAGATCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGCCATCTTTTCACTACTTGCTCCATATTTTTTGTCTGATTCGGTTGTGTTACTTGAAATGGCATTTGAGTAGTGAATACTTGGGTAGTCGATTCCTAGACCATTTAGG
+
.CHEHF@<D<GHGIIIIHIIIIIIHIHIIIIIHHGHHHHHHHHDEF@EC<IIHHDIHHIIIHHHHHHIIIIIIIIIHIIHHIHIIHIIIIGIIIIIIIHIIIIIIHIIIHGIHHIIIIIHHHIHIIIIIIHHIIHHEIHIIHGHHIDDDDD
@D00780:28:HFJLLBCXX:1:1103:1229:2156_1
soft-clipping • 3.9k views

That's the FASTQ, not SAM.
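
For comparison, here is a rough sketch of how soft-clipped alignments could be pulled from an actual SAM stream. It relies only on the SAM convention that column 6 holds the CIGAR string and that soft clipping is encoded as the `S` operation. The function name `filter_soft_clipped` and the toy records are made up for illustration and are not taken from the original post.

```shell
#!/bin/sh
# Keep SAM header lines (starting with "@") and any alignment whose
# CIGAR string (column 6) contains a soft-clip operation (S).
# filter_soft_clipped is a hypothetical helper name.
filter_soft_clipped() {
    awk -F '\t' '/^@/ || $6 ~ /S/'
}

# Toy SAM input: made-up records, for illustration only.
toy_sam() {
    printf '@SQ\tSN:chr1\tLN:1000\n'
    printf 'read1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*\n'
    printf 'read2\t0\tchr1\t200\t60\t30M20S\t*\t0\t0\t*\t*\n'
    printf 'read3\t16\tchr1\t300\t60\t5S45M\t*\t0\t0\t*\t*\n'
}

# Prints the header plus read2 and read3; read1 (50M) is dropped.
toy_sam | filter_soft_clipped
```

With real data, the same filter could presumably be fed from something like `samtools view -h aln.bam | filter_soft_clipped`, where `aln.bam` is a placeholder file name. Note that this keeps any read with at least one soft-clipped base, so a minimum clip length would likely be needed before using the result to look for breakpoints.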
