10.5 years ago
cdwilliam524
Could anyone tell me how to remove sequencing gaps from a *.sam file?
I use bwa to map the reference genome (fasta) to the metatranscriptome files (fastq), and then use "bwa sampe" to combine all the *.fasta, *.sai, and *.fastq to get the *.sam file. But the *.sam file contains many gaps between contigs. How could I remove them?
Thanks in advance!
I assume you mean that you mapped the short reads (fastq) to the reference genome (fasta), rather than the other way around.
Can you elaborate a bit on what you mean by gaps in the SAM file? I can think of a couple possible ways to interpret that.
Hi Devon,
I am new to bioinformatics, so it should be the way you stated. Sorry for the confusion.
So in the *.sam file I have sequences like NNNNNNNNNCNNNNNNNTNNNNNNAGNNNNCT... I assume the Ns are sequencing gaps. I want to remove them, and I also want to know the start/end positions of the mapped contigs (I am not sure if I worded it right).
Thanks
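For the start/end part of the question, here is a minimal Python sketch, not a definitive answer. It assumes a standard tab-separated SAM file, where column 4 (POS) is the 1-based leftmost mapped position, column 6 is the CIGAR string, and column 10 (SEQ) is the read sequence. The end position is derived from POS plus the number of reference bases the CIGAR consumes. The function names `alignment_span` and `n_fraction` and the example record are made up for illustration. The second function only reports how much of a read is N, since deleting Ns from SEQ directly would leave it inconsistent with the CIGAR and quality columns.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def alignment_span(sam_line):
    """Return (read, reference, start, end), 1-based inclusive,
    or None for header lines and unmapped reads."""
    if sam_line.startswith("@"):
        return None  # header line
    fields = sam_line.rstrip("\n").split("\t")
    qname, flag, rname = fields[0], int(fields[1]), fields[2]
    pos, cigar = int(fields[3]), fields[5]
    if flag & 4 or cigar == "*":
        return None  # read is unmapped
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar)
                  if op in REF_CONSUMING)
    return qname, rname, pos, pos + ref_len - 1


def n_fraction(sam_line):
    """Fraction of N bases in the SEQ column of one alignment line."""
    seq = sam_line.rstrip("\n").split("\t")[9]
    if seq == "*" or not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


# Hypothetical record, for illustration only.
example = "read1\t0\tcontig1\t100\t37\t5M2D3M\t*\t0\t0\tACGTNNNN\tIIIIIIII"
print(alignment_span(example))
print(n_fraction(example))
```

To run it over a whole file, loop over the lines of the *.sam file and skip the `None` results. Reads with a high N fraction could then be filtered out before or after mapping, rather than editing the Ns out of the SAM records themselves.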