Extract the mapped contig sequences from a SAM/BAM file
7.7 years ago
BDK_compbio ▴ 140

I have a SAM/BAM file that contains the mapping of long reads to contigs assembled from short reads. Because the long-read coverage is low, 80% of the contigs have no read mapped to them. I would like to extract the contigs (with their nucleotide sequences) that have at least one mapped read. How can I do this quickly using my SAM/BAM files? I would also like to extract the sequence of a particular contig together with the sequences of the long reads mapped to it. Any help would be appreciated.

samtools bwa BAM SAM • 6.0k views
7.7 years ago
vmicrobio ▴ 290

You can extract the mapped reads with samtools, convert them to FASTQ with bamtools, and then use sed to turn the FASTQ into FASTA (the 1~4 and 2~4 addresses are a GNU sed extension):

samtools view -F4 -b in.bam > mapped-out.bam

bamtools convert -in mapped-out.bam -format fastq > mapped-out.fastq

sed -n '1~4s/^@/>/p;2~4p' mapped-out.fastq > mapped-out.fasta
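
Note that the commands above give you the mapped reads, not the contigs they map to. For the contig side of the question, here is a minimal sketch that works on SAM text, so a BAM would first be streamed through samtools view. The function names and all file names (in.bam, contigs.fasta, contig_42) are made up for illustration, and it assumes the contig names in the FASTA headers match the reference names in the SAM/BAM.

```shell
# List contig (reference) names with at least one mapped read.
# Reads SAM text on stdin; FLAG bit 0x4 set means the read is unmapped.
mapped_contigs() {
    awk -F'\t' '!/^@/ && int($2 / 4) % 2 == 0 && $3 != "*" { print $3 }' | sort -u
}

# Print the FASTA records whose name (first word of the header) is listed.
# $1 = file with one contig name per line, $2 = contigs FASTA
extract_fasta() {
    awk 'FILENAME == ARGV[1] { keep[$1] = 1; next }
         /^>/ { printing = (substr($1, 2) in keep) }
         printing' "$1" "$2"
}

# Print the reads aligned to one contig as FASTA, reading SAM text on stdin.
# SEQ is taken as stored in the record (reverse-complemented for
# reverse-strand alignments); records without a stored SEQ are skipped.
reads_on_contig() {
    awk -F'\t' -v contig="$1" '
        !/^@/ && int($2 / 4) % 2 == 0 && $3 == contig && $10 != "*" {
            print ">" $1
            print $10
        }'
}

# Hypothetical usage (file and contig names are placeholders):
#   samtools view in.bam | mapped_contigs > mapped_contigs.txt
#   extract_fasta mapped_contigs.txt contigs.fasta > mapped_contigs.fasta
#   samtools view in.bam | reads_on_contig contig_42 > contig_42_reads.fasta
```

A read with supplementary or secondary alignments can show up more than once in the output of reads_on_contig, so you may want to deduplicate by read name afterwards.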