How to extract unmapped reads when you have paired-end reads (FASTQ) and a BAM file of mapped reads?
2.1 years ago
O.rka ▴ 740

I have the following files:

  • sampleA_1.fastq.gz - Forward reads
  • sampleA_2.fastq.gz - Reverse reads
  • mapped.sorted.bam - Sorted BAM file of mapped reads (doesn't contain unmapped reads)

How can I use these 3 files to get the following files:

  • sampleA_1.unmapped.fastq.gz
  • sampleA_2.unmapped.fastq.gz

Can I use anything from the following tools I have installed in my current environment?

  • bbmap (and all the software in this suite)
  • samtools
  • seqkit
fastq bam sam reads • 1.6k views
2.1 years ago
GenoMax 148k

If your BAM file contains the unmapped reads (not all aligners include them), then you can use either reformat.sh (from BBMap) or samtools to extract them (search for samtools threads here or look at the inline help for reformat.sh).
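
For this first case, a minimal samtools sketch might look like the following. It assumes a BAM that still carries the unmapped records; the file name all_reads.bam and the helper name unmapped_pairs_from_bam are made up for illustration. The -f 12 filter keeps only records where both the read and its mate are flagged as unmapped.

```shell
# Sketch only: assumes the BAM still contains unmapped records.
# unmapped_pairs_from_bam is an illustrative helper name, not a samtools command.
unmapped_pairs_from_bam() {
    bam=$1      # e.g. all_reads.bam (hypothetical file name)
    prefix=$2   # e.g. sampleA

    # collate puts mates next to each other; fastq -f 12 keeps records where
    # both the read (0x4) and its mate (0x8) are unmapped.
    samtools collate -u -O "$bam" |
        samtools fastq -f 12 \
            -1 "${prefix}_1.unmapped.fastq.gz" \
            -2 "${prefix}_2.unmapped.fastq.gz" \
            -0 /dev/null -s /dev/null -n -
}

# Usage (not run here): unmapped_pairs_from_bam all_reads.bam sampleA
```

Pairs in which only one mate is unmapped are not captured by -f 12; a different flag filter would be needed if those are wanted as well.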

If your BAM file does not contain unmapped reads, then you will need to get the read headers of the mapped reads from your BAM and then use filterbyname.sh to exclude those names from your original data files, which leaves the unmapped reads.
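
The two steps for this second case can be sketched roughly as below. This is an illustration rather than a tested recipe: the helper name extract_unmapped_pairs is made up, the file names follow the question, and it assumes the FASTQ read names match the BAM QNAME column. If the FASTQ headers carry extra text after the read name, check the inline help of filterbyname.sh for its name-matching options.

```shell
# Sketch only: assumes FASTQ read names match the BAM QNAME column.
# extract_unmapped_pairs is an illustrative helper name.
extract_unmapped_pairs() {
    bam=$1      # BAM containing only mapped reads, e.g. mapped.sorted.bam
    prefix=$2   # FASTQ prefix, e.g. sampleA

    # 1. Unique names of mapped reads (column 1 of the SAM output, no leading @).
    samtools view "$bam" | cut -f1 | sort -u > "${prefix}.mapped.list"

    # 2. Drop every pair whose name is in the list; include=f means "exclude".
    filterbyname.sh \
        in="${prefix}_1.fastq.gz" in2="${prefix}_2.fastq.gz" \
        out="${prefix}_1.unmapped.fastq.gz" out2="${prefix}_2.unmapped.fastq.gz" \
        names="${prefix}.mapped.list" include=f
}

# Usage (not run here): extract_unmapped_pairs mapped.sorted.bam sampleA
```

Because both mates share one name, a pair with only one mapped mate is also removed, so the output holds pairs in which neither mate mapped.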


Damn, I love bbtools. They always have whatever weird processing I'm trying to do already baked in. Thanks for responding.


When you create the names file (one header per line), be sure to remove the @ symbol at the beginning of each FASTQ header.


Yeah, I used the following and it worked like a charm:

    samtools view mapped.sorted.bam | cut -f1 | sort -u > reads.mapped.list
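
Since seqkit is also among the tools listed in the question, the same name list could in principle be applied with seqkit grep instead of filterbyname.sh. This was not tried in this thread, so treat it as a sketch: the helper name filter_unmapped_with_seqkit is made up, and it assumes both mates carry an identical read ID that matches the names in reads.mapped.list.

```shell
# Sketch only: illustrative helper, assumes mates share an identical read ID.
filter_unmapped_with_seqkit() {
    names=$1    # e.g. reads.mapped.list (one read ID per line, no leading @)
    prefix=$2   # e.g. sampleA

    for mate in 1 2; do
        # -f: read IDs from a file; -v: invert, i.e. keep reads NOT in the list.
        seqkit grep -v -f "$names" \
            -o "${prefix}_${mate}.unmapped.fastq.gz" \
            "${prefix}_${mate}.fastq.gz"
    done
}

# Usage (not run here): filter_unmapped_with_seqkit reads.mapped.list sampleA
```

Each file is filtered independently and in its original order, so the two outputs stay in sync only as long as the ID assumption holds.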
