Filter out Bam not overlapping with Bed File?
1
0
Entering edit mode
2.1 years ago
Eliveri ▴ 350

I have a Bam file for which I want to REMOVE reads overlapping with a given Bed file (so keep all other reads/alignments).

I know the parameter option -L = only output alignments overlapping the input bed file, however I want to output alignments NOT overlapping with the input bed file.

Is this possible?

I tried the following with the -U parameters but it has not worked. I am currently trying samtools view

samtools view -b -h original.bam -T ref_dir/genome.fa -U ref_dir/human.bed > new.bam

samtools view documentation: http://www.htslib.org/doc/samtools-view.html

bed bam samtools • 717 views
ADD COMMENT
3
Entering edit mode
2.1 years ago
samtools view -L in.bed -O BAM --unoutput output.bam input.bam > /dev/null
ADD COMMENT
0
Entering edit mode

Is there any way to also include unmapped mates of mapped reads and vice versa? So that singleton reads will not be discarded because of missing mates? Thank you.

After I obtained the output.bam. I sorted the output.bam and convert to fastq (example below):

However 663286 singletons were discarded

samtools sort -n -m 5G -@ 2 output.bam -o output.sorted.bam

samtools fastq -@ 8 output.sorted.bam \
  -1 output_R1.fastq.gz \
  -2 output_R2.fastq.gz \
  -0 /dev/null -s /dev/null -n

output.bam info:

> samtools flagstat output.bam
38099031 + 0 in total (QC-passed reads + QC-failed reads)
35045458 + 0 primary
3053573 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
0 + 0 primary duplicates
38048117 + 0 mapped (99.87% : N/A)
34994544 + 0 primary mapped (99.85% : N/A)
35045458 + 0 paired in sequencing
17514426 + 0 read1
17531032 + 0 read2
28367540 + 0 properly paired (80.94% : N/A)
34971093 + 0 with itself and mate mapped
23451 + 0 singletons (0.07% : N/A)
3005801 + 0 with mate mapped to a different chr
2859743 + 0 with mate mapped to a different chr (mapQ>=5)
ADD REPLY

Login before adding your answer.

Traffic: 1729 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6