filtering bam file
1
0
Entering edit mode
9.1 years ago
schelarina ▴ 50

Hello,

I want to filter a bam file based on the number of reads mapped to each position in the genome.

For example I want to keep only the positions that are covered by >=40 stacked reads (with different length) but starting at the same position

Thanks

RNA-Seq next-gen alignment • 2.5k views
ADD COMMENT
0
Entering edit mode

I tried

samtools view -h -b file.bam | awk '{a[$4]++}END{for (i in a)if (a[I]>=40)print $0;}' > file.bam
ADD REPLY
0
Entering edit mode
9.1 years ago
michael.ante ★ 3.9k

Hi,

you can use the bedTools' genomeCoverageBed bamToBed and mergeBed to get all start-positions' coverage; filter all positions with coverage >= 40, and build a bed file. This bed-file can be used as input for samtools view (-L bedfile) intersectBed with the first bed-file and extract the read-names.

Cheers,

Michael

ADD COMMENT

Login before adding your answer.

Traffic: 1993 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6