Subset Bam File According To List Of Contigs
1
2
Entering edit mode
11.1 years ago

Dear Biostar community,

I have a bwa mapping of metagenomic sequencing reads on assembled contigs. I also have a list of IDs of "interesting" contigs. Now I would like to subset the mapping (bam/sam file) to obtain a mapping (bam/sam file) that only contains my desired contigs. Do you know of any simpel way (samtools, picard tools, R, Scala/Java) to do so? Note please that creating a mapping with a reduced contig.fasta file is not an option for me.

Thank you in advance!

sam bam filter • 17k views
ADD COMMENT
9
Entering edit mode
11.1 years ago

samtools view -bh input.bam (should be sorted) contig_of_interest > output.bam

For example: samtools view -bh input.sorted.bam chr1 > output.bam . In case your contig of interest is named as chr1.

if you have more than one contig of interest then go for:

samtools view -h input.sorted.bam | awk '{if($3 == "chr1" || $3 == "chr2"){print $0}}' | samtools view -Sb - > output.bam. In case your contigs of interest are chr1 and chr2.

ADD COMMENT
5
Entering edit mode

Thank you both for your answers. :) May I add that you can just append the regions: So one could do samtools view -bh input.sorted.bam chr1 chr2 > output.bam or samtools view -bh input.sorted.bam cat list_of_contigs.txt | tr "\n" " " > output.bam

ADD REPLY
0
Entering edit mode

Thanks for the additional information. I didn't know that.

ADD REPLY
2
Entering edit mode

I know it's old but for new arrivers this is what worked for me : cat chromosome.list | tr "\n" " " | xargs samtools view -bh input.bam > output.bam

ADD REPLY
0
Entering edit mode

If you havent converted your sam files to bam then use :

egrep "^@HD|^@RG|^@PG|chr1|chr2" input.sam > output.sam (filtered)

ADD REPLY

Login before adding your answer.

Traffic: 2071 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6