Hi there,
I have a contig that I’d like to remove from a BAM file. I have tried removing it using Picard and samtools, but the darned thing is still in there.
This is what I’ve tried:
Step 1: create list of contigs I’d like to keep/remove
Keep:
samtools idxstats sampleID_markedup.bam | cut -f 1 | grep -v name_of_contig > toKeep.txt
Remove:
echo name_of_contig > offendingContig.txt
Step 2: Remove offending contig
Picard
Method 1: keep all but offending contig
java -jar $picard FilterSamReads I=sampleID_markedup.bam O=sampleID_trim.bam READ_LIST_FILE=toKeep.txt FILTER=includeReadList
Method 2: explicitly exclude contig
java -jar $picard FilterSamReads I=sampleID_markedup.bam O=sampleID_trim.bam READ_LIST_FILE=offendingContig.txt FILTER=excludeReadList
Samtools
samtools view -b -R toKeep.txt sampleID_markedup.bam > sampleID_trim.bam
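One thing worth checking in the samtools attempt: in samtools view, -R takes a file of read-group IDs rather than reference names, whereas regions are given on the command line after the BAM file. A minimal sketch under that reading, assuming the BAM is coordinate-sorted and indexed. keep_contigs and drop_contig are hypothetical helper names, not part of samtools:

```shell
# Print every reference name from idxstats-style input (name, length,
# mapped, unmapped; tab-separated) except the one given as $1.
# The "*" row that idxstats prints for unplaced reads is skipped as well.
# The comparison is an exact match, so name_of_contig_2 is not dropped.
keep_contigs() {
    awk -F '\t' -v drop="$1" '$1 != drop && $1 != "*" { print $1 }'
}

# Hypothetical wrapper: drop_contig IN.bam OUT.bam CONTIG
# Needs an index (IN.bam.bai); regions are positional arguments, not -R.
# Suited to references with a modest number of contigs, because xargs
# may split a very long argument list into several samtools calls.
drop_contig() {
    samtools idxstats "$1" | keep_contigs "$3" \
        | xargs samtools view -b -o "$2" "$1"
}
```

With the file names above this would be called as: drop_contig sampleID_markedup.bam sampleID_trim.bam name_of_contig. Selecting by region also leaves out reads that are not placed on any contig.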
None of these methods gives an error message (exit status zero and nothing written to the error log), and each one generates a new .bam file, but the contig is always still in there. I check this using:
samtools view sampleID_trim.bam name_of_contig | head -1
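A per-contig read count can be easier to interpret than the first record. As a sketch (mapped_reads_on is a hypothetical helper, and it assumes the trimmed BAM has been indexed), the third column of samtools idxstats gives the number of mapped reads on each reference:

```shell
# Read idxstats-style rows on stdin and print the mapped-read count
# (column 3) for the contig named in $1; prints nothing if it is absent.
mapped_reads_on() {
    awk -F '\t' -v name="$1" '$1 == name { print $3 }'
}

# Example (requires samtools; the index is created first):
#   samtools index sampleID_trim.bam
#   samtools idxstats sampleID_trim.bam | mapped_reads_on name_of_contig
```

A count of 0 would mean that no reads remain on the contig even though it is still listed in the header, since filtering reads with samtools view does not rewrite the @SQ lines.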
Could it be a problem that the .bam file has already been marked for duplicates?
Any help would be greatly appreciated.
Kind regards
Hi Pierre, sorry for the late reply; I've been off work for a few days.
Thank you so much, this works perfectly!
Cool. Please validate my answer (green tick on the left).
I tried to do that yesterday but the green tick appears to be missing!
Enlarge your screen: https://imgur.com/8nvX23z
I've tried this, but I get an error saying that the command list is too long.
The BAM file is sample_4.sorted.bam, and the contig I need to remove from it is named "prok_GCA_000243075.1_CP003171.1_from_11_to_747_total_737".
What am I doing wrong?
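If the error is the shell's "Argument list too long" (an assumption, since the exact command and message are not shown), one way to avoid passing thousands of contig names on the command line is to write them to a BED file and use samtools view -L. A sketch, where idxstats_to_bed, keep.bed and sample_4.trimmed.bam are hypothetical names chosen for illustration:

```shell
# Convert idxstats-style rows (name, length, mapped, unmapped) into BED
# intervals covering each whole contig, skipping the contig in $1 and
# the "*" row for unplaced reads.
idxstats_to_bed() {
    awk -F '\t' -v OFS='\t' -v drop="$1" \
        '$1 != drop && $1 != "*" { print $1, 0, $2 }'
}

# Example (requires samtools):
#   samtools idxstats sample_4.sorted.bam \
#     | idxstats_to_bed "prok_GCA_000243075.1_CP003171.1_from_11_to_747_total_737" \
#     > keep.bed
#   samtools view -b -L keep.bed -o sample_4.trimmed.bam sample_4.sorted.bam
```

The list of contigs then travels in a file rather than as arguments, so its length no longer matters to the shell.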