I have the following BAM file and a annotation file. The BAM is a paired end alignment. What I want to do is to calculate reads (as fragment) count in I can do it with the following three methods.
- Method1: bedtools intersect
- Method2: bedtools multicov
- Method3: featureCounts.
Below is the code
#!/bin/bash
INPUT_BAM=my_paired_end_alignment.bam
ANNNOT_BED=peak_annot.bed
ANNNOT_GTF=peak_annot.gtf
# Temporary file
ALIGNMENT_BED=my_paired_end_alignment.bed
#-----------
# Method 1
#-----------
FINAL_COUNT_METHOD1=output_method1.txt
# convert BAM to BED
bedtools bamtobed -bedpe -i $INPUT_BAM | cut -f1,2,6,7,8,9 > $ALIGNMENT_BED
# Count fragments from BAM on ANNOT_BED
bedtools intersect -a $ANNOT_BED -b $ALIGNMENT_BED -c > $FINAL_COUNT_METHOD1
#-----------
# Method 2
#-----------
FINAL_COUNT_METHOD2=output_method2.txt
bedtools multicov -bams $INPUT_BAM -bed $ANNOT_BED > $FINAL_COUNT_METHOD2
#-----------
# Method 3
#-----------
FINAL_COUNT_METHOD3=output_method3.txt
featureCounts -a $ANNOT_GTF -o $FINAL_COUNT_METHOD3 $INPUT_BAM
The result given by Method1 is so different from Method2 (totally uncorrelated), whereas Method2 is highly correlated with Method3.
My question Is Method1 the right way to count feature? If not why? Should we use Method2 or Method3 instead?
Speed is less of a concern at the moment, but rather the correctness of Method1. I need to be convinced if Method1 is correct or not.
@ATPoint Thanks for your reply. Speed is less of a concern at the moment, but rather the correctness of Method1. I tried using
cut -f1,2,6,7,8,9
. But the Method1 is totally uncorrelated (different) from Method2 and Method3. I need to be convinced why Method1 is incorrect. Your enlightenment will be greatly appreciated.