I want to extract transcriptomic reads from a BAM file. I do this using featureCounts with the Gencode GTF annotation, which includes annotations from both ENSEMBL and HAVANA. Will this bias my results? For example, HIST1H2BK has two exons annotated by ENSEMBL and one exon annotated by HAVANA, as follows:
1) chr6 ENSEMBL exon 27114188 27114619
2) chr6 ENSEMBL exon 27106073 27106460
3) chr6 HAVANA exon 27114197 27114577
Exon 1 and exon 3 overlap almost entirely, so they could be the same exon in different annotation databases. featureCounts counted reads located in all 3 exons, which means the overlapping exon is counted twice:
HIST1H2BK chr6;chr6;chr6 27106073;27114188;27114197 27106460;27114619;27114577 -;-;-
Will this bias the results? Should I use only one of the two sources, ENSEMBL or HAVANA?
Thanks! Leo
See the "Overlap between reads and features" section of the featureCounts webpage: http://bioinf.wehi.edu.au/featureCounts/
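To see how much of the gene the near-duplicate exon actually adds, the three exons from the question can be collapsed into their union. This is only a minimal Python sketch: `merge_intervals` is a hypothetical helper written for illustration, not part of featureCounts, and it does not reproduce how featureCounts assigns reads. The coordinates are the ones given in the question and are treated as 1-based, inclusive (the GTF convention).

```python
def merge_intervals(intervals):
    """Collapse overlapping (start, end) intervals into their union."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# HIST1H2BK exons on chr6, as listed in the question.
exons = [
    (27114188, 27114619),  # ENSEMBL
    (27106073, 27106460),  # ENSEMBL
    (27114197, 27114577),  # HAVANA
]

merged = merge_intervals(exons)

# Lengths use end - start + 1 because the coordinates are inclusive.
total_raw = sum(end - start + 1 for start, end in exons)
total_merged = sum(end - start + 1 for start, end in merged)

print(merged)
print(total_raw, total_merged)
```

The HAVANA exon lies entirely inside the first ENSEMBL exon, so the union contains only two intervals. Summing the three exons separately would overstate the exonic length of the gene.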