Hi
I used featureCounts to generate a count file from paired-end (PE) WTS data, using aligned, sorted BAM files. The featureCounts output looks like this:
# Program:featureCounts v1.6.3; Command:"featureCounts" "-T" "4" "-s" "2" "-a" "/Tools/hg38.refGene.gtf" "-o" "6_aligned_sorted_duprm.bam"
Geneid Chr Start End Strand Length 6_aligned_sorted_duprm.bam
DDX11L1 chr1;chr1;chr1 11874;12613;13221 12227;12721;14409 +;+;+ 1652 0
WASH7P chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1 14362;14970;15796;16607;16858;17233;17606;17915;18268;24738;29321 14829;15038;15947;16765;17055;17368;17742;18061;18366;24891;29370 -;-;-;-;-;-;-;-;-;-;- 1769 706
MIR6859-1 chr1;chr1;chr16;chr15 17369;187891;17052;101973524 17436;187958;17119;101973591 -;-;-;+ 272 0
MIR1302-11 chr1;chr19;chr9;chr15 30366;71973;30144;101960459 30503;72110;30281;101960596 +;+;+;- 552 0
FAM138A chr1;chr1;chr1;chr19;chr19;chr19;chr9;chr9;chr9 34611;35277;35721;76220;76886;77330;34394;35060;35504 35174;35481;36081;76783;77090;77690;34957;35264;35864 -;-;-;-;-;-;-;-;- 3390 0
As you can see, for a single gene the Chr column lists several different chromosomal locations. Why is this happening? Is this a fault in how I ran featureCounts? How can I solve it?
Thank you.
Regards,
Tanay
Hi GenoMax,
Thank you for looking into this. Take MIR6859-1, for example: its Chr column shows chr1, chr16 and chr15. How is this possible?
I have also used fastp to generate deduplicated raw fastq files. Should this be done? Also, is it important to count multi-mapping reads with featureCounts?
Regards, Tanay
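Since microRNAs are short, the same sequence can be annotated at multiple locations in the genome. Your annotation file must have it annotated that way. featureCounts groups all features that share a gene identifier into one gene, so every annotated location ends up in the same row.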
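If you want to see how many genes are affected, something like the Python sketch below could list every gene whose Chr column spans more than one chromosome. It assumes the usual tab-delimited featureCounts output (a `#` comment line, then a header row starting with `Geneid`); the function name `multi_chrom_genes` and the two sample rows (copied from the post above) are only for illustration.

```python
def multi_chrom_genes(lines):
    """Map gene ID -> sorted unique chromosomes, for genes on more than one chromosome."""
    flagged = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the "# Program:..." comment line.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Skip the column header row and anything malformed.
        if fields[0] == "Geneid" or len(fields) < 2:
            continue
        # The Chr column holds one entry per exon, separated by ';'.
        chroms = sorted(set(fields[1].split(";")))
        if len(chroms) > 1:
            flagged[fields[0]] = chroms
    return flagged


# Two rows taken from the output in the question, rebuilt as tab-delimited text.
example_rows = [
    "# Program:featureCounts v1.6.3",
    "\t".join(["Geneid", "Chr", "Start", "End", "Strand", "Length",
               "6_aligned_sorted_duprm.bam"]),
    "\t".join(["DDX11L1", "chr1;chr1;chr1", "11874;12613;13221",
               "12227;12721;14409", "+;+;+", "1652", "0"]),
    "\t".join(["MIR6859-1", "chr1;chr1;chr16;chr15",
               "17369;187891;17052;101973524",
               "17436;187958;17119;101973591", "-;-;-;+", "272", "0"]),
]

print(multi_chrom_genes(example_rows))
# {'MIR6859-1': ['chr1', 'chr15', 'chr16']}
```

On a real file you would pass the open file handle instead of `example_rows`, e.g. `multi_chrom_genes(open("counts.txt"))`, where `counts.txt` is a placeholder for your own output file.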
You should not deduplicate raw fastq files for RNA-seq data before alignment. People generally ignore multi-mapped reads. If you wish to use them, then use a tool like salmon, which uses statistical modeling to distribute multi-mapped reads.
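As a rough sketch of what a salmon run might look like, the Python snippet below only assembles the command line and prints it; it does not run anything. It assumes salmon's mapping-based mode, which works from the paired-end FASTQ files and a transcriptome index built beforehand with `salmon index`, not from the sorted BAM files. The index directory, FASTQ names and output directory are all hypothetical placeholders.

```python
import shlex


def salmon_quant_cmd(index_dir, reads_1, reads_2, out_dir, threads=4):
    """Build (but do not run) a paired-end `salmon quant` command."""
    return [
        "salmon", "quant",
        "-i", index_dir,      # index built earlier with `salmon index`
        "-l", "A",            # let salmon infer the library type
        "-1", reads_1,        # mate 1 FASTQ
        "-2", reads_2,        # mate 2 FASTQ
        "-p", str(threads),   # number of threads
        "-o", out_dir,        # output directory
    ]


# Hypothetical file names; substitute your own.
cmd = salmon_quant_cmd("salmon_index", "sample6_R1.fastq.gz",
                       "sample6_R2.fastq.gz", "sample6_quant")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Once the printed command looks right, you can paste it into a shell or hand the list to `subprocess.run(cmd, check=True)`.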