Hi guys, I'm a newbie in the RNA-seq world. I'm analyzing some RNA-seq data (HiSeq), following the best-practice guidelines available online from different sources. Briefly: I aligned the reads to the reference genome, did the merge, removed duplicates, sorted and indexed, and finally used bedtools multicov to get read counts per RefSeq entry. My point is that I would like a gene-level expression measure, but what I ended up with are read counts per transcript variant. In other words, I have multiple transcripts per gene, each with its own read count, but I would like a single read count per gene. How do you deal with this issue?
Thanks in advance
B.
What do you want to use the counts for? Normally you'd use featureCounts for that (and not remove duplicates, which is generally a bad idea in RNA-seq).
I would only like to know whether the genes are differentially expressed. I don't want to know whether a specific splicing variant is differentially expressed. Only the gene.
As Devon wrote, use featureCounts (or htseq-count or ...), which you can use to count per gene.
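To illustrate why counting per gene is not the same as adding up the per-transcript counts, here is a minimal Python sketch. It is a toy model, not how featureCounts or htseq-count are actually implemented: the gene, transcript IDs, coordinates, and reads are all made up, everything sits on one chromosome, strand is ignored, and reads overlapping more than one gene are not handled. The idea is to merge all exons of a gene into one set of intervals and count each read at most once per gene.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals into a sorted union."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def gene_level_counts(exons, reads):
    """Count reads per gene, each read at most once per gene.

    exons: iterable of (gene_id, transcript_id, start, end)
    reads: iterable of (start, end)
    """
    reads = list(reads)  # iterated once per gene
    exons_by_gene = defaultdict(list)
    for gene_id, _transcript_id, start, end in exons:
        exons_by_gene[gene_id].append((start, end))

    counts = {}
    for gene_id, intervals in exons_by_gene.items():
        union = merge_intervals(intervals)
        counts[gene_id] = sum(
            1
            for r_start, r_end in reads
            if any(r_start < e and s < r_end for s, e in union)
        )
    return counts


# Hypothetical toy data: one gene with two transcripts sharing their first exon.
EXONS = [
    ("GENE_A", "tx1", 100, 200),
    ("GENE_A", "tx1", 300, 400),
    ("GENE_A", "tx2", 100, 200),
    ("GENE_A", "tx2", 500, 600),
]
READS = [(120, 170), (150, 220), (310, 360), (520, 570), (700, 750)]

print(gene_level_counts(EXONS, READS))
```

In this toy data, tx1 and tx2 would each get 3 reads if counted separately, so summing the transcript rows gives 6, while only 4 distinct reads overlap the gene. The two reads in the shared first exon get counted twice in the sum, which is why the counts should be produced at the gene level in the first place rather than collapsed afterwards.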