Hi guys, I'm trying to move from HG19 to HG38. While running featureCounts with Refseq*.gtf I see some differences between HG19 and HG38. I suppose this is expected, but I would like to know whether I'm doing something wrong, since I'm a newbie to RNA-Seq analysis with HG38. Here is some featureCounts output for one sample (single-end) as an example:
Sample1.
HG19:
Features : 460394
Meta-features : 24103
Chromosomes/contigs : 49
Total reads : 24944206
Successfully assigned reads : 11116093 (44.6%)
The same sample with HG38:
Features : 536042
Meta-features : 50502
Chromosomes/contigs : 256
Total reads : 27605790
Successfully assigned reads : 6766947 (24.5%)
The alignment was performed using STAR, and in both cases the percentage of uniquely mapped reads was around 93%, or at any rate comparable between HG19 and HG38. featureCounts was run as follows:
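library(Rsubread)  # provides featureCounts()
# Count single-end reads over exons, summarised per gene_id
Fcounts <- featureCounts(files=fileName,
    annot.ext="/.../hg38.refseq.gtf",
    isGTFAnnotationFile=TRUE, GTF.attrType="gene_id",
    GTF.featureType="exon", nthreads=4, minMQS=3, isPairedEnd=FALSE)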
Can anyone give me some feedback?
Thank you in advance
I changed your post from Job to Question.
At least if you run featureCounts on the command line, you get a file with summary metrics; if R produces that as well, then have a look at it. If R doesn't produce that file, then run featureCounts directly rather than via the R wrapper.
The stats are in Fcounts$stat
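A rough sketch of how that table could be read, in case it helps. It assumes that `Fcounts$stat` is a data frame with a `Status` column plus one numeric column per input file, which is how Rsubread returns it as far as I know. The helper name `stat_percent` is made up for this example, and the toy numbers are invented for illustration, not taken from the runs above.

```r
# Hypothetical helper: convert a featureCounts-style stat table into
# percentages of all counted reads, per input file.
# Assumes a "Status" column plus one numeric column per file.
stat_percent <- function(stat) {
  counts <- stat[, setdiff(names(stat), "Status"), drop = FALSE]
  # Divide each column by its own total, then scale to percent
  pct <- sweep(as.matrix(counts), 2, colSums(counts), "/") * 100
  data.frame(Status = stat$Status, round(pct, 1), check.names = FALSE)
}

# Toy input for illustration only (made-up numbers)
toy <- data.frame(
  Status = c("Assigned", "Unassigned_NoFeatures", "Unassigned_Ambiguity"),
  sample1.bam = c(60, 30, 10)
)
res <- stat_percent(toy)
print(res)

# With a real run this would be: stat_percent(Fcounts$stat)
```

Comparing the rows for the HG19 and HG38 runs side by side should show which unassigned category absorbed the reads that are no longer counted as assigned.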
How can the total number of reads be different if it is the same sample?
And the number of successfully assigned reads is different as well.