My RNA-seq analysis pipeline is as follows: FastQC (read quality is good, with some overrepresented adaptor sequence) -> Trimmomatic (adaptor sequence trimmed; the QC report after trimming shows the overrepresented adaptors are gone) -> HISAT2 (aligned with a forward-stranded protocol) -> featureCounts
However, the strange thing is that if I use a forward-stranded protocol in featureCounts (on the forward-stranded alignment I generated with HISAT2), the assignment rate is 5 percent, and the majority of the unassigned reads are unassigned because they overlap no features. But if I use a reverse-stranded protocol in featureCounts, the assignment rate is 70 percent. Does this suggest my data is reverse stranded?
My library was prepared with the NEB directional RNA library prep kit, which claims to produce forward-stranded libraries, so I’m really confused by this result. How should I proceed with this data? Should I rerun the alignment with a reverse-stranded protocol and proceed with reverse-stranded assignment? Or should I keep the forward-stranded alignment and forward-stranded assignment, because that’s what the library is supposed to be?
Thank you so much for your help!
I'm having this exact same problem with a polyA RNA-seq library. Running hisat2 with --fr gives a high fraction of concordant single-mappers, whereas --rf gives essentially none. In featureCounts, though, -s 1 (forward stranded) gives essentially no assignment, while -s 2 (reverse stranded) gives a high assignment percentage. The descriptions of these parameters appear to suggest that hisat2's --fr should be equivalent to featureCounts' -s 1, but the metrics indicate otherwise.
Make of it what you will, but there is a table online that also indicates that what I've done (--fr + -s 1) should be right: https://rnabio.org/module-09-appendix/0009/12/01/StrandSettings/
Any suggestions would be greatly appreciated.
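For anyone wanting to reproduce the comparison described in both posts, here is a minimal sketch that computes the assignment rate from featureCounts summary output for the -s 1 and -s 2 runs. It assumes the tab-separated layout of the .summary file that featureCounts writes next to its count table (a "Status" header row followed by one row per status). The function names, the BAM file name, and the counts are all hypothetical, with the numbers chosen only to mirror the 5 percent and 70 percent figures above. The rate here is Assigned divided by the sum of all status rows, which may differ slightly from the percentage featureCounts prints to the screen.

```python
def parse_summary(text):
    """Parse featureCounts .summary text into {bam_name: {status: count}}."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    samples = lines[0].split("\t")[1:]  # header row: Status, then one column per BAM
    counts = {name: {} for name in samples}
    for ln in lines[1:]:
        fields = ln.split("\t")
        for name, value in zip(samples, fields[1:]):
            counts[name][fields[0]] = int(value)
    return counts


def assignment_rate(status_counts):
    """Fraction of alignments with status 'Assigned' among all status rows."""
    total = sum(status_counts.values())
    return status_counts.get("Assigned", 0) / total if total else 0.0


# Illustrative (made-up) summaries for the same BAM counted with -s 1 and -s 2.
forward_summary = "\n".join([
    "Status\tsample.bam",
    "Assigned\t50",
    "Unassigned_NoFeatures\t900",
    "Unassigned_Ambiguity\t50",
])
reverse_summary = "\n".join([
    "Status\tsample.bam",
    "Assigned\t700",
    "Unassigned_NoFeatures\t250",
    "Unassigned_Ambiguity\t50",
])

rate_forward = assignment_rate(parse_summary(forward_summary)["sample.bam"])
rate_reverse = assignment_rate(parse_summary(reverse_summary)["sample.bam"])

print(f"-s 1 (forward stranded): {rate_forward:.1%} assigned")
print(f"-s 2 (reverse stranded): {rate_reverse:.1%} assigned")
```

To use it on real runs, read each .summary file with open(path).read() and pass the text to parse_summary. A large gap between the two rates, as in the posts above, is what points at one strand setting over the other.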