Hi,
I'm having problems with a stranded, paired-end RNA-seq analysis. I mapped the reads with both HISAT2 and RNA STAR, but when I visualize the alignments they look unstranded. In case it helps, here are the parameters I used with HISAT2:
Tool Parameters
Input Parameter: Value
Source for the reference genome: indexed
Select a reference genome: hg38
Is this a single or paired library: paired
FASTA/Q file #1: 80: Rep1_1.fq
FASTA/Q file #2: 81: Rep1_2.fq
Specify strand information: Reverse (RF)
Paired-end options: defaults
sum
Output alignment summary in a more machine-friendly style: False
Print alignment summary to a file: False
adv
Input options: defaults
Alignment options: defaults
Scoring options: defaults
Spliced alignment options: defaults
Reporting options: defaults
Output options: defaults
Other options: defaults
Job Resource Parameters: no
To better understand what is going on, I ran "infer experiment" on the BAM files, and it gives this kind of result:
This is PairEnd Data
Fraction of reads failed to determine: 0.9590
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0045
Fraction of reads explained by "1+-,1-+,2++,2--": 0.0365
Does anyone have an idea why this is happening?
Thank you very much
Hi, I recommend running a proper QC analysis, including the output of FastQC, STAR or HISAT2, samtools flagstat, mapping rates, all parameters, etc. I would also recommend simply getting counts from featureCounts with each of the -s settings (0, 1 and 2) and reporting the results. I don't know "infer experiment", but the output looks suspicious. In principle, even if the protocol were unstranded, wouldn't one expect each model to explain around 50% of the pairs just by chance? So I think something else went wrong.
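One way to read the "infer experiment" numbers from the question is to renormalize the two model fractions over only the reads that could be determined. The sketch below does that in Python. The function name and both thresholds (min_determined, min_share) are hypothetical choices made for illustration, not part of any tool, and the "forward"/"reverse" labels are my reading of the two patterns in the output.

```python
def summarize_infer_experiment(same_strand, opposite_strand,
                               min_determined=0.5, min_share=0.8):
    """Summarize the two model fractions from an "infer experiment" report.

    same_strand     -- fraction explained by "1++,1--,2+-,2-+"
    opposite_strand -- fraction explained by "1+-,1-+,2++,2--"
    min_determined, min_share -- illustrative cutoffs (assumptions)
    """
    determined = same_strand + opposite_strand
    if determined <= 0:
        return {"determined": 0.0, "opposite_share": None,
                "call": "undetermined", "reliable": False}

    # Share of the determinable reads that support the second pattern.
    opposite_share = opposite_strand / determined

    if opposite_share >= min_share:
        call = "reverse (1+-,1-+,2++,2--)"
    elif opposite_share <= 1 - min_share:
        call = "forward (1++,1--,2+-,2-+)"
    else:
        call = "unstranded or mixed"

    # Flag the call as unreliable when most reads could not be determined.
    return {"determined": determined, "opposite_share": opposite_share,
            "call": call, "reliable": determined >= min_determined}


if __name__ == "__main__":
    # Values copied from the report in the question.
    print(summarize_infer_experiment(0.0045, 0.0365))
```

With the values from the question, only about 4% of the reads are determined at all, and roughly 89% of those support the "1+-,1-+,2++,2--" pattern. So the small determinable subset does lean towards the Reverse (RF) setting, but the dominant signal is the 96% "failed to determine" fraction, which is what needs explaining first.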
Salmon can also detect the library type (strandedness) automatically when run with -l A, so you could try that.