Question

Single end RNAseq data strandedness infer-experiment-py


6.3 years ago

nilus1432 ▴ 30

Hi all,

I have single-end RNA-seq data and would like to determine the strandedness of the library.

From the wet lab I know that the "Stranded cDNA library was generated by reverse transcribing the RNA molecules".
I used infer_experiment.py. The output is:

This is SingleEnd Data
Fraction of reads failed to determine: 0.0795
Fraction of reads explained by "++,--": 0.0703
Fraction of reads explained by "+-,-+": 0.8502

So it's stranded but is it forward or reverse? I do not understand the help given here:

Does this mean that it is reverse stranded, and that I have to use the -s 2 option in featureCounts, "reverse" strandedness in htseq-count, and --rf in StringTie?

Thank you

rna-seq • 3.0k views

updated 21 months ago by DareDevil ★ 4.4k • written 6.3 years ago by nilus1432 ▴ 30

score 2 · Answer 1 · 2023-07-06

Based on the output provided by infer_experiment.py, the strandedness can be inferred as follows:

Fraction of reads failed to determine: 0.0795 This is the fraction of reads whose strand of origin could not be determined, typically because they map to regions where genes are annotated on both strands. The value is small, so most reads could be assigned a strandedness.
Fraction of reads explained by "++,--": 0.0703 These are reads that map to the same strand as their parental gene: a read on the '+' strand from a gene on the '+' strand ("++"), or a read on the '-' strand from a gene on the '-' strand ("--"). The 7.03% is the combined total for both cases, not 7.03% for each.
Fraction of reads explained by "+-,-+": 0.8502 These are reads that map to the strand opposite their parental gene: a read on the '+' strand from a gene on the '-' strand ("+-"), or a read on the '-' strand from a gene on the '+' strand ("-+"). The 85.02% is again the combined total for both cases.

Based on this information, we can conclude that the library is reverse stranded: about 85% of the reads map to the strand opposite their parental gene, while only about 7% map to the same strand.
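The interpretation above can be sketched in a few lines of Python. This is only an illustration, not part of RSeQC: the function names and the 0.8 cutoff are assumptions made for the example. The "reverse" flags are the ones discussed in this thread; the "forward" and "unstranded" rows are added from my understanding of each tool's options and should be checked against its help output.

```python
import re

# Output pasted from the question above.
SAMPLE_OUTPUT = """This is SingleEnd Data
Fraction of reads failed to determine: 0.0795
Fraction of reads explained by "++,--": 0.0703
Fraction of reads explained by "+-,-+": 0.8502
"""

# "reverse" row: flags named in this thread.
# Other rows: assumed equivalents, verify against each tool's documentation.
SETTINGS = {
    "reverse": {"featureCounts": "-s 2", "htseq-count": "--stranded=reverse", "StringTie": "--rf"},
    "forward": {"featureCounts": "-s 1", "htseq-count": "--stranded=yes", "StringTie": "--fr"},
    "unstranded": {"featureCounts": "-s 0", "htseq-count": "--stranded=no", "StringTie": "(no strand flag)"},
}


def parse_infer_experiment(text):
    """Extract the three fractions from single-end infer_experiment.py output."""
    fractions = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Fraction of reads failed to determine"):
            fractions["undetermined"] = float(line.split(":")[-1])
            continue
        match = re.match(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)', line)
        if match:
            fractions[match.group(1)] = float(match.group(2))
    return fractions


def classify_single_end(fractions, cutoff=0.8):
    """Call strandedness from the share of determined reads.

    cutoff is a hypothetical threshold chosen for this sketch.
    """
    same = fractions.get("++,--", 0.0)      # read on same strand as gene
    opposite = fractions.get("+-,-+", 0.0)  # read on opposite strand to gene
    determined = same + opposite
    if determined == 0:
        raise ValueError("no reads could be assigned to either category")
    if opposite / determined >= cutoff:
        return "reverse"
    if same / determined >= cutoff:
        return "forward"
    return "unstranded"


fractions = parse_infer_experiment(SAMPLE_OUTPUT)
call = classify_single_end(fractions)
print(call, SETTINGS[call])
```

With the numbers from the question, the opposite-strand category holds roughly 92% of the determined reads (0.8502 out of 0.8502 + 0.0703), so the sketch calls the library reverse stranded.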

score 1 · Answer 2 · 2019-01-05

The majority (85%) of your reads fall in the case described by:

+-,-+
read mapped to '+' strand indicates parental gene on '-' strand
read mapped to '-' strand indicates parental gene on '+' strand

So yes, your library is reverse stranded. As featureCounts is very fast, you can quickly confirm this by running it with both settings (-s 1 and -s 2); the correct setting will give a much lower NoFeature count than the incorrect one.
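A minimal sketch of that comparison, assuming you have already run featureCounts twice and have the two .summary files it writes next to its output. The helper names are made up for this example, and the counts in EXAMPLE are invented purely to show the shape of the check; they are not real results. The sketch reads the Unassigned_NoFeatures row of each summary and picks the setting with the lower value.

```python
def parse_summary(text):
    """Parse a featureCounts .summary file (tab-separated: status, count)."""
    counts = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2 or fields[0] == "Status":
            continue  # skip the header row and any blank lines
        counts[fields[0]] = int(fields[1])
    return counts


def compare_settings(summaries):
    """summaries maps a strand setting (e.g. "-s 1") to its summary text.

    Returns the setting with the fewest Unassigned_NoFeatures reads,
    plus the per-setting counts for inspection.
    """
    no_feature = {
        setting: parse_summary(text).get("Unassigned_NoFeatures", 0)
        for setting, text in summaries.items()
    }
    best = min(no_feature, key=no_feature.get)
    return best, no_feature


# Invented numbers for illustration only; in practice read the two
# summary files from disk, e.g. open("counts_s1.txt.summary").read()
# (hypothetical file name).
EXAMPLE = {
    "-s 1": "Status\tsample.bam\nAssigned\t1000\nUnassigned_NoFeatures\t9000\n",
    "-s 2": "Status\tsample.bam\nAssigned\t8500\nUnassigned_NoFeatures\t1500\n",
}

best, no_feature = compare_settings(EXAMPLE)
print("lowest NoFeature count:", best, no_feature)
```

If both settings give similar NoFeature counts, the library is more likely unstranded and the check is not informative on its own.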