Hello,
I observed a high percentage of "no features" reads (>80%) when running htseq-count with the --stranded yes option. The library prep kit I am using is Illumina TruSeq RNA Exome, which generates stranded data. If I run htseq-count with --stranded no or --stranded reverse, I observe the expected level of "no features" (<2%).
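The comparison above can be sketched as follows. This is only a minimal sketch, assuming the usual htseq-count output layout: tab-separated feature/count lines, with special counters such as `__no_feature` at the end. The function name `no_feature_fraction`, the choice of denominator (every row in the file, special counters included), and the counts themselves are assumptions made up for illustration; they are not my actual data.

```python
def no_feature_fraction(htseq_output: str) -> float:
    """Return the share of reads that htseq-count reported as __no_feature.

    Assumption: the denominator is the sum of every row in the output,
    i.e. per-gene counts plus the special "__" counters.
    """
    total = 0
    no_feature = 0
    for line in htseq_output.strip().splitlines():
        name, count_text = line.split("\t")
        count = int(count_text)
        total += count
        if name == "__no_feature":
            no_feature = count
    return no_feature / total if total else 0.0


# Made-up counts for illustration only (not real results).
EXAMPLE_YES = "CFLAR\t50\nTP53\t100\n__no_feature\t850\n__ambiguous\t0"
EXAMPLE_REVERSE = "CFLAR\t480\nTP53\t505\n__no_feature\t15\n__ambiguous\t0"

for label, text in [("yes", EXAMPLE_YES), ("reverse", EXAMPLE_REVERSE)]:
    print(f"--stranded {label}: {no_feature_fraction(text):.1%} no_feature")
```

Running the same calculation on the real output of each --stranded setting gives the percentages being compared here.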
I have checked that the chromosome names in the reference genome and the GTF are compatible (chr1 vs 1), and I mapped to a second reference genome to ensure the GTF ranges are concordant.
To my knowledge this library prep kit does not use dUTP.
CFLAR (+ strand) IGV export seen below.
I came across this paper describing the best option to call based on strand synthesis. The Illumina guide describes the molecular reactions as follows:
The RNA is fragmented using divalent cations under elevated temperature. cDNA is generated from the
cleaved RNA fragments using random priming during first and second strand synthesis. Then, sequencing
adapters are ligated to the resulting double-stranded cDNA fragments. The coding regions of the
transcriptome are captured from this library using sequence-specific probes to create the final library.
Can you please advise on the best --stranded option for this chemistry? Thank you.
-T