Hello,
I am checking the strandedness of a sample using infer_experiment.py and I got a result that I find confusing.
infer_experiment.py -i sample_Aligned.sortedByCoord.out.bam -r gencode.v36.annotation.gene.bed
Results in:
This is PairEnd Data
Fraction of reads failed to determine: 0.4848
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0122
Fraction of reads explained by "1+-,1-+,2++,2--": 0.5030
So about half of the reads could not be determined, and the other half appears to be strand-specific.
I know where this sample comes from, so I know that it's a strand-specific library. My question is: why are half of the reads undetermined?
Any ideas?
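One way to read these numbers is to set the undetermined reads aside and look only at the reads that could be assigned. Below is a minimal Python sketch that parses the report text and does that. The function name `summarize` and the 0.8 cutoff are my own assumptions for illustration; they are not part of RSeQC.

```python
import re

# Patterns for the report lines printed by infer_experiment.py (as shown above).
_FAILED = re.compile(r"Fraction of reads failed to determine:\s*([0-9.]+)")
_EXPLAINED = re.compile(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)')

REPORT = '''This is PairEnd Data
Fraction of reads failed to determine: 0.4848
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0122
Fraction of reads explained by "1+-,1-+,2++,2--": 0.5030
'''

def summarize(report, threshold=0.8):
    """Return (failed fraction, per-pattern share of determined reads, call).

    `threshold` is a hypothetical cutoff, not an RSeQC setting.
    """
    failed = float(_FAILED.search(report).group(1))
    fractions = {m.group(1): float(m.group(2)) for m in _EXPLAINED.finditer(report)}
    determined = sum(fractions.values())
    if determined == 0:
        return failed, {}, "no reads determined"
    # Re-normalize so the shares refer only to reads that were assigned.
    shares = {pattern: value / determined for pattern, value in fractions.items()}
    best, share = max(shares.items(), key=lambda item: item[1])
    call = best if share >= threshold else "unstranded or ambiguous"
    return failed, shares, call

failed, shares, call = summarize(REPORT)
print(f"undetermined: {failed:.4f}")
for pattern, share in shares.items():
    print(f"{pattern}: {share:.3f} of determined reads")
print("call:", call)
```

With the numbers from this post, roughly 98% of the determined reads support "1+-,1-+,2++,2--", so the strand-specific signal itself is clear. The open question is only why so many reads end up undetermined.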
Hi, I have a similar problem. Did you figure out why this happened? Thanks for your help.
I think it may be related to the number of mapped reads. What was the percentage of uniquely mapped reads with STAR?
Also, infer_experiment.py by default samples only a subset of your reads. If I'm not mistaken, the default is 200000 reads.
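If you want to check whether the sample size matters, you could rerun with a larger value for the -s (sample size) option. Here is a small Python sketch that builds the command line. The helper name `build_command` and the value 2000000 are just assumptions for illustration, and a larger sample may or may not change the fractions.

```python
import shlex

def build_command(bam, bed, sample_size=2_000_000):
    """Build an infer_experiment.py call with a larger -s (sample size) value.

    The default of 2,000,000 here is an arbitrary illustrative choice.
    """
    return ["infer_experiment.py", "-i", bam, "-r", bed, "-s", str(sample_size)]

cmd = build_command(
    "sample_Aligned.sortedByCoord.out.bam",
    "gencode.v36.annotation.gene.bed",
)
# Print the command so it can be copied into a shell.
print(shlex.join(cmd))
# To actually run it (requires RSeQC on your PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```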