Hello Biostars community,
I have a question about how the fragment length distribution affects Salmon's EffectiveLength computation (used for TPM), given our situation below:
- 150 bp sequencing
- paired end library
- 90 percent of reads on genes are overlapping
- negative inner distance reported by RSeQC (-150bp to -100bp)
In most sequencing libraries, fragmentation and sequencing are set up to avoid overlapping PE reads, whereas ours clearly were not.
How do these two scenarios (overlapping vs. non-overlapping read pairs) affect the fragment length distribution calculation (assuming it's impossible to calculate insert size when paired-end reads do not overlap in an RNA library)?
Thanks in advance, Chris
If you have a reference available, paired-end sequencing lets you estimate the length of the sequenced library fragment from how far apart the two reads map/align on the reference, whether or not the reads overlap. One exception is a library fragment that captures a breakpoint (e.g. the two ends map to two different chromosomes); in that case it is not possible to estimate insert size.
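To make the arithmetic concrete, here is a minimal Python sketch of how a fragment length and an inner distance relate for one read pair. The function names are hypothetical, and it assumes unspliced alignments with both mates the same length, so it ignores introns (which a transcript-aware tool would account for). Under those assumptions, the inner distances in the question (-150 bp to -100 bp with 150 bp reads) imply fragments of roughly 150 to 200 bp.

```python
def fragment_length(left_start, right_end):
    """Span covered by a properly paired read pair on the reference.

    left_start: 0-based leftmost aligned position of the upstream mate.
    right_end:  0-based exclusive end position of the downstream mate.
    Assumption: no spliced alignment between the two positions.
    """
    if right_end <= left_start:
        raise ValueError("right_end must be greater than left_start")
    return right_end - left_start


def inner_distance(frag_len, read_len):
    """Gap between the two mates; negative when the mates overlap."""
    return frag_len - 2 * read_len


def fragment_from_inner_distance(inner, read_len):
    """Invert inner_distance: recover the fragment length."""
    return inner + 2 * read_len


READ_LEN = 150
reported_inner = (-150, -100)  # range quoted in the question
implied_fragments = tuple(
    fragment_from_inner_distance(d, READ_LEN) for d in reported_inner
)
print(implied_fragments)
```

An inner distance of -150 with 150 bp reads means the two mates cover exactly the same 150 bases, so each pair is effectively sequencing the whole fragment twice.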
You have a "short" insert library. There is no solution for this specific issue except making a new prep/library.
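On the EffectiveLength part of the question, a rough Python sketch may help show why the fragment length distribution matters. It uses a simplified, commonly quoted definition: the expected number of start positions for a fragment within a transcript, using only fragment lengths that fit in the transcript. The function name and the toy distributions are hypothetical, and Salmon's reported value can differ (for example when bias modelling is enabled), so treat this as an illustration rather than Salmon's implementation.

```python
def effective_length(transcript_len, frag_len_counts):
    """Simplified effective length of one transcript.

    frag_len_counts: dict mapping fragment length -> observed count
    (a toy empirical fragment length distribution).
    Only fragment lengths that fit inside the transcript are used.
    """
    usable = {f: c for f, c in frag_len_counts.items() if f <= transcript_len}
    total = sum(usable.values())
    if total == 0:
        # Assumption for this sketch: fall back to the raw length
        # when no observed fragment fits in the transcript.
        return float(transcript_len)
    mean_frag = sum(f * c for f, c in usable.items()) / total
    # A fragment of length f has (transcript_len - f + 1) start positions.
    return transcript_len - mean_frag + 1


# Toy distributions: a short-insert library vs. a longer-insert one.
short_insert = {150: 1, 200: 1}
long_insert = {300: 1, 400: 1}

eff_short = effective_length(1000, short_insert)
eff_long = effective_length(1000, long_insert)
print(eff_short, eff_long)
```

Under this simplified definition, shorter fragments give a larger effective length for the same transcript, and the relative difference is most pronounced for short transcripts.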