10.1 years ago
joejoe
Hi all,
I have some RNA-seq files from formalin-fixed paraffin-embedded (FFPE) samples that were run on MiSeq and NextSeq machines. When I look at the raw files, the read length distribution looks like the table below.
length  S1    S2
76      4,8   29,6
75      1,1   16,0
74      0,8   4,2
73      0,2   1,8
72      0,1   0,8
71      0,1   0,6
…       …     …
36      0,1   0,2
35      0,2   0,6
34      0,1   0,2
33      0,1   0,2
32      89,0  32,8
Has anyone seen this before? Is it due to the sample prep, library prep, or the base calling in the machine?
Thx, Joejoe
FFPE samples contain degraded RNA, so it is not unlikely that they contain many short sequences. The damage patterns should be somewhat similar to those seen in ancient DNA, so assessing them with the mapDamage tool could be useful.
https://github.com/ginolhac/mapDamage (mapDamage)
http://www.ncbi.nlm.nih.gov/pubmed/22643842 (Paper containing some description of the damage you would see)
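Before running a damage analysis, it may help to recompute the read length distribution directly from the raw FASTQ files, to confirm that the peak at 32 is really in the data and not introduced by a later trimming step. Below is a minimal Python sketch, assuming standard four-line FASTQ records (optionally gzipped). The function name `read_length_percentages` is hypothetical and only for illustration, and reporting the result as a percentage of reads is an assumption about how the table in the question was produced.

```python
import gzip
from collections import Counter


def read_length_percentages(fastq_path):
    """Return {read_length: percent_of_reads}, longest reads first.

    Assumes four-line FASTQ records (header, sequence, '+', qualities).
    """
    # Pick a plain or gzip opener based on the file extension.
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            # The sequence is the second line of every four-line record.
            if i % 4 == 1:
                counts[len(line.rstrip("\r\n"))] += 1
    total = sum(counts.values())
    # Sort by read length, descending, to match the layout of the table.
    return {
        length: 100.0 * n / total
        for length, n in sorted(counts.items(), reverse=True)
    }
```

Calling it on each sample, for example `read_length_percentages("S1.fastq.gz")` (the file name is a placeholder), should give numbers comparable to the table in the question if those values are percentages of reads.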