3.1 years ago
BPors
Hi,
I performed direct RNA sequencing (with polyA) on my short sequences (~90 bp).
However, only 20% of the total reads end up in fastq_pass.
I used this script to analyze the per-base quality scores: https://gist.github.com/wdecoster/7cad6080950fa1e3ae3eaeeac4f6ae4d#file-perbasesequencecontentandquality-py
Do you have any experience with, or thoughts on, why I am observing such fluctuating qualities across the read body?
I appreciate your feedback. Thanks!
I probably don't have to point this out, but your reads are very short. It may be that the basecaller isn't estimating the quality properly for reads of this length. I would suggest, if possible, aligning the data to a reference genome and looking at the percent identity of your reads.
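If it helps, here is a rough sketch of how percent identity could be computed per read from a SAM file. It assumes your aligner writes the standard `NM:i:` (edit distance) tag, and it uses a BLAST-like definition (matching bases divided by alignment columns, with M/=/X, I and D all counted as columns). The function names `percent_identity` and `identity_from_sam_line` are made up for this example, so adapt them as needed.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def percent_identity(cigar, nm):
    """BLAST-like identity: matching bases / alignment columns.

    Alignment columns = aligned bases (M, =, X) + insertions + deletions.
    NM = mismatches + inserted bases + deleted bases, so the number of
    matching bases is columns - NM. Clipping (S, H) and skips (N) are ignored.
    """
    ops = CIGAR_RE.findall(cigar)
    columns = sum(int(length) for length, op in ops if op in "M=XID")
    if columns == 0:
        return None
    return 100.0 * (columns - nm) / columns


def identity_from_sam_line(line):
    """Return percent identity for one SAM record, or None if not computable."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    cigar = fields[5]
    if flag & 4 or cigar == "*":  # unmapped read or no alignment
        return None
    for tag in fields[11:]:  # optional fields start at column 12
        if tag.startswith("NM:i:"):
            return percent_identity(cigar, int(tag[5:]))
    return None  # assumption violated: aligner did not emit an NM tag
```

You could run `identity_from_sam_line` over every non-header line of the SAM file and compare the distribution for the pass and fail reads. If the fail reads align with similar identity to the pass reads, that would point towards the quality scores being underestimated rather than the reads being genuinely bad.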
Are you sure you selected the right basecalling model in Guppy? I messed this up once: I used the DNA model, not knowing the reads were direct RNA, and got a load of rubbish out (my bad).