5.0 years ago
cellulebioinfobiscem
Hi, I am aligning published RNA-seq data with the STAR aligner. FastQC shows that for some samples the N content rises to 12% at bases 4 and 22 (orange warning). Should I trim the N positions?
Thanks
Looks like this data is not good, especially if you consistently have N's at specific cycles. Such data should not have been released by the sequencing facility. You can't remove the N's themselves, since that will mess up the reading frame.
You can filter out reads with N's using reformat.sh from the BBMap suite, with a separate invocation if the reads are paired-end.
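To make the filtering idea concrete, here is a minimal Python sketch of what "drop reads containing N's" means for single-end and paired-end data. It is not the reformat.sh command itself; the function names and the `max_ns` parameter are hypothetical, chosen only for illustration, and the parser assumes plain four-line FASTQ records.

```python
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: header, sequence, separator line, quality string.
Record = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from FASTQ text (assumes unwrapped 4-line records)."""
    buf: List[str] = []
    for line in lines:
        buf.append(line.rstrip("\n"))
        if len(buf) == 4:
            yield (buf[0], buf[1], buf[2], buf[3])
            buf = []


def too_many_ns(record: Record, max_ns: int = 0) -> bool:
    """True if the read has more than max_ns ambiguous (N) bases."""
    return record[1].upper().count("N") > max_ns


def filter_single(records: Iterable[Record], max_ns: int = 0) -> List[Record]:
    """Keep only single-end reads with at most max_ns N's."""
    return [r for r in records if not too_many_ns(r, max_ns)]


def filter_paired(
    reads1: Iterable[Record], reads2: Iterable[Record], max_ns: int = 0
) -> Tuple[List[Record], List[Record]]:
    """Keep a pair only if both mates pass, so the two files stay in sync."""
    kept = [
        (a, b)
        for a, b in zip(reads1, reads2)
        if not too_many_ns(a, max_ns) and not too_many_ns(b, max_ns)
    ]
    return [a for a, _ in kept], [b for _, b in kept]
```

Note that whole reads (or whole pairs) are discarded rather than the N bases being cut out, which is why the remaining reads keep their original coordinates. For real data a dedicated tool will be much faster than this sketch.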
Thank you for the tool suggestion, I will test it.