Hi Everyone
Is anyone able to assist with the cause of something unusual I have noted on a FastQC output? I have scoured the internet but I have not been able to find a definitive answer on the cause.
I have 150 bp Illumina MiSeq reads enriched for a DNA virus (the GC content is expected to be 56%). There is a strange V shape in the per base sequence content plot at the end of the reads, and it does not change when the reads are trimmed using Trimmomatic. For this particular output file the adaptors have been trimmed and I have carried out some minor read trimming (sliding window 4:15 and minimum length 40). At first I suspected adaptors were to blame, but the pattern remains even after their removal. It is obviously some sort of bias, but shouldn't trimming have minimised the risk of this? Has anyone seen anything like this before, and could you point me in the right direction please?
Thanks
I've seen this happen often before, but then only for the last base.
Can you post the same plot without the binning (so at per-base resolution)?
Thanks for your reply. Sorry to have to ask, but could you please point me to instructions on how to generate this plot at per-base resolution? I am using macOS and have access to both the GUI and command line versions of FastQC. Thanks again
sure: add the option
--nogroup
to your command line.
Thanks so much
Here you go. It looks like it is actually only the last base, as you suspected. Is this still likely to be an adaptor artefact?
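For anyone wanting to cross-check the ungrouped plot outside FastQC, here is a minimal Python sketch that computes the percentage of A, C, G and T at every read position without any binning. It is not how FastQC itself does it. The function names (`read_fastq_sequences`, `per_base_content`) are made up for illustration, and it assumes a plain four-line-per-record FASTQ file, optionally gzipped.

```python
import gzip
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def read_fastq_sequences(path: str) -> Iterator[str]:
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record
                yield line.strip()


def per_base_content(sequences: Iterable[str]) -> List[Dict[str, float]]:
    """Return the percentage of A/C/G/T at each position, one entry per base."""
    counts: List[Counter] = []
    for seq in sequences:
        for i, base in enumerate(seq.upper()):
            if i == len(counts):  # first read long enough to reach this position
                counts.append(Counter())
            counts[i][base] += 1

    result = []
    for position_counts in counts:
        # N calls are ignored when computing the percentages
        total = sum(position_counts[b] for b in "ACGT")
        result.append(
            {b: (100.0 * position_counts[b] / total if total else 0.0) for b in "ACGT"}
        )
    return result


if __name__ == "__main__":
    import sys

    content = per_base_content(read_fastq_sequences(sys.argv[1]))
    start = max(len(content) - 5, 0)
    # Print only the last few positions, where the skew was seen
    for position, pct in enumerate(content[start:], start=start + 1):
        print(position, {b: round(v, 1) for b, v in pct.items()})
```

Run as `python per_base.py trimmed_reads.fastq.gz` (both names are placeholders), it prints the base composition of the last five positions. Note that after trimming the final positions are covered only by reads that were not shortened, so they are based on fewer reads than the earlier positions.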