Good evening,
The multiqc report shows the presence of poly-A tails in my samples after DNA (!) sequencing, even after cleaning with fastp and cutadapt with appropriate settings. I think these are sequencing artifacts, and I want to find the reads with such "polyA-tails" in my fastq.gz files.
Could you help me, please: how long does the AAA... run at the end of a read have to be for fastqc, and then multiqc, to count it as a polyA-tail? I need this length for the search.
Thank you in advance, Poecile
P.S. By the way, these polyA-tails appear on the adapter charts only after combining the samples with multiqc; they don't show up on the fastqc charts for individual samples.
If this is a DNA sequence, then there should be no poly-A's in the sense of poly-A tails.
Use bbduk.sh in filter mode with literal=AAAAAA (adjust the length as needed). If you need the poly-A's to be only at the 3' end of the reads, then add the restrictright=NN parameter.
Please show an example plot/table.
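For a quick look at which reads are affected, the search described in the question can also be sketched in plain Python. This is only a rough illustration, not how fastqc or multiqc detect poly-A: the function name polya_tail_reads and the default min_len=10 are assumptions made up for this example, and the code assumes standard four-line FASTQ records (no wrapped sequence lines).

```python
import gzip
from typing import Iterator, Tuple


def polya_tail_reads(path: str, min_len: int = 10) -> Iterator[Tuple[str, int]]:
    """Yield (read_id, tail_length) for reads whose 3' end is a run of >= min_len A's.

    Hypothetical helper for illustration; min_len is an arbitrary default,
    not a threshold taken from fastqc or multiqc.
    Assumes four lines per FASTQ record.
    """
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            seq = handle.readline().strip()
            handle.readline()  # '+' separator line, ignored
            handle.readline()  # quality line, ignored

            # Length of the uninterrupted run of A's at the end of the read.
            tail_len = len(seq) - len(seq.rstrip("A"))
            if tail_len >= min_len:
                fields = header[1:].split()
                read_id = fields[0] if fields else ""
                yield read_id, tail_len
```

Calling list(polya_tail_reads("sample.fastq.gz", min_len=12)) on a file of your own (the file name here is a placeholder) returns the IDs and tail lengths of the matching reads, which makes it easy to try several values of min_len and compare the counts with what the report shows.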