Do I have to remove overlapping reads from paired-end data before MetaPhlAn?
24 days ago

I have paired-end files from shotgun metagenomic sequencing (251 bp reads). Before starting with MetaPhlAn, I ran FastQC and FastQ Screen to check the quality of my files. I used KneadData to remove the human reads, and the files now look fine. I also noticed that all my files fail the “Per Base Sequence Content” check. Is this a problem? (All the other checks are OK.)

Do I also have to remove overlapping reads between R1 and R2? How can I do it?

Thanks, Michela

overlapping paired_ends metaphaln • 309 views

For the “Per Base Sequence Content” failure, it is really hard to pinpoint the actual problem without looking at the graph itself. Is it localised to a certain region of the read (leading or trailing)? Also, I don't think it is harmful to keep overlapping R1/R2 reads; the overlap just tells you that your fragments are rather short.
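To make the fragment-length point concrete, here is a minimal Python sketch. It is not taken from MetaPhlAn, KneadData, or any other tool mentioned in this thread, and the function names are made up for illustration. It estimates how many bases both mates cover for a given read and fragment length, and finds the overlap of one read pair by exact matching. Real read mergers tolerate mismatches and use base qualities, so treat this only as an illustration.

```python
# Hypothetical helpers for illustration; not part of any tool in this thread.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse-complement a DNA sequence (upper-case A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def expected_overlap(read_len, fragment_len):
    """Bases covered by both mates, assuming both reads are full length."""
    # A mate cannot read more bases than the fragment contains.
    effective = min(read_len, fragment_len)
    return max(0, 2 * effective - fragment_len)


def find_overlap(r1, r2, min_len=10):
    """Length of the longest suffix of R1 that exactly matches a prefix of
    the reverse complement of R2; returns 0 if none is at least min_len."""
    r2_rc = revcomp(r2)
    for n in range(min(len(r1), len(r2_rc)), min_len - 1, -1):
        if r1.endswith(r2_rc[:n]):
            return n
    return 0


# Toy example: a 30 bp fragment sequenced with 20 bp reads from each end.
fragment = "ACGTTGCAAGGCTTAACCGGTATCGATCGG"
r1 = fragment[:20]
r2 = revcomp(fragment[-20:])
print(expected_overlap(20, len(fragment)), find_overlap(r1, r2))
```

With 251 bp reads, any fragment shorter than 502 bp gives mates that overlap, so seeing overlap is expected for typical shotgun libraries.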


Thank you for the answer! The problem is always located in the first 10-15 bp.



Please see: https://sequencing.qcfail.com/articles/positional-sequence-bias-in-random-primed-libraries/

That bias in the first 10-15 bases is not a problem for alignment and similar downstream steps. "Failing" a test in FastQC is not always a deal breaker. You need to consider the results in the context of your experiment.
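If you want to see the numbers behind that plot yourself, the following Python sketch tabulates the fraction of each base at the first few read positions. It is only a rough stand-in for what the FastQC module displays, not FastQC's own code. The function names and the file name in the usage comment are assumptions, and the reader expects an uncompressed FASTQ with four lines per record.

```python
from collections import Counter


def base_fractions(seqs, n_positions=15):
    """Return one dict per position with the fraction of A, C, G and T
    among all bases observed at that position (N counts in the total)."""
    counts = [Counter() for _ in range(n_positions)]
    for seq in seqs:
        for i, base in enumerate(seq[:n_positions]):
            counts[i][base] += 1
    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append({b: c[b] / total for b in "ACGT"} if total else {})
    return fractions


def read_fastq_seqs(path):
    """Yield sequence lines from an uncompressed, 4-line-per-record FASTQ."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:
                yield line.strip().upper()


# Usage (hypothetical file name):
# for pos, frac in enumerate(base_fractions(read_fastq_seqs("sample_R1.fastq")), 1):
#     print(pos, frac)
```

A composition that is skewed over the first 10-15 positions and then flattens out matches the positional bias described in the linked article.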
