5.8 years ago
kintany
Hi all, I'm analyzing paired-end RNA-seq data from a Nextera library prep. I ran FastQC to check the quality and got the following report: here. I was concerned about the amount of Nextera transposase sequence remaining, so I ran trim_galore to trim it. After trimming, the report looks like this: here. The transposase sequence is gone, but now the GC content plot looks very odd. Any thoughts? Thank you.
You may not have shared the files correctly, since I can't see them in Chrome.
Without seeing your data, I am going to speculate that you may be encountering the well-known positional bias in the first 10-15 bp of RNA-seq datasets prepped with transposase-based kits. In that case, you will find this blog post informative.
If this is not applicable, then I suggest that you post screenshots of the FastQC plots using these directions: How to add images to a Biostars post
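For anyone who wants to check the positional bias mentioned above without opening FastQC, here is a minimal Python sketch that tabulates base composition over the first few read positions. The function names `read_sequences` and `per_position_composition` are hypothetical helpers written for this illustration, not part of FastQC or Trim Galore, and the sketch assumes a standard four-line-per-record FASTQ file.

```python
from collections import Counter
import gzip


def read_sequences(path):
    """Yield sequence lines from a FASTQ file (plain text or .gz).

    Assumes the standard layout of four lines per record.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # the sequence is the 2nd line of each record
                yield line.strip().upper()


def per_position_composition(sequences, n_positions=20):
    """Return one dict per position with the fraction of A, C, G and T.

    Fractions are relative to all bases seen at that position
    (including N), so they may sum to slightly less than 1.
    """
    counts = [Counter() for _ in range(n_positions)]
    for seq in sequences:
        for pos, base in enumerate(seq[:n_positions]):
            counts[pos][base] += 1

    fractions = []
    for counter in counts:
        total = sum(counter.values())
        fractions.append(
            {b: (counter[b] / total if total else 0.0) for b in "ACGT"}
        )
    return fractions
```

Calling `per_position_composition(read_sequences("sample_R1.fastq.gz"))` (the file name is a placeholder) and printing the result should show uneven fractions in the first 10-15 positions that settle down afterwards if the bias is present.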
Thank you. I changed the format to PDF, so you should be able to see the files now.
As expected, that looks like a normal RNA-seq dataset. Have you checked whether you have rRNA contamination in your data? That can sometimes show up as a shouldered GC peak. Since you did not have obvious adapter contamination in your data, you could use the original data for alignment and let the aligner soft-clip any parts that don't align.
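To look at the shouldered GC peak outside of FastQC, a rough Python sketch like the one below bins reads by their GC percentage. The names `gc_percent` and `gc_histogram` are hypothetical, and this only reproduces the general idea of a per-read GC distribution. It does not identify rRNA reads, which would require comparing the reads against rRNA reference sequences.

```python
from collections import Counter


def gc_percent(seq):
    """Return the GC percentage of a read, ignoring N and other symbols.

    Returns None if the read has no A/C/G/T bases at all.
    """
    seq = seq.upper()
    called = sum(seq.count(b) for b in "ACGT")
    if called == 0:
        return None
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def gc_histogram(sequences, bin_width=5):
    """Count reads per GC bin; keys are the lower edge of each bin."""
    hist = Counter()
    for seq in sequences:
        gc = gc_percent(seq)
        if gc is None:
            continue
        # Put reads with exactly 100% GC into the last bin.
        bin_start = min(int(gc // bin_width) * bin_width, 100 - bin_width)
        hist[bin_start] += 1
    return dict(sorted(hist.items()))
```

A second bump or shoulder in the resulting counts, away from the main peak, is the kind of pattern described above. Any iterable of read sequences can be passed in.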