Hi everyone, hope everything is fine. I'm having a problem with the adapter content section of my MultiQC report. The title of the adapter content plot says I have 283 samples, but every other section of the report shows only 100 samples.
Not 100% familiar with the MultiQC details, but is it possible that it gets a FastQC report per read file? So for paired-end reads you would get two report files for MultiQC?
(even if so you still have a discrepancy of 83 :/ )
I have 50 samples with paired-end data, which gives me 100 (50x2=r1+r2). So I have one fastqc report per read (100 reports). But I can't understand why the report shows 283 samples instead of 100 like the other parameters in the report.
Exactly.
I tried running MultiQC on only the HTML reports produced by FastQC, but MultiQC found nothing. So I used the zip files produced by FastQC instead, and I get the same problem in the adapter content section.
Can you post screenshots of where you see the discrepancy? MultiQC looks recursively through directories to find reports (depending on how you run it), so it may be picking up reports other than the FastQC ones for these samples.
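To check how many FastQC reports actually sit under the directory you point MultiQC at, something like the sketch below could help. It only assumes that the FastQC zip outputs end in `_fastqc.zip` (the default naming); `find_fastqc_zips` and the `.` search root are made-up names for illustration, and this is not how MultiQC itself discovers files.

```python
from pathlib import Path


def find_fastqc_zips(root):
    """Return sorted paths of files ending in _fastqc.zip under root (searched recursively)."""
    return sorted(p for p in Path(root).rglob("*_fastqc.zip") if p.is_file())


if __name__ == "__main__":
    # "." is a placeholder: use the directory you pass to MultiQC.
    zips = find_fastqc_zips(".")
    print(f"{len(zips)} FastQC zip reports found")
    for path in zips:
        print(path)
```

If this prints more than the expected 100 files, stray reports in subdirectories would be a likely cause; if it prints exactly 100, the extra count comes from somewhere else.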
No, I don't. When I said errors, I meant that it shows 267 samples for the adapter content while I only have 50 paired-end samples = 100 read files (R1 + R2).
I think it just counts the total number of lines, and there can be multiple adapter types per sample (e.g. illumina_universal_adapter, polyG, polyA, ...). There may be a small issue with how the title of the plot is coded, but there's nothing to worry about.
I checked some samples I had with Illumina poly-G issues before trimming and I see the same thing: 5 samples, but the adapter content plot reports 9 samples. When I look closer, there are indeed 9 lines, made up of illumina_adapter lines and polyA lines for the samples.
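To make the arithmetic concrete, here is a minimal sketch of how 5 samples can turn into 9 lines. The sample names and percentages are invented for illustration, `count_plot_lines` is a hypothetical helper, and the 0.1% cutoff is an assumption about when an adapter gets its own line, not something confirmed in this thread.

```python
def count_plot_lines(adapter_hits, min_percent=0.1):
    """Count one plot line per (sample, adapter type) pair at or above min_percent."""
    return sum(
        1
        for adapters in adapter_hits.values()
        for percent in adapters.values()
        if percent >= min_percent
    )


# Hypothetical data: 5 samples, each mapped to {adapter type: max % of reads}.
hypothetical_hits = {
    "s1_R1": {"illumina_universal_adapter": 4.2, "polya": 0.8},
    "s1_R2": {"illumina_universal_adapter": 3.9, "polya": 0.6},
    "s2_R1": {"illumina_universal_adapter": 2.7, "polya": 0.4},
    "s2_R2": {"illumina_universal_adapter": 2.5, "polya": 0.3},
    "s3_R1": {"illumina_universal_adapter": 1.5, "polya": 0.0},
}

if __name__ == "__main__":
    print(len(hypothetical_hits), "samples")
    print(count_plot_lines(hypothetical_hits), "lines in the plot")
```

With this made-up input, four samples contribute two lines each and one contributes a single line, so the count in the plot title is lines, not samples.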
What was your input data?
My input data are paired-end DNA reads from Illumina sequencing.
OK, makes sense. I was on the right track though ... :)
Does it not include other FastQC reports it found? Perhaps there are subdirectories in which MultiQC also finds reports?
Can you check what exactly is included in the MultiQC report?
I see.
You can always contact the tool developers: https://github.com/MultiQC/MultiQC
If you do and get a satisfying answer, can we ask you to come back and post it here as well, so the thread has a definite answer? Thanks.
Here is the screenshot. In this report MultiQC missed 3 samples (6 reports) and still shows the problem with the adapter content.
Do you have any oddities in the file names of the samples that were missed in the report (e.g. spaces, odd characters, etc.)?
Not sure what error you are referring to.
That is pretty odd. As advised above, you could try opening an issue with the devs. I don't think I have ever seen this problem.
I will tag Phil Ewels, lead dev of MultiQC. He may chime in here.
Thanks for spotting! As mentioned below, it's samples * top adapters. I've made an issue here: https://github.com/MultiQC/MultiQC/issues/3153