Dear all,
I am dealing with some Ion Proton exome data. The average quality value of each read is very low (all are below 30), and many tests in the FastQC report failed. I've found some information about Ion Proton quality values here. I was expecting a good amount of data to pass a Q15 (or even Q10) filter, but very little did (<1% of >2.5 million reads). I used prinseq to filter out low-quality reads (and also removed reads shorter than 70 bases). I am stuck on the following questions.
- Is this a problem with the machine, or was our data contaminated during library preparation?
- Is it appropriate to proceed to downstream analysis after trimming the low-quality bases at both ends of each read?
- If the data is of very low quality, what should I do with it? (Is it a waste of time to go on to further analysis?)
I would like to hear the views of people who have dealt with Ion Proton data before, and I can provide more details if required.
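To make the Q15/Q10 check above concrete, here is a minimal Python sketch of how a mean-quality filter could be counted by hand. It assumes the FASTQ quality strings are Phred+33 encoded; the function names (mean_phred, fraction_passing, error_probability) are hypothetical and are not part of prinseq or FastQC.

```python
def mean_phred(quality_line, offset=33):
    """Arithmetic mean of the Phred scores in one FASTQ quality line.

    Assumes Phred+33 (Sanger-style) encoding unless another offset is given.
    """
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores) if scores else 0.0


def fraction_passing(quality_lines, min_mean_q, offset=33):
    """Fraction of reads whose mean Phred score is at least min_mean_q."""
    lines = list(quality_lines)
    if not lines:
        return 0.0
    passing = sum(1 for q in lines if mean_phred(q, offset) >= min_mean_q)
    return passing / len(lines)


def error_probability(q):
    """Per-base error probability implied by Phred score q: p = 10^(-q/10)."""
    return 10 ** (-q / 10)
```

For scale, Q10 corresponds to a 10% per-base error probability and Q15 to roughly 3%, so a threshold of 25 is considerably stricter than either.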
Do you have a box plot of per-base quality?
Here is how it looks
This does not look really bad. How are you processing the data for QC? What command are you using to filter?
After checking the FastQC report, I filtered using prinseq with min_len set to 70 and min_qual_score set to 25. Instead of removing all low-quality reads, I trimmed the low-quality bases at the 3' and 5' ends. The FastQC report is then a little better than before.
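The end-trimming idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not prinseq's actual algorithm: it assumes Phred+33 quality strings, strips bases below the threshold from each end until a base at or above it is found, and then applies the length cutoff. The function names trim_low_quality_ends and keep_read are hypothetical.

```python
def trim_low_quality_ends(seq, qual, min_q=25, offset=33):
    """Strip bases with Phred score below min_q from both ends of a read.

    Low-quality bases in the interior of the read are left untouched.
    Returns the trimmed (sequence, quality) pair.
    """
    scores = [ord(ch) - offset for ch in qual]
    start, end = 0, len(scores)
    # Advance from the 5' end past low-quality bases.
    while start < end and scores[start] < min_q:
        start += 1
    # Retreat from the 3' end past low-quality bases.
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]


def keep_read(seq, min_len=70):
    """Length filter applied after trimming (mirrors the cutoff of 70)."""
    return len(seq) >= min_len
```

Trimming first and filtering on length afterwards keeps reads whose middle portion is usable, which is why the post-trim FastQC report can look better without discarding most of the data.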