5.3 years ago
MutationalMeltdown
I'm interested specifically in the 'failures' I got on R2 (R1 being just the barcode + UMI). Note this is 10x single-cell RNA-sequencing (scRNA-seq) data.
Test 'failures':
- Per tile sequence quality (quality appears slightly worse at about 54bp (the middle of the read), with a corresponding increase in Ns; I'm not sure how to interpret this, having read this)
- Per base sequence content (at start of read only, for this reason I think)
- Per sequence GC content (there is a shifted peak to the left, i.e. AT-rich)
- Sequence Duplication Levels (I guess due to highly expressed genes)
- Overrepresented sequences (no hits; up to 1%)
- Kmer Content (again at start of read only)
Thank you for your help in interpreting this!
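For the shifted GC peak, one way to look at the distribution outside FastQC is to compute per-read GC content directly. This is only a minimal sketch, not FastQC's own algorithm: the function names `gc_percent` and `gc_histogram` and the 10% bin width are made up for illustration, and N bases are ignored when computing the percentage.

```python
def gc_percent(seq):
    """Percent G+C among called (A/C/G/T) bases; None if no bases were called."""
    called = [b for b in seq.upper() if b in "ACGT"]
    if not called:
        return None
    gc = sum(1 for b in called if b in "GC")
    return 100.0 * gc / len(called)


def gc_histogram(seqs, bin_width=10):
    """Count reads per GC bin; keys are the lower edge of each bin (hypothetical binning)."""
    counts = {}
    for s in seqs:
        g = gc_percent(s)
        if g is None:
            continue  # skip reads made up entirely of N
        # Put reads with exactly 100% GC into the top bin.
        b = min(int(g // bin_width) * bin_width, 100 - bin_width)
        counts[b] = counts.get(b, 0) + 1
    return dict(sorted(counts.items()))
```

Feeding it the R2 sequences (every second line of each four-line FASTQ record) gives a rough histogram that can be compared between your samples and a public 10x dataset.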
I have generally not bothered to run FastQC on 10x data. Once you do the analysis using the 10x software you can start paying attention to its metrics (number of cells, reads per cell, etc.).
You could run FastQC on data 10x makes available on their site to see how your data compares.
Thanks for the suggestion! I will do that.
Perhaps my question is too broad, but the thing that concerns me the most is that there are 1-2 bp with slightly lower quality, although not bad enough to fail the "Per base sequence quality" test. These correspond to red blocks on the "Per tile sequence quality" plot and a small spike in the "Per base N content" plot. Is this something to be concerned about?
I've run CellRanger, and the downstream metrics you mention, such as reads per cell, seem okay.
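To put a number on the N spike rather than reading it off the plot, the per-position N fraction can be tallied straight from the FASTQ. A rough sketch, assuming standard four-line FASTQ records (no wrapped sequence lines); the function name `per_position_n_fraction` is hypothetical.

```python
from collections import Counter


def per_position_n_fraction(fastq_lines):
    """Return a list where entry i is the fraction of reads with 'N' at 0-based position i.

    Assumes four lines per record, with the sequence on the second line.
    """
    n_counts = Counter()
    totals = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # only the sequence line of each record
            continue
        for pos, base in enumerate(line.strip().upper()):
            totals[pos] += 1
            if base == "N":
                n_counts[pos] += 1
    length = max(totals) + 1 if totals else 0
    return [n_counts[p] / totals[p] for p in range(length)]
```

For a gzipped file, something like `with gzip.open(path, "rt") as fh: fractions = per_position_n_fraction(fh)` should work, where `path` is your own R2 file. If the fraction around position 54 is a small share of reads, that fits the picture of a few bad tiles rather than a problem with the whole run.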
If you had a few bad tiles or N's in places, those sequences should have been taken care of by STAR, which cellranger uses to do the alignments. If your run metrics look okay in CellRanger then you are probably fine to proceed.
Thanks! I think the run metrics are okay, but "Reads Mapped to [the mouse] Genome" is 90% for some samples and 80% for others, which is lower than in the 10x example data you linked to, where it's over 95% (e.g. here). Would that concern you?
Test data provided by 10x is likely ideal, so it is not terribly surprising that yours does not look as good. 90% alignment is not bad, but this is something you will need to judge yourself. I will see if one of the other mods with more experience with user data wants to chime in.
It's a bit easier to judge the results if you provide the actual figures. From what you've been describing, neither FastQC nor CellRanger summaries are raising any red flags.