I am doing a study based on GEO RNA-seq data using the Galaxy server. Quality control and mapping are done and the results look fine, with one exception: some of the samples have an alignment rate of only 40%. The alignment results show that this low rate is due to reads mapping to multiple loci rather than reads being unmapped. Can I disregard this given my study design (the data are for mRNA expression analysis)? Or is the cause external contamination that I should resolve before continuing the analysis?
Please stop using RNA seq data analysis as tags: the website separates each word into its own tag. Instead, use RNA-seq as a tag; you don't need the "data analysis" part.
By the way the alignment's results show that this low alignment score is due to mapping to multi loci rather than unmapping.
That is not how things usually go. A multimapper is still a mapped read, just not a uniquely mapped one. If 40% is mapped then 60% is unmapped. That is unusual and might indicate contamination, be it genomic DNA or foreign DNA/RNA. It is impossible for us to know.
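To make the distinction concrete, here is a minimal sketch of the arithmetic. The read counts are made up for illustration and the function name `mapping_breakdown` is hypothetical; in practice you would take the three numbers from your aligner's summary report.

```python
def mapping_breakdown(unique, multi, unmapped):
    """Return the percentage of reads in each category.

    All three arguments are read counts. The numbers used below are
    hypothetical and only illustrate the arithmetic.
    """
    total = unique + multi + unmapped
    if total == 0:
        raise ValueError("no reads to summarize")

    def pct(n):
        return round(100.0 * n / total, 1)

    return {
        "unique": pct(unique),
        "multi": pct(multi),
        # Multimappers count as mapped reads, just not uniquely mapped ones.
        "mapped_total": pct(unique + multi),
        "unmapped": pct(unmapped),
    }


# Two hypothetical samples with the same 40% unique rate.
mostly_multimapped = mapping_breakdown(unique=4_000_000, multi=5_000_000, unmapped=1_000_000)
mostly_unmapped = mapping_breakdown(unique=4_000_000, multi=0, unmapped=6_000_000)

print(mostly_multimapped)
print(mostly_unmapped)
```

Both samples report 40% uniquely mapped reads, but only the second one has a large unmapped fraction, which is the situation that would point towards contamination.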
With published data that is always a pain, since you have essentially zero information on how the libraries were actually made and what happened in the lab. I would do the usual downstream analysis, like PCA, and see whether this looks meaningful. If not, maybe look for other datasets. It's hard to give good advice here since there is zero context.
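As a rough sketch of the PCA check, assuming you have a genes-by-samples matrix of raw counts and NumPy available: the log2 counts-per-million transform, the function name `pca_samples`, and the toy matrix are all my own assumptions for illustration, not something taken from your data or a specific pipeline.

```python
import numpy as np


def pca_samples(counts, n_components=2):
    """PCA of samples from a genes x samples matrix of raw counts.

    Returns (scores, explained), where scores has one row per sample and
    explained is the fraction of variance captured by each component.
    """
    counts = np.asarray(counts, dtype=float)
    # Scale each sample to counts per million, then log-transform
    # with a pseudocount of 1 to tame highly expressed genes.
    cpm = counts / counts.sum(axis=0, keepdims=True) * 1e6
    logged = np.log2(cpm + 1.0)
    # Put samples in rows and center every gene at zero.
    x = logged.T
    x = x - x.mean(axis=0, keepdims=True)
    # SVD of the centered matrix gives the principal components.
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    variance = s ** 2
    explained = variance[:n_components] / variance.sum()
    return scores, explained


# Hypothetical toy matrix: 5 genes (rows) x 4 samples (columns),
# where samples 0-1 and samples 2-3 are two conditions.
toy_counts = [
    [100, 110, 10, 12],
    [50, 45, 200, 210],
    [30, 33, 31, 29],
    [500, 480, 90, 100],
    [20, 25, 300, 280],
]
scores, explained = pca_samples(toy_counts)
print(scores)
print(explained)
```

If the samples separate by condition on the first component or two, and the low-alignment samples do not cluster on their own, that is a reasonable sign the data are usable.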
Thank you. 40% were mapped uniquely, nearly 50% were mapped to multiple loci, and only a few reads were unmapped. You're right, I will continue the downstream analysis.