8.3 years ago
aggregatibacter
Hi,
I have RNA-seq data from human tissue biopsies with different diseases. My pipeline was FastQC - Trimmomatic - FastQC - STAR - featureCounts - voom/limma. I removed rRNAs during library prep, and also removed them from the Ensembl annotation GTF at the featureCounts step.
Now I am puzzled to find quite a lot of ribosomal protein genes among the differentially expressed genes. I understand that these are not rRNAs, but I have never seen so many in one place.
Should I be worried, or could this be normal biology?
Cheers
Protein synthesis is often altered during stress. I think it has biological meaning and is not a technical artifact (I have seen it a lot in expression studies).
Thanks for your quick reply. I also found some papers that describe functions in inflammation etc. for these transcripts. They only make up roughly a third of my list, so I was wondering...
It might be a result of bad normalization of the counts. If it is biologically reasonable that there is a different number of ribosomes in one condition than in the other, then the result should be valid. Otherwise, you might see an apparent change in genes that did not actually change, because some other genes changed expression in the opposite direction and the normalization did not account for it. I recommend running the analysis with DESeq2 to see whether the result is reproducible with its normalization.
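To illustrate the point about normalization, here is a minimal sketch in plain Python using made-up toy counts (not the poster's data). It compares simple total-count scaling with a median-of-ratios size factor in the style of DESeq2. The function names and the count matrix are hypothetical, and the real DESeq2 implementation handles more cases than this.

```python
import math
from statistics import median

def total_count_factors(counts):
    """Scale factors proportional to each sample's library size (total counts)."""
    lib_sizes = [sum(col) for col in zip(*counts)]
    mean_size = sum(lib_sizes) / len(lib_sizes)
    return [size / mean_size for size in lib_sizes]

def median_of_ratios_factors(counts):
    """Median-of-ratios size factors (simplified, DESeq-style).

    For each gene with non-zero counts in every sample, take the log of the
    geometric mean across samples; each sample's factor is the median ratio
    of its counts to those per-gene geometric means.
    """
    n_samples = len(counts[0])
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Toy data (hypothetical): rows are genes, columns are samples A and B.
# Two genes are 10x higher in B; the other eight are truly unchanged.
counts = [[100, 1000], [100, 1000]] + [[100, 100] for _ in range(8)]

tc = total_count_factors(counts)
mor = median_of_ratios_factors(counts)

# Apparent B/A fold change of an unchanged gene after each normalization.
unchanged = counts[2]
fc_total = (unchanged[1] / tc[1]) / (unchanged[0] / tc[0])
fc_mor = (unchanged[1] / mor[1]) / (unchanged[0] / mor[0])

print("total-count fold change of an unchanged gene:", round(fc_total, 3))
print("median-of-ratios fold change of an unchanged gene:", round(fc_mor, 3))
```

In this toy case, total-count scaling makes the eight unchanged genes look down-regulated in B, simply because two other genes take up a larger share of the library. The median-of-ratios factors leave them at a fold change of 1. Whether something similar is happening in a real data set depends on how many genes change and in which direction.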
Did you perform any size-factor normalization? The result is certainly biologically possible, but the pipeline you describe does not mention any step that adjusts for differences in library size between samples. A missing adjustment would also be an easy way for such differences to appear.
I used edgeR to account for the different library sizes (without an extra argument, so it should use the total counts). See code below. Separately, I get relatively large weights (some more than 2-fold), judging from my experience with limma and arrays. Could this be a problem, too?
Many thanks!