filter genes for differential gene expression
7 weeks ago
bleven • 0

Hello,

I am trying to analyze human endogenous retrovirus (HERV) RNA-seq data from two groups. I am running a differential gene expression analysis to see which genes are upregulated and downregulated in control vs. disease. I am struggling to find any significant differences because many genes have very low or no expression. I found this paper https://www.frontiersin.org/journals/aging-neuroscience/articles/10.3389/fnagi.2023.1186470/full#B65 where they first filter out lowly expressed genes. I just want to make sure I am not manipulating the data. How should I filter the data (e.g. filter on each gene's average raw count, on the mean, etc.)?

Thank you!

RNA-seq differential-gene-expression • 235 views
7 weeks ago

The typical approach is something like: keep only genes with at least X counts in at least Y samples.
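The "at least X counts in at least Y" rule can be sketched in a few lines. This is only an illustration in Python with NumPy, not the DESeq2 code itself; the function name `filter_low_counts`, the thresholds (10 counts, 3 samples) and the toy matrix are all made up for the example. Note that the rule never looks at which group a sample belongs to, which is why this kind of filtering is not considered manipulating the data.

```python
import numpy as np

def filter_low_counts(counts, min_count=10, min_samples=3):
    """Return a boolean mask of the genes (rows) to keep.

    counts: 2-D array, genes x samples, raw counts.
    min_count / min_samples: the X and Y of the rule (illustrative defaults).
    """
    counts = np.asarray(counts)
    # For each gene, count the samples in which it reaches min_count.
    n_passing = (counts >= min_count).sum(axis=1)
    return n_passing >= min_samples

# Toy matrix: 4 genes x 6 samples (invented numbers).
counts = np.array([
    [0, 0, 1, 0, 0, 2],        # almost never detected
    [15, 22, 9, 30, 11, 18],   # well expressed everywhere
    [0, 0, 0, 40, 55, 61],     # expressed in three samples only
    [12, 0, 0, 0, 0, 0],       # detected in a single sample
])

keep = filter_low_counts(counts, min_count=10, min_samples=3)
filtered = counts[keep]
print(keep)            # mask of genes that pass the filter
print(filtered.shape)  # remaining genes x samples
```

A common choice for Y is the size of the smallest group, so that a gene expressed in only one condition (like the third row above) is still kept.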

See the DESeq2 example here

However, if you are using DESeq2, prefiltering is not required for statistical purposes; it mainly reduces memory use and speeds up computation. DESeq2 already applies its own independent filtering on mean normalized counts when you call results().

It's important to understand why you aren't getting significant differentially expressed genes. Have you visualised the data using basic approaches like PCA? Do your samples separate along the principal components according to your two groups? Have you taken the top 50 genes by logFC and plotted a heatmap to see how expression differs between the groups? Do you know what size of expression difference to expect, for example what differences in endogenous retrovirus expression have other groups typically reported?
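As a rough sketch of those two checks, the Python snippet below computes PCA scores per sample and ranks genes by absolute log fold change. Everything in it is an assumption made for illustration: the helper names `log_cpm`, `pca_scores` and `top_genes_by_logfc`, the simulated counts, and the use of log2(CPM + 1) as a simple stand-in for a proper variance-stabilising transformation such as the one DESeq2 provides.

```python
import numpy as np

def log_cpm(counts):
    """log2(counts per million + 1); a simple stand-in transformation."""
    counts = np.asarray(counts, dtype=float)
    cpm = counts / counts.sum(axis=0, keepdims=True) * 1e6
    return np.log2(cpm + 1.0)

def pca_scores(counts, n_components=2):
    """PCA via SVD with samples as observations; returns scores and variance fractions."""
    x = log_cpm(counts).T                      # samples x genes
    x = x - x.mean(axis=0, keepdims=True)      # centre each gene
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return u[:, :n_components] * s[:n_components], explained[:n_components]

def top_genes_by_logfc(counts, is_disease, n_top=50):
    """Indices of the n_top genes with the largest absolute log2 fold change."""
    logged = log_cpm(counts)
    is_disease = np.asarray(is_disease, dtype=bool)
    logfc = logged[:, is_disease].mean(axis=1) - logged[:, ~is_disease].mean(axis=1)
    order = np.argsort(-np.abs(logfc))
    return order[:n_top], logfc

# Simulated data: 20 genes x 6 samples, first 3 control, last 3 disease.
rng = np.random.default_rng(0)
counts = rng.poisson(50, size=(20, 6))
counts[0, 3:] += 500                           # plant one strongly upregulated gene
is_disease = np.array([False, False, False, True, True, True])

scores, explained = pca_scores(counts)
top, logfc = top_genes_by_logfc(counts, is_disease, n_top=5)
print(scores[:, 0])   # PC1 score per sample; plot PC1 vs PC2 coloured by group
print(top)            # gene indices to show in a heatmap
```

If the PC1 or PC2 scores do not separate control from disease in your real data, a lack of significant genes is less surprising, and it is worth checking for batch effects or outlier samples before changing the filtering.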
