Bulk rna sequencing
0
0
Entering edit mode
3 months ago
Rosie • 0

Hello, I really need help....

I am currently working on doing differential expression analysis of bulk rna sequencing data. However I end of getting very different F, p and p-adj as compared to the report produced by R-shiny. the logFC is quite similar, its like I am getting 1.400 logFC using a R script and R shiny gets 1.3947 logFC on similar one gene.

I am really confused as to why is this happening.

My script I used in R is:

#Read the subread_counts file
x <- read.csv(file = 'filename.csv', header = TRUE, row.names = 1)

#Setting group identifier
group <- factor(c("Co","Co","Co","T","T","T"))

#Creating a DGElist object
dge <- DGEList(counts = x, group = group)

# Filtering  lowly expressed genes
keep <- filterByExpr(dge)
dge <- dge[keep, , keep.lib.sizes = FALSE]

# Normalize the data
dge_norm <- calcNormFactors(dge)

dim(dge_norm)

# Estimate dispersion
Disp_dge <- estimateDisp(dge_norm)

# Fit the model
fit <- glmQLFit(Disp_dge)

# Perform the differential expression analysis using quasi likelihood ratio
result <- glmQLFTest(fit)

# Extract the top differentially expressed genes
top_genes <- topTags(result, n = nrow(result$table))

# Save the results to a CSV file
output_file <- "<pathtodirectory/filename>"
write.csv(top_genes$table, file = output_file)
RNA-seq differential-expression • 533 views
ADD COMMENT
0
Entering edit mode

Are you sure that you set the parameters that Rshiny app is using in its inbuilt function and using the same version of tools that R Shiny have used. There will be changes with regards to parameters set and version of tool that is generally known. Also random seed set influences the result a lot. Based on the random seed set the results vary. If you change the random seed then results inverse.

ADD REPLY
0
Entering edit mode

I am very new to all this stuff. Is there a way I can check what parameters is Shiny app using?? Thank you

ADD REPLY
0
Entering edit mode

Please contact the person who developed the app

ADD REPLY
0
Entering edit mode

May you please show the output of:

str(x)
str(group)
ADD REPLY
0
Entering edit mode

Hi, It is not necessarily surprising to have small differences between several DGE methods. But if you applied, with DESeq2 and edgeR for example, a protocol with:

  • same genes pre-filtered
  • same normalization method
  • same statistical test
  • same design
  • same regression method

Your p-values should be positively correlated, not necessarily similar but with the same p-values ranking, especially for the top genes example here

If you don't use the same statistical test (Wald test vs LRT) the ranking should be preserved for the top of genes, but no linear correlation. It is not theoretical argument, I have test it in practice

ADD REPLY

Login before adding your answer.

Traffic: 1923 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6