Why do most genes have high padj values?

6 months ago

mnx0723 • 0

After receiving the paired-end FASTQ files, I generated the expression matrix with Trimmomatic, STAR, and featureCounts, running each program with its basic options. I then used DESeq2 to find differentially expressed genes. The following is my data. When I filtered on padj < 0.05, the number of genes was much smaller than for my other samples. Looking at the table, the p-values themselves are generally high, and many padj values are above 0.05 or NA.

I'm not sure what exactly went wrong. If this problem is specific to paired-end data as opposed to single-end data, can you tell me which step of the protocol is likely the problem?

![enter image description here][1]

dim(deseq_result_200805_1)
  [1] 33850     6
filtered_200805_1 <- deseq_result_200805_1 %>% filter(padj < 0.05)
dim(filtered_200805_1)
  [1] 788   6
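For context on the counts above, here is a minimal base-R sketch of why a results table can contain many padj values above 0.05 and many NA values. It assumes a toy data frame with made-up numbers that only borrows the DESeq2 column names (`baseMean`, `pvalue`, `padj`); it does not reproduce the real data. As far as I know, DESeq2 reports NA for padj on genes removed by independent filtering (low mean count) and NA for both pvalue and padj on genes with all-zero counts or an extreme count outlier, so NA values by themselves do not indicate a problem with paired-end processing.

```r
# Toy results table using DESeq2-style column names (all values are made up).
res <- data.frame(
  gene     = paste0("gene", 1:8),
  baseMean = c(0, 1.2, 3.5, 150, 220, 480, 900, 1300),
  pvalue   = c(NA, 0.60, 0.04, 0.001, 0.03, 0.20, 0.0004, 0.75)
)

# Benjamini-Hochberg adjustment; genes with an NA p-value keep an NA padj.
res$padj <- p.adjust(res$pvalue, method = "BH")

# Tally where each gene ends up relative to the 0.05 cutoff.
status <- ifelse(is.na(res$padj), "NA",
                 ifelse(res$padj < 0.05, "padj < 0.05", "padj >= 0.05"))
counts <- table(status)
print(counts)

# An adjusted p-value is never smaller than the raw p-value.
stopifnot(all(res$padj >= res$pvalue, na.rm = TRUE))
```

In this toy table, the two genes with raw p-values of 0.03 and 0.04 end up above 0.05 after adjustment, which is the usual reason the padj-filtered list is shorter than a p-value-filtered one. The same tally can be run on the `padj` column of a real table such as `deseq_result_200805_1`.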

RNA-seq DEG • 580 views

updated 6 months ago by Ram 44k • written 6 months ago by mnx0723 • 0

Please do not paste screenshots of plain-text content; it is counterproductive. You can copy and paste the content directly here (using the code formatting option), or use a GitHub Gist if the content exceeds the length allowed here.

6 months ago by Pierre Lindenbaum 164k

Sorry, I made edits.

6 months ago by mnx0723 • 0

788 differential genes is a lot. What is the problem?

6 months ago by ATpoint 85k

For my other samples, the number of genes is at least 3,000 higher than for this one, so this number seemed relatively small. I guess I was in a bit of a hurry.