I loaded bulk RNA-seq data from leukaemia cells into R and ran DESeq2. I have three questions:
1) I am confused about the limits people place on differentially expressed genes (DEGs). So far, I have only filtered on the adjusted p-value to isolate significantly differentially expressed genes. However, I don't know whether I should also impose a cut-off on the log2 fold change.
2) How do I map the Ensembl IDs to gene names?
3) Is it possible to filter the DEGs to isolate mitochondrial genes? For example, taking a list of mitochondrial genes such as the one from MitoCarta, is it possible to read it into R and isolate the results that correspond to the list?
I'm new to bioinformatics, so any help is appreciated!
1) Try a volcano plot. You need thresholds on both padj and fold change if you are looking for genes whose expression changes substantially AND you want to be confident in that result. padj will give you the latter but not the former.
2) Search the forum and the web; there are many ways to do this. Your best bet is biomaRt, the R package.
3) It should be possible. Again, biomaRt is your best bet. Look for a way to map the Ensembl IDs to their contig (chromosome); all MT genes will map to the chrM/MT contig.
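The volcano plot suggested in this reply can be sketched in base R as follows. This is only an illustration: the data frame is a toy stand-in for a DESeq2 results table (only the `log2FoldChange` and `padj` column names follow DESeq2), and the two cut-offs are assumed values, not recommendations.

```r
# Toy stand-in for as.data.frame(results(dds)); all values are invented.
res <- data.frame(
  log2FoldChange = c(2.5, -1.8, 0.3, 1.2, -0.1, -3.0),
  padj           = c(1e-6, 0.003, 0.40, 0.20, NA, 1e-10)
)

padj_cut <- 0.05  # assumed significance threshold
lfc_cut  <- 1     # assumed |log2FC| threshold

# Highlight a gene only if it passes both thresholds.
# Genes with NA padj are treated as not significant.
sig <- !is.na(res$padj) & res$padj < padj_cut & abs(res$log2FoldChange) > lfc_cut

plot(res$log2FoldChange, -log10(res$padj),
     col = ifelse(sig, "red", "grey50"), pch = 19,
     xlab = "log2 fold change", ylab = "-log10(adjusted p-value)")

# Dashed guide lines at the chosen cut-offs.
abline(v = c(-lfc_cut, lfc_cut), h = -log10(padj_cut), lty = 2)
```

Points in the upper-left and upper-right corners pass both thresholds; moving the dashed lines shows how sensitive the DEG list is to the chosen cut-offs.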
1) I am confused about the limits people place on differentially
expressed genes (DEGs). So far, I have only filtered on the adjusted
p-value to isolate significantly differentially expressed genes.
However, I don't know whether I should also impose a cut-off on the
log2 fold change.
It is better to set a log2FC threshold alongside the adjusted p-value before calling genes DE in your test samples compared with your control samples. People usually require an absolute log2FC of at least 1 or 2, but this is not a universal criterion. Choose the threshold that best suits your data.
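As a rough sketch of that filtering step, assuming the DESeq2 results have been converted to a data frame: the rows below are toy values, `gene_id` is a hypothetical column name, and both cut-offs are placeholders to adjust for your own data.

```r
# Toy stand-in for as.data.frame(results(dds)); all values are invented.
res <- data.frame(
  gene_id        = c("ENSG00000198888", "ENSG00000141510",
                     "ENSG00000111640", "ENSG00000075624"),
  log2FoldChange = c(-2.1, 1.4, 0.2, 3.0),
  padj           = c(0.001, 0.03, 0.80, NA),
  stringsAsFactors = FALSE
)

padj_cut <- 0.05  # assumed adjusted p-value cut-off
lfc_cut  <- 1     # assumed |log2FC| cut-off

# Keep genes passing both thresholds. DESeq2 can report NA for padj,
# so those rows are excluded explicitly.
deg <- subset(res, !is.na(padj) & padj < padj_cut & abs(log2FoldChange) >= lfc_cut)

# Label the direction of change and sort by significance.
deg$direction <- ifelse(deg$log2FoldChange > 0, "up", "down")
deg <- deg[order(deg$padj), ]
```

Using `abs()` applies the same cut-off to up- and down-regulated genes, which is what a range such as -1 to 1 implies.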
2) How do I map the ENSEMBL ids to the gene name?
Please see the linked thread, "How to use biomart on R to convert Ensembl Gene IDs to Symbols?". You can use biomaRt, an R package, for this.
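A minimal sketch of a biomaRt lookup is below. It assumes human data (the `hsapiens_gene_ensembl` dataset), that biomaRt is installed from Bioconductor, and that an internet connection is available. The helper names `strip_version` and `map_ensembl_to_symbol` are hypothetical, introduced only for this example.

```r
# Some pipelines append a version suffix to Ensembl IDs (a dot followed by
# digits); the Ensembl filter expects the bare ID, so strip it first.
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

# Query Ensembl for the gene symbol and chromosome of each ID.
# The dataset defaults to human; change it for another species.
map_ensembl_to_symbol <- function(ids, dataset = "hsapiens_gene_ensembl") {
  mart <- biomaRt::useEnsembl(biomart = "genes", dataset = dataset)
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name", "chromosome_name"),
    filters    = "ensembl_gene_id",
    values     = unique(strip_version(ids)),
    mart       = mart
  )
}

# Usage (not run here; needs biomaRt and network access):
# res_df         <- as.data.frame(res)   # res = DESeq2 results object
# res_df$gene_id <- strip_version(rownames(res_df))
# annot          <- map_ensembl_to_symbol(res_df$gene_id)
# res_annot      <- merge(res_df, annot, by.x = "gene_id",
#                         by.y = "ensembl_gene_id", all.x = TRUE)
```

Requesting `chromosome_name` in the same query also gives you the column needed to pick out genes on the MT contig later. `all.x = TRUE` keeps genes that have no symbol in Ensembl instead of silently dropping them.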
3) Is it possible to filter the DEGs to isolate mitochondrial genes?
For example, taking a list such as a list of mitochondrial genes from
MitoCarta, is it possible to parse that through R and isolate the
results that correspond to the list?
Yes, it is very easy in R, and there are different ways of doing it. One way is to make a list of mitochondrial genes and merge it with your DESeq2 results.
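The merge approach might look like the sketch below. Everything here is illustrative: `res` is a toy results table that already has gene symbols, and `mito` is a hand-typed stand-in for a gene list exported from MitoCarta (the real file's column names will differ, so rename them before merging). Note that MitoCarta mostly lists nuclear-encoded genes whose proteins localise to mitochondria, so a list-based merge will catch genes that a chromosome-based filter would miss.

```r
# Toy DESeq2-style results with gene symbols; all values are invented.
res <- data.frame(
  symbol         = c("MT-ND1", "TFAM", "SDHA", "TP53", "GAPDH"),
  log2FoldChange = c(-2.1, 1.6, 0.4, 1.4, 0.2),
  padj           = c(0.001, 0.01, 0.30, 0.03, 0.80),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for a MitoCarta export, e.g. read with read.csv().
mito <- data.frame(
  symbol = c("MT-ND1", "TFAM", "SDHA", "NDUFS1"),
  stringsAsFactors = FALSE
)

# Inner join: keep only results whose symbol appears in the mito list.
mito_res <- merge(res, mito, by = "symbol")

# Optionally apply the same DEG thresholds afterwards (assumed cut-offs).
mito_deg <- subset(mito_res, !is.na(padj) & padj < 0.05 & abs(log2FoldChange) >= 1)
```

If you only need a yes/no flag and want to keep all rows, `res$is_mito <- res$symbol %in% mito$symbol` does the same matching without dropping anything.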
1) I am confused about the limits people place on differentially
expressed genes (DEGs). So far, I have only filtered on the adjusted
p-value to isolate significantly differentially expressed genes.
However, I don't know whether I should also impose a cut-off on the
log2 fold change.
There is no "rule" per se; however, most people use an adjusted p-value < 0.05 and |log2FC| > 1. The log2FC cut-off helps reduce the DEG list to the most biologically relevant genes. This is similar to the cut-offs applied in other experiments such as western blots or RT-qPCR.
3) Is it possible to filter the DEGs to isolate mitochondrial genes?
For example, taking a list such as a list of mitochondrial genes from
MitoCarta, is it possible to parse that through R and isolate the
results that correspond to the list?
Yes, as @Ram said, you can use the chrM/MT contig annotation (or the "MT-" prefix that human mitochondrial gene symbols carry) to identify mitochondrially encoded genes with a filter(grepl()) command.
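A small sketch of that pattern-based filter, written in base R so it runs without extra packages; with dplyr loaded, `filter(res, grepl("^MT-", symbol))` would be the equivalent call. The table is toy data, the column names `symbol` and `chromosome_name` are assumptions about how your results were annotated, and the `MT-` prefix assumes human gene symbols.

```r
# Toy annotated results; padj values are invented.
res <- data.frame(
  symbol          = c("MT-ND1", "MT-CO1", "TP53", "MTOR"),
  chromosome_name = c("MT", "MT", "17", "1"),
  padj            = c(0.001, 0.02, 0.03, 0.50),
  stringsAsFactors = FALSE
)

# Option 1: match the "MT-" prefix of the symbol. The "^" anchor and the
# hyphen matter: without them, nuclear genes such as MTOR would also match.
by_symbol <- res[grepl("^MT-", res$symbol), ]

# Option 2: use the chromosome/contig name, which is "MT" in Ensembl-style
# annotation and "chrM" in UCSC-style annotation.
by_chrom <- res[res$chromosome_name %in% c("MT", "chrM"), ]
```

Both options only recover genes encoded on the mitochondrial genome itself; nuclear-encoded mitochondrial genes need a curated list such as MitoCarta.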