Differential gene expression using edgeR and a generalised linear model (GLM)
5.5 years ago

Hi, I am new to edgeR and do not have much of a background in R programming, since I have used Python for my analyses so far. The counts obtained from htseq-count for the different samples are stored in separate files in different folders; every file is named "counts.txt" and each folder is named after its sample. The goal is to compare 30 samples with 20 other samples, which represent pro-vaccine and pre-vaccine samples. Each counts.txt file contains two columns, one for gene names and one for counts. As I learned from the documentation, I have to use "readDGE" to read the "counts.txt" file in each folder. I want to do the differential gene expression analysis using a GLM, and I have to identify DE genes using the log2 fold change and the likelihood ratio (LR) test in edgeR. I don't know where to start, how to collate the data into a DGEList, or what the subsequent steps are. I started with some code like the following:

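library(edgeR)
directory1 <- "/home/ali/Desktop/SAMPLES1/"
directory2 <- "/home/ali/Desktop/SAMPLES2/"
# Each sample folder holds its own counts.txt, so search recursively (paths stay relative to the directory)
files1 <- list.files(directory1, pattern = "^counts\\.txt$", recursive = TRUE)
files2 <- list.files(directory2, pattern = "^counts\\.txt$", recursive = TRUE)
# Tell readDGE where the files live and label each column with its folder (sample) name
x <- readDGE(files1, path = directory1, columns = c(1, 2), labels = dirname(files1), header = FALSE)
d <- readDGE(files2, path = directory2, columns = c(1, 2), labels = dirname(files2), header = FALSE)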

And the second fact is that I don't need to normalize my data. Can anyone guide me through the steps I have to follow?

Thanks in advance

RNA-Seq edgeR R • 1.6k views

See the manual section on "Reading counts from a file" for how to get the htseq-count output into R and how to run an edgeR analysis. I strongly recommend sticking to the default analysis path rather than adding custom analysis strategies. Putting together a single count matrix for all samples (columns = samples, rows = genes) before reading it into R might be desirable, as this is the default input format. This can be done with Unix tools such as cut.
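As an alternative to Unix tools, the collation into one gene-by-sample matrix can also be sketched in base R. This is only a minimal sketch under stated assumptions: `collate_counts` is a hypothetical helper (not part of edgeR), the folder layout is one `counts.txt` per sample folder as described in the question, all files are assumed to list the same genes, and the demo data written to a temporary directory are made up.

```r
# Hypothetical helper (not an edgeR function): collate <sample>/counts.txt
# files found directly under `dir` into one genes-by-samples matrix.
collate_counts <- function(dir, file_name = "counts.txt") {
  # Folder names are taken as sample names
  samples <- list.dirs(dir, full.names = FALSE, recursive = FALSE)
  per_sample <- lapply(samples, function(s) {
    tab <- read.delim(file.path(dir, s, file_name), header = FALSE,
                      col.names = c("gene", "count"),
                      stringsAsFactors = FALSE)
    # htseq-count appends summary rows such as "__no_feature"; drop them
    tab <- tab[!grepl("^__", tab$gene), ]
    setNames(tab$count, tab$gene)
  })
  genes <- names(per_sample[[1]])
  # Assumption: every file lists the same set of genes
  same_genes <- vapply(per_sample,
                       function(v) setequal(names(v), genes), logical(1))
  stopifnot(all(same_genes))
  # Align every sample to the gene order of the first one, then bind columns
  counts <- do.call(cbind, lapply(per_sample, function(v) v[genes]))
  colnames(counts) <- samples
  counts
}

# Made-up demo data in a temporary directory (illustration only)
demo_dir <- file.path(tempdir(), "SAMPLES_demo")
demo <- list(sampleA = c(10, 0, 5), sampleB = c(7, 3, 2))
for (s in names(demo)) {
  dir.create(file.path(demo_dir, s), recursive = TRUE, showWarnings = FALSE)
  write.table(data.frame(c("gene1", "gene2", "__no_feature"), demo[[s]]),
              file.path(demo_dir, s, "counts.txt"), sep = "\t",
              quote = FALSE, row.names = FALSE, col.names = FALSE)
}
counts <- collate_counts(demo_dir)
```

The resulting matrix can then be passed to edgeR's `DGEList(counts = counts, group = group)`, where `group` is a factor giving the condition of each column.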

"And the second fact is that I don't need to normalize my data"

What makes you say that? Different samples have different total read counts and library compositions, so normalization is always necessary.
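For orientation, the default edgeR path, including normalization, a GLM fit and the likelihood ratio test, might look roughly like the sketch below. It is not the poster's analysis: the counts are simulated, the sample and gene names are invented, and only six samples are used instead of 30 versus 20. The group labels "pre" and "pro" simply mirror the wording of the question.

```r
library(edgeR)

set.seed(1)
# Simulated stand-in data (an assumption, not real counts): 200 genes, 6 samples
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
# "pre" is the reference level, so fold changes are reported as pro vs pre
group <- factor(rep(c("pre", "pro"), each = 3), levels = c("pre", "pro"))

y <- DGEList(counts = counts, group = group)

# Remove genes with too few reads to be informative
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes = FALSE]

# TMM normalization factors account for library composition differences
y <- calcNormFactors(y)

# Design matrix: intercept (pre) plus the pro-vs-pre effect
design <- model.matrix(~ group)

# Estimate dispersions, fit the GLM, and test the second coefficient
y <- estimateDisp(y, design)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef = 2)

# Top genes ranked by p-value, with log2 fold changes and FDR
topTags(lrt)
```

With `levels = c("pre", "pro")`, coefficient 2 of the design matrix is the pro-versus-pre log2 fold change, which is what `glmLRT` tests here.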


I have already read the manual, but maybe I have to read it again to understand it well! The way things are described there is a little complicated and unclear to me. As for the normalization, skipping it is what my supervisor told me to do; as far as I know, it is a necessary step that we have to do!
