Hi
I have four aligned BAM files for two conditions, treated and untreated, each with two biological replicates. I can run featureCounts on the four BAM files separately, but I don't know how to make the output files compatible for downstream differential gene expression analysis with DESeq2. Can anyone help, please?
Thank you.
1) Get the counts (featureCounts is a good tool for that).
2) Provide 3 variables:
a. A matrix of counts (rows = genes, columns = samples).
b. A table with all the information on your samples (names, group = control/case, etc.).
c. A design (e.g. ~group).
3) Run DESeq2 following the manual.
Regarding the output of featureCounts: for each sample you should get a data frame with one row per gene (plus annotation columns), where the last column is the count for that gene. You can simply combine the outputs by column to get the count matrix required by DESeq2.
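To make the three inputs listed above concrete, here is a minimal sketch in R. The gene names, sample names and count values are made up for illustration, and the DESeq2 call is only attempted if the package is installed. Treat it as one way to set things up, not as the only one.

```r
# Toy count matrix: rows = genes, columns = samples (all values are made up).
counts <- matrix(
  c(10L, 12L, 30L, 28L,
    200L, 180L, 90L, 95L,
    0L, 1L, 5L, 7L),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("untreated_1", "untreated_2", "treated_1", "treated_2"))
)

# Sample table: one row per sample, in the same order as the matrix columns.
coldata <- data.frame(
  group = factor(c("untreated", "untreated", "treated", "treated"),
                 levels = c("untreated", "treated")),
  row.names = colnames(counts)
)

# The matrix columns and the sample table rows must describe the same samples.
stopifnot(identical(colnames(counts), rownames(coldata)))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData = coldata,
                                        design = ~ group)
  # With a real dataset you would continue with:
  # dds <- DESeq2::DESeq(dds)
  # res <- DESeq2::results(dds)
}
```

Setting "untreated" as the first factor level makes it the reference, so fold changes are reported as treated versus untreated.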
featureCounts produces a matrix directly when you feed it multiple BAM files (provide them in the order you want the samples to be in the matrix columns).
Hi Genomax2, I have the matrix files for all BAM files, but how do I load these files into DESeq2? I have been using the kallisto - tximport - DESeq2 pipeline, and there it was straightforward.
Thanks
Hi:
As I understand it, read.delim can only read one text file at a time. I have used it with edgeR and limma for differential gene expression. I am also wondering how to merge several Rsubread-generated text files for DESeq2. With edgeR and limma I have used readDGE, but I do not understand it very well.
For example, in the edgeR and limma manual I have, the files are imported like this:
files <- c("files1", "files2", "files3", "files4")
x <- readDGE(files, columns=c(1,3))
I thought that x was the count matrix, but it turns out it is not. Using colnames(x), I saw that x already contains my sample names as column names. I don't know how it works, so I am also interested in this question. Feeding Rsubread several BAM files looks like a good approach, but if I already have the individual text files from Rsubread, what is the solution?
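On the readDGE point: the object it returns is a DGEList rather than a plain matrix, and the count matrix is stored inside it as x$counts. A small sketch follows, assuming edgeR is installed. The file names, the three-column layout and the values are hypothetical, chosen only to match the columns=c(1,3) call above.

```r
# Two toy count files (hypothetical names and values): gene ID, length, count.
dge_dir <- tempfile("dge_")
dir.create(dge_dir)
toy <- list(files1 = c(10, 200, 0), files2 = c(30, 90, 5))
for (s in names(toy)) {
  write.table(data.frame(Geneid = c("geneA", "geneB", "geneC"),
                         Length = c(801, 1001, 300),
                         Count = toy[[s]]),
              file.path(dge_dir, paste0(s, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

if (requireNamespace("edgeR", quietly = TRUE)) {
  # columns = c(1, 3): column 1 holds the gene IDs, column 3 the counts.
  x <- edgeR::readDGE(c("files1.txt", "files2.txt"), path = dge_dir,
                      columns = c(1, 3), labels = c("files1", "files2"))
  print(class(x))              # a DGEList, not a plain matrix
  counts_from_dge <- x$counts  # the genes-by-samples count matrix
  print(x$samples)             # per-sample table, e.g. library sizes
}
```

The extracted x$counts matrix can then be used as the count input for DESeq2 in the same way as any other genes-by-samples matrix.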
Everything is explained in the DESeq2 manual.
Dear Genomax, can you tell me how to feed multiple BAM files to featureCounts?
featureCounts [options] -o counts.txt file1.bam file2.bam file3.bam etc
You need a matrix file with genes as your rows and samples as your columns. No need to merge the replicates.
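For loading that multi-sample file into R, here is a rough sketch. It assumes the usual featureCounts layout: a leading comment line starting with "#", then six annotation columns (Geneid, Chr, Start, End, Strand, Length) followed by one count column per BAM file. The toy file written here, including its sample names and values, is made up so the example runs on its own.

```r
# Write a small file that mimics the featureCounts layout (values are made up).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "# Program:featureCounts; Command: (omitted)",
  paste(c("Geneid", "Chr", "Start", "End", "Strand", "Length",
          "untreated_1.bam", "untreated_2.bam",
          "treated_1.bam", "treated_2.bam"), collapse = "\t"),
  paste(c("geneA", "chr1", "100", "900", "+", "801",
          "10", "12", "30", "28"), collapse = "\t"),
  paste(c("geneB", "chr2", "500", "1500", "-", "1001",
          "200", "180", "90", "95"), collapse = "\t")
), path)

# Skip the leading comment line and keep the BAM file names as column names.
fc <- read.delim(path, comment.char = "#", check.names = FALSE)

# Drop the six annotation columns; what remains is the count matrix.
counts <- as.matrix(fc[, -(1:6)])
rownames(counts) <- fc$Geneid
colnames(counts) <- sub("\\.bam$", "", colnames(counts))
```

The resulting counts object has genes as rows and samples as columns, which is the shape DESeq2 expects for its count input.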
help(read.delim)
Combine read.delim with lapply. Have a look at how readDGE works internally.
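As an illustration of the lapply idea, here is a sketch that reads several per-sample files and binds the last column of each into one matrix. It assumes every file lists the same genes in the same order and has the count in its last column. The toy files, sample names and values are invented so the example is self-contained.

```r
# Create toy per-sample files that mimic single-BAM featureCounts output.
out_dir <- tempfile("fc_")
dir.create(out_dir)
genes <- c("geneA", "geneB", "geneC")
toy <- list(untreated_1 = c(10, 200, 0), treated_1 = c(30, 90, 5))
for (s in names(toy)) {
  df <- data.frame(Geneid = genes, Chr = "chr1", Start = 1, End = 100,
                   Strand = "+", Length = 100, count = toy[[s]])
  names(df)[7] <- paste0(s, ".bam")
  con <- file(file.path(out_dir, paste0(s, ".txt")), "w")
  writeLines("# Program:featureCounts; Command: (omitted)", con)
  write.table(df, con, sep = "\t", quote = FALSE, row.names = FALSE)
  close(con)
}

# One path per sample; the names become the matrix column names.
files <- file.path(out_dir, paste0(names(toy), ".txt"))
names(files) <- names(toy)

# Read every file with the same read.delim settings.
tables <- lapply(files, read.delim, comment.char = "#", check.names = FALSE)

# All files must list the same genes in the same order before binding columns.
ids <- tables[[1]]$Geneid
stopifnot(all(vapply(tables, function(tab) identical(tab$Geneid, ids),
                     logical(1))))

# Take the last column (the count) from each table and bind them side by side.
counts <- sapply(tables, function(tab) tab[[ncol(tab)]])
rownames(counts) <- ids
```

If the gene order could differ between files, merging on the gene ID column would be safer than binding columns directly.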