Hey All,
I have almost 160 output files in a folder from featureCounts (RNA-Seq quantification), and now I want to put them into one data frame to be used for DESeq.
The featureCounts output has 7 columns:
Geneid Chr Start End Strand Length Sample1
What I need is a data frame with Geneid as the first column and the remaining columns taken from the 7th column of each file:
Geneid Sample1 Sample2 Sample3 Sample4 ...
The code that I have used is:
library(limma)
library(edgeR)
files <- list.files(pattern = "\\.txt$")  # list the 160 sample files in the directory
x <- readDGE(files, columns=c(1,7))
counts=as.data.frame(x$counts)
But I get the error
"Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names"
after running x <- readDGE(files, columns=c(1,7)).
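A possible explanation for this error, offered as an assumption since the thread does not confirm it: featureCounts output normally begins with a comment line starting with "#", and read.delim() (which readDGE uses to read each file) does not skip "#" lines by default, so that single-field line is taken as the header. The sketch below reproduces this with a small mock file (the file contents and sample name are made up for illustration) and shows that comment.char = "#" avoids it.

```r
# Mock featureCounts-style file: one "#" comment line, then the header row.
# (Hypothetical contents; a real file comes from featureCounts.)
path <- tempfile(fileext = ".txt")
writeLines(c(
  "# Program:featureCounts (mock comment line)",
  paste("Geneid", "Chr", "Start", "End", "Strand", "Length", "Sample1", sep = "\t"),
  paste("geneA", "chr1", 1, 100, "+", 100, 10, sep = "\t"),
  paste("geneB", "chr1", 200, 400, "-", 201, 0, sep = "\t")
), path)

# By default read.delim() does not treat "#" as a comment character, so the
# comment line is used as the header and the read fails. The error message
# is captured as a string instead of stopping the script.
res <- tryCatch(read.delim(path), error = function(e) conditionMessage(e))
print(res)

# Skipping "#" lines lets the real header row be used.
df <- read.delim(path, comment.char = "#")
print(df[, c(1, 7)])
```

The readDGE documentation states that extra arguments are passed on to read.delim, so adding comment.char = "#" (or skip = 1) to the readDGE call may resolve the error in the same way; that has not been tested here.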
FeatureCounts can take multiple alignment files as input.
@sej I already have the results from featureCounts as multiple txt files; I need to combine all the txt files into one data frame.
You can merge multiple data frames by row names, assuming that unique gene IDs are defined as row names: https://stackoverflow.com/questions/7739578/merge-data-frames-based-on-rownames-in-r
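A minimal sketch of the merge idea in base R, assuming each file has the 7-column layout described in the question plus a leading "#" comment line. It merges on the Geneid column rather than on row names, which gives the same result when gene IDs are unique. The mock files, sample names, and the helpers write_mock and read_counts are hypothetical and exist only so the example runs on its own.

```r
# Create two tiny mock featureCounts outputs in a temporary folder.
tmp <- tempfile("fc_demo_")
dir.create(tmp)
write_mock <- function(sample, counts) {
  path <- file.path(tmp, paste0(sample, ".txt"))
  writeLines(c(
    "# Program:featureCounts (mock comment line)",
    paste("Geneid", "Chr", "Start", "End", "Strand", "Length", sample, sep = "\t"),
    paste("geneA", "chr1", 1, 100, "+", 100, counts[1], sep = "\t"),
    paste("geneB", "chr1", 200, 400, "-", 201, counts[2], sep = "\t")
  ), path)
  path
}
files <- c(write_mock("Sample1", c(10, 0)), write_mock("Sample2", c(3, 7)))

# Read one file, skipping "#" lines, and keep only Geneid and the count column.
read_counts <- function(path) {
  df <- read.delim(path, comment.char = "#", check.names = FALSE)
  df[, c(1, 7)]
}

# Merge all per-sample tables on Geneid, one after another.
count_list <- lapply(files, read_counts)
counts <- Reduce(function(a, b) merge(a, b, by = "Geneid"), count_list)
print(counts)

# Count matrix with gene IDs as row names, a common input shape for DESeq.
count_matrix <- as.matrix(counts[, -1])
rownames(count_matrix) <- counts$Geneid
```

With real data, files would instead come from list.files() on the folder holding the 160 outputs. Repeated merge() calls are simple but not fast; if every file lists the same genes in the same order, binding the count columns side by side with cbind() would be a quicker alternative.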
While you could do what @Sej has already suggested, rerunning featureCounts with all the BAM files in the right order will make it much easier to get a single matrix file, so you should not need to rearrange the matrix later.
If you have run featureCounts on each BAM file separately and would like to merge the counts from the individual files, check this article: https://www.reneshbedre.com/blog/featurecounts-matrix.html