3.8 years ago
shaden
Hi, how can I merge CSV files in R that have the same columns but different numbers of rows? Columns are samples and rows are genes; some files have 500 rows, others about 700. I want to combine multiple files of alignment results into one data frame so that it is ready for differential analysis.
So far, this is the code I'm using:
library(tidyverse)  # provides %>%, map(), reduce(), read_csv(), mutate(), tibble()
# find all file names ending in reads.csv (pattern is a regular expression, not a glob)
data_path <- "Desktop/Reads"
files <- dir(path = data_path, pattern = "reads\\.csv$")
files
# read in all the files, prepending the path to each filename, then stack the rows
data <- files %>%
  map(~ read_csv(file.path(data_path, .))) %>%
  reduce(rbind)
# sample names: the file names with the "reads.csv" suffix removed
name1 <- gsub("reads\\.csv$", "", files)
# keep each file's contents in a list-column next to its file name
data_full <- tibble(filename = files) %>%
  mutate(file_contents = map(filename,
                             ~ read_csv(file.path(data_path, .))))
data_full
This is probably inefficient, but the quickest way I can think of is to subset the 700-gene table so that it contains the same genes as the 500-gene one, and merge the two. Then rbind the merged table with the 200 genes you set aside. Wrap everything in a loop (or function) and you can merge all the files you need.
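A different route to the same result is a full outer join on the gene column, which keeps every gene from every table in one step. This is only a minimal sketch in base R, not the approach described above: the toy data frames, the `gene` column name, and the sample names are all made up for illustration, and it assumes each file carries its own sample columns.

```r
# Toy stand-ins for two count files (hypothetical gene and sample names).
file_a <- data.frame(gene = c("geneA", "geneB", "geneC"),
                     sample1 = c(10, 0, 4))
file_b <- data.frame(gene = c("geneB", "geneC", "geneD"),
                     sample2 = c(7, 2, 9))

tables <- list(file_a, file_b)

# Full outer join on the gene column: all = TRUE keeps genes
# that appear in only one of the tables.
merged <- Reduce(function(x, y) merge(x, y, by = "gene", all = TRUE), tables)

# Genes missing from a file come back as NA; here they are treated as zero counts.
merged[is.na(merged)] <- 0
merged
```

Because `Reduce` folds over the list, the same call works for any number of tables, so no explicit loop is needed.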
Can you add the first few lines of one of your count files to the post?
Thanks! I have found the answer: the readDGE function in edgeR.
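For anyone landing here later, a rough sketch of how readDGE might be called. It assumes each file holds a single sample, with gene IDs in the first column and counts in the second, which may not match the layout in the question. The toy files, gene names, and sample labels are invented for illustration. Extra arguments such as `sep` are passed on to `read.delim()`, which is needed here because the files are comma-separated.

```r
library(edgeR)

# Write two toy single-sample count files to a temporary directory
# (hypothetical gene names and counts, for illustration only).
tmp <- tempfile("reads_")
dir.create(tmp)
write.csv(data.frame(gene = c("geneA", "geneB"), count = c(5, 3)),
          file.path(tmp, "s1_reads.csv"), row.names = FALSE)
write.csv(data.frame(gene = c("geneB", "geneC"), count = c(7, 2)),
          file.path(tmp, "s2_reads.csv"), row.names = FALSE)

files <- dir(tmp, pattern = "reads\\.csv$")

# columns = c(1, 2): gene IDs in column 1, counts in column 2.
# sep = "," is forwarded to read.delim() because the files are CSV.
dge <- readDGE(files, path = tmp, columns = c(1, 2),
               labels = sub("_reads\\.csv$", "", files), sep = ",")

dge$counts  # genes x samples matrix
```

readDGE enters a count of zero for any gene that is missing from a given file, so the differing row counts are handled without manual subsetting.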