Hello,
I hope you are safe and well.
Could someone share what a count matrix for input into Seurat is supposed to look like?
I have count matrices; however, each cell's count matrix is in a separate file.
This is the data I want to analyze in Monocle 3: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM2978831 The files I'm looking at are in the Supplementary section.
I was advised to aggregate these files into one count matrix file and then bring it into Seurat to normalize it. Then, from Seurat, transform the normalized data and use it as input to Monocle.
This is what I have used to aggregate the data:
> setwd ("~/Desktop/GSE110154_RAW/csv/")
> files <- list.files(path="~/Desktop/GSE110154_RAW/csv/")
> genes <- read.table(files[1], header=FALSE, sep=",")[,1]
> df <- do.call(cbind,lapply(files,function(fn)read.table(fn,header=FALSE, sep=",")[,2]))
> df <- cbind(genes,df)
> head (df)
which results in:
genes
[1,] "1/2-SBSRNA4" "0" "0" "0" "0" "0" "0" "3" "0" "77" "0" "0" "0"
[2,] "A1BG" "0" "0" "0" "58" "0" "0" "0" "0" "0" "0" "0" "0"
[3,] "A1BG-AS1" "0" "38" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0"
[4,] "A1CF" "0" "8" "0" "123" "8" "418" "0" "144" "0" "108" "21" "0"
[5,] "A2LD1" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "12" "0"
[6,] "A2M" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0"
and then to write the files I did:
> write.table(df, "~/Desktop/GSE110154_RAW/df4.csv", row.names = F, col.names=F, sep = ",")
which results in:
"1/2-SBSRNA4","0","0","0","0","0","0","3","0","77","0","0","0","0","3","0","0","1","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","7","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","5","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","22","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","2","0","0","0","0","0","38","0","0","0","3","0","0",...
"A1BG","0","0","0","58","0","0","0","0","0","0","0","0","0","10","0","0","0","0","0","0","0","0","0","0","0","23","0","0","0","0","0","0","0","0","0","3","0","0","0","0","0","0","0","0","0","0","10","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","2","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","14","0","0","2","35","0","0","0","0","0","40","0","0","0","0","0","0","0","26","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","6","0","0","0","0","0","0","0","11","0","0","0","0","0","38","0","0","0","0","0","0","42","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","30","0","0","0","0","0","0","0","0","0","0","0","0","0","13","0","0","0","0","0","0","18","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0",...
etc...
I'm just not quite sure what a Seurat count matrix is supposed to look like.
I also need to find a good tutorial on how to input this data into Seurat afterwards, normalize and transform it, to input into Monocle3.
I would greatly appreciate anyone's help!
Very Respectfully, Pratik
Side note: Avoid setting working directories in R code. Create a directory for each task/project/whatever, and either create an R project there, or store the R script and read files using their full path. Allow for the fact that files can be moved - so create soft links to files in your working folder. That way, you'll at least have a record of where the file was (and hopefully the file is saved in a better location than on the Desktop or in the Downloads folder).
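To make the side note concrete, here is a minimal sketch that reads files through full paths instead of calling setwd(). The folder layout (`data/csv` inside a project folder) and the file names are hypothetical, and `tempdir()` only stands in for a real project folder so the snippet can run anywhere.

```r
# Hypothetical project layout: <project_dir>/data/csv/*.csv
# tempdir() stands in for the real project folder so the sketch runs anywhere.
project_dir <- file.path(tempdir(), "GSE110154_project")
csv_dir <- file.path(project_dir, "data", "csv")
dir.create(csv_dir, recursive = TRUE, showWarnings = FALSE)

# Two tiny stand-in files so there is something to list.
writeLines(c("A1BG,0", "A2M,5"), file.path(csv_dir, "cell_1.csv"))
writeLines(c("A1BG,58", "A2M,0"), file.path(csv_dir, "cell_2.csv"))

# full.names = TRUE returns paths that include csv_dir, so no setwd() is needed.
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
first <- read.table(files[1], header = FALSE, sep = ",")
```

Because every path is built from `project_dir`, moving the project only means changing that one line.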
Thank you RamRS!
Any clue on what a count matrix for input into Seurat is supposed to look like?
Or a tutorial on how to input this data into Seurat, normalize and transform it, and then input it into Monocle3?
I'm not a single cell RNAseq person, others should be able to help you with that. I had some suggestions on basic R programming/project organization practices, which I mentioned.
Thank you RamRS, I really do appreciate your guidance! I eventually want to become proficient at R programming/project organization practices.
Very Respectfully, Pratik
This is incredible wisdom I was thinking about today. Thank you for looking way ahead for me. Although I was kind-of rudely snappy/out-of-place like a turtle haha... This is valuable to me now as I try to be better organized. Thank you for looking out Ram : )
Also this StackOverflow question/answer helped supplement your guidance to explain it to me like I am 5 : )
Glad it has been helpful, Pratik! I myself still stick to these rules to ensure easier context switching - loading an R project would bring all relevant scripts, files and plots into a sandbox that I can then play in.
Hi, I have a similar problem to yours, Mr. Pratik Mehta. Did you find out how to handle it, and how to give a count matrix to Seurat as input?
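Hey, yeah, you just need to have your cells as columns and your genes as rows, with the gene names as row names and the cell IDs as column names. The "meat" of the data frame should be your counts, stored as numbers rather than text. This should be good to input into Seurat.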
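For anyone landing here later, a rough sketch of that layout in base R follows. It assumes, as in the original question, one two-column file per cell (gene name, count) with the genes in the same order in every file. The file names and the tiny stand-in data are made up for illustration. Note that `cbind(genes, df)` in the original code combines a character vector with a numeric matrix, which turns every count into text (hence the quoted "0" values); keeping the gene names as row names avoids that.

```r
# Hypothetical stand-in data: one two-column file per cell (gene, count),
# mimicking the per-cell files described in the question.
csv_dir <- file.path(tempdir(), "per_cell_counts")
dir.create(csv_dir, showWarnings = FALSE)
writeLines(c("A1BG,0", "A1CF,8", "A2M,0"), file.path(csv_dir, "cell_1.csv"))
writeLines(c("A1BG,58", "A1CF,123", "A2M,0"), file.path(csv_dir, "cell_2.csv"))

files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file once; keep the gene column so row order can be checked.
tables <- lapply(files, read.table, header = FALSE, sep = ",",
                 stringsAsFactors = FALSE)
genes <- tables[[1]][, 1]
same_order <- all(vapply(tables, function(tb) identical(tb[, 1], genes),
                         logical(1)))
stopifnot(same_order)

# Bind only the count columns so the matrix stays numeric, then store
# gene names as row names and cell IDs as column names.
counts <- do.call(cbind, lapply(tables, function(tb) tb[, 2]))
rownames(counts) <- genes
colnames(counts) <- sub("\\.csv$", "", basename(files))

# With Seurat installed, the matrix could then be passed on, for example:
# seurat_obj <- Seurat::CreateSeuratObject(counts = counts)
```

If the matrix needs to be saved, `write.csv(counts, ...)` keeps the row and column names by default, so the gene and cell labels survive the round trip.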