Hi all, new to Bioconductor/scRNA-seq analyses here.
For an upcoming project, I have chosen to analyze this dataset, "Generation of a Broadly Useful Model for COVID-19 Pathogenesis, Vaccination, and Treatment".
I have preprocessed and loaded the .tsv file (from supplemental files in GEO) into a SingleCellExperiment. My understanding is that the row names are the individual mouse genes and the column names are the six samples that were taken.
Here is how my SingleCellExperiment looks in R:
> sc3.sc #what my sce is called
class: SingleCellExperiment
dim: 55339 6
metadata(0):
assays(1): counts
rownames(55339): 0610005C13Rik 0610006L08Rik ... n-TStga1 SARS-CoV-2
rowData names(0):
colnames(6): Ad5.Empty.rep1 Ad5.Empty.rep2 ... Ad.hACE2.rep2 Ad.hACE2.rep3
colData names(0):
reducedDimNames(0):
altExpNames(0):
What would you advise going forward, and what should my end product be?
Furthermore, can I create a clustering map with this type of data, and how applicable would that be for this dataset? Individual cells do not seem to be in the columns, as there are only 6.
I do not really have a particular hypothesis in mind; I would like to know what end product makes the most sense given this dataset and data type.
That is not really a way to approach a project - you should have some idea of what you're looking for. Without that, the dataset is just a bunch of random data points. Why did you pick this dataset in particular? What about it appeals to you?
I picked this dataset because I want to do some clustering related to COVID-19 and its effect on the lungs.
What can you tell me about this dataset, the attributes it contains, and how it relates to COVID-19 and lungs? What does this dataset have that other COVID-19 datasets don't? What was the process by which you arrived at this dataset?
To my understanding (I am really new to this), this dataset has mouse genes in the rows and 6 columns, which seem to be the test subjects. It seems to involve mice that were gene-edited so that they would become susceptible to the respiratory effects of COVID-19. I was just looking for COVID-19 datasets on GEO; that is how I found it.
I am still wondering what the practical possibilities are when it comes to graphing this dataset (for example, does this mean clustering?).
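For what it's worth, with only six columns any clustering would be of samples rather than cells. Below is a minimal base-R sketch of that idea, not a recommended pipeline: it normalises counts to log2 counts per million, keeps the most variable genes, then runs hierarchical clustering and PCA on the samples. The matrix here is simulated, and the sample names (`control.rep1`, `treated.rep1`, ...), the gene counts, and the cut-off of 500 genes are all made up for illustration. With the real object, the simulated matrix would presumably be replaced by `counts(sc3.sc)`.

```r
# Simulated stand-in for the real genes x samples count matrix.
# Sample names and sizes are hypothetical, chosen only for illustration.
set.seed(1)
n_genes <- 2000
sample_names <- c(paste0("control.rep", 1:3), paste0("treated.rep", 1:3))
counts_mat <- matrix(
  rnbinom(n_genes * 6, mu = 50, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), sample_names)
)
# Inflate the first 200 genes in the "treated" columns so the groups differ.
counts_mat[1:200, 4:6] <- counts_mat[1:200, 4:6] * 4

# Library-size normalisation: log2 counts per million with a pseudocount of 1.
lib_size <- colSums(counts_mat)
log_cpm <- log2(t(t(counts_mat) / lib_size) * 1e6 + 1)

# Keep the 500 most variable genes (an arbitrary cut-off).
gene_var <- apply(log_cpm, 1, var)
top_genes <- order(gene_var, decreasing = TRUE)[1:500]
mat <- log_cpm[top_genes, ]

# Hierarchical clustering of the six samples on Euclidean distance.
sample_dist <- dist(t(mat))
hc <- hclust(sample_dist, method = "average")
groups <- cutree(hc, k = 2)

# PCA on the samples; keep the first two components for plotting.
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)
pc_scores <- pca$x[, 1:2]

# Plots are only drawn in an interactive session.
if (interactive()) {
  plot(hc, main = "Sample clustering")
  plot(pc_scores, pch = 19, col = groups, main = "PCA of samples")
  text(pc_scores, labels = rownames(pc_scores), pos = 3)
}
```

The dendrogram and PCA plot would show whether replicates of the same condition group together, which is about as far as clustering can go with six samples.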
Just some training, I presume? You could try to follow the methods that my colleague and I used here: https://www.biorxiv.org/content/10.1101/271411v1.article-info
We never published that. Publishing already-published data in this way is difficult.
Yes - it was for training, thank you!