4.4 years ago
re_raz
I downloaded processed data from GEO. It is a cell count matrix saved in an RDS file:
List of 3
$ exon :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. ..@ i : int [1:6960887] 0 1 2 3 4 5 6 7 8 9 ...
.. ..@ p : int [1:17058] 0 95 181 265 353 444 534 622 716 803 ...
.. ..@ Dim : int [1:2] 34463 17057
.. ..@ Dimnames:List of 2
.. .. ..$ : chr [1:34463] "KLK11" "CCDC159" "C7orf50" "RP11-1437A8.4" ...
.. .. ..$ : chr [1:17057] "TTGCTAAGCAGT" "AACGACGGGTCT" "AACGGGGGCGAG" "CAGCAGAGGGTC" ...
.. ..@ x : num [1:6960887] 1 1 1 1 1 1 1 1 1 1 ...
.. ..@ factors : list()
$ intron :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. ..@ i : int [1:10773470] 0 1 2 3 4 5 6 7 8 9 ...
.. ..@ p : int [1:17058] 0 558 1249 2235 2793 3699 4218 5038 5546 6377 ...
.. ..@ Dim : int [1:2] 23682 17057
.. ..@ Dimnames:List of 2
.. .. ..$ : chr [1:23682] "RP11-505K9.4" "RP11-718G2.5" "C3orf22" "PIGZ" ...
.. .. ..$ : chr [1:17057] "TTGCTAAGCAGT" "AACGACGGGTCT" "AACGGGGGCGAG" "CAGCAGAGGGTC" ...
.. ..@ x : num [1:10773470] 1 1 1 1 1 1 1 1 1 1 ...
.. ..@ factors : list()
$ spanning:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. ..@ i : int [1:271900] 0 1 2 3 4 5 6 7 8 9 ...
.. ..@ p : int [1:17058] 0 5 19 33 44 52 61 71 75 87 ...
.. ..@ Dim : int [1:2] 20176 17057
.. ..@ Dimnames:List of 2
.. .. ..$ : chr [1:20176] "STX4" "GPAM" "RPL3" "NDUFAF1" ...
.. .. ..$ : chr [1:17057] "TTGCTAAGCAGT" "AACGACGGGTCT" "AACGGGGGCGAG" "CAGCAGAGGGTC" ...
.. ..@ x : num [1:271900] 1 1 1 1 1 1 1 1 1 1 ...
.. ..@ factors : list()
How can I analyze this and extract the expression of every gene? Does the Seurat package work for this?
What is your analysis goal? Looks like single-cell data. Please read any single-cell guide first, e.g. Seurat manual or OSCA workflow. Biostars is great at providing help with specific questions, but these open-ended questions are typically not appreciated simply because users are usually reluctant to take you by the hand if no specific problem is obvious and you require a complete tutorial.
ATpoint, I really wonder: do some features of the presented data really provide clues to its single-cell origin? Or did you guess just because TS mentioned Seurat?
dgCMatrix is a sparse matrix format commonly used in the single-cell world. Dimnames such as TTGCTAAGCAGT look like cellular barcodes, commonly used in single-cell applications. Assays such as exon and intron probably represent spliced and unspliced counts, often input for things like velocity analysis, so this is probably scRNA-seq. Still, this is just a best guess, which is why OP should elaborate.
Ok! Thank you for the explanation!
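To make the point about dgCMatrix concrete, here is a minimal sketch of how the slots in the str() output above fit together. The toy matrix, gene names and cell names are made up for illustration and are not the GEO data; only the Matrix package is assumed.

```r
library(Matrix)

# Toy 3-gene x 2-cell count matrix (hypothetical values, not the GEO data).
m <- sparseMatrix(
  i = c(1, 3, 2),   # 1-based row (gene) indices of the non-zero entries
  j = c(1, 1, 2),   # 1-based column (cell) indices
  x = c(5, 2, 7),   # the counts themselves
  dims = c(3, 2),
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

class(m)    # dgCMatrix: compressed, column-oriented sparse storage
m@i         # 0-based row index of every stored value
m@p         # column pointers; length is ncol + 1
m@x         # the stored non-zero values
diff(m@p)   # number of stored entries (detected genes) per cell
```

Read the same way, the p slot of the exon matrix in the question (0 95 181 ...) says the first barcode has 95 stored entries and the second has 86, which is the kind of sparsity you would expect from droplet-based single-cell data.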
Thanks for your answer, and please accept my apologies for the open-ended question. I am new to bioinformatics and got lost while looking for tutorials online. The scRNA-seq reads were aligned and the count matrix was assembled using STAR and dropEst. I need to calculate gene expression from the matrix.
No problem. As said, I suggest you go through https://osca.bioconductor.org/ as this extensively covers the most common topics. Seurat is fine as well but I personally like the Bioconductor tools more since documentation (for me personally) is better (=more extensive).
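As a starting point before working through a full workflow, here is a rough sketch of how per-gene expression could be pulled out of one of these matrices with the Matrix package alone. It rests on assumptions: the toy `counts` list stands in for whatever `readRDS()` returns for the real file, the exon matrix is used as the gene-level counts, and the normalization (scale each cell to 10,000 counts, then log1p) is one common convention rather than the only valid choice. Note that the three matrices in the question have different numbers of rows, so they cannot simply be added together without first matching gene names.

```r
library(Matrix)

# Hypothetical stand-in for the list returned by readRDS() on the real file:
# sparse matrices with genes in rows and cell barcodes in columns.
counts <- list(
  exon = sparseMatrix(
    i = c(1, 2, 1, 3), j = c(1, 1, 2, 2), x = c(3, 1, 2, 2),
    dims = c(3, 2),
    dimnames = list(c("geneA", "geneB", "geneC"), c("AAAC", "TTTG"))
  )
)

exon <- counts$exon

# Total counts per cell (library size) and per gene.
lib_size    <- colSums(exon)
gene_totals <- rowSums(exon)

# Scale every cell to 10,000 counts, then apply log1p.
# diff(p) gives the number of stored entries per column, so rep() lines up
# each stored value with the library size of its own cell.
# Assumes no cell has a library size of zero.
norm <- exon
norm@x <- log1p(norm@x / rep(unname(lib_size), diff(norm@p)) * 1e4)

# Normalized expression of a single gene across all cells.
norm["geneA", ]
```

Working on the x slot directly keeps the matrix sparse, which matters at the size shown in the question (tens of thousands of genes by about 17,000 cells). The same kind of matrix can also be handed to a dedicated toolkit such as Seurat or the Bioconductor packages covered in the OSCA book, which take care of normalization, quality control and downstream steps.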