How to assign row names and column names to a sparse matrix
3.0 years ago

Hi,

I'm performing single-cell RNA sequencing differential expression analyses in R using a public dataset, which can be found here: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM4952363. It has three data files: barcodes, features, and matrix.

These are the commands I have run so far:

library(Matrix)
mat <- Matrix::readMM("GSM4952363_OS_1_matrix.mtx")
features <- read.delim("GSM4952363_OS_1_features.tsv")
barcodes <- read.delim("GSM4952363_OS_1_barcodes.tsv")

I am trying to add colnames (barcodes) and rownames (features) to the sparse matrix (mat), which contains the expression data, but it gives an error message.

rownames(mat) <- features

Error in dimnamesGets(x, value) : invalid dimnames given for “dgTMatrix” object

Does anyone know why I received this error message? Please help, thank you.

colnames rownames matrix scRNAseq • 3.9k views
3.0 years ago
ATpoint 86k
library(Matrix)

mat <- Matrix::readMM("~/Downloads/GSM4952363_OS_1_matrix.mtx.gz")
features <- read.delim("~/Downloads/GSM4952363_OS_1_features.tsv.gz",
                       header=FALSE)
barcodes <- read.delim("~/Downloads/GSM4952363_OS_1_barcodes.tsv.gz",
                       header=FALSE)

colnames(mat) <- barcodes[,1]
rownames(mat) <- features[,1]

Mind the header=FALSE: these files have no header row. Without it, read.delim treats the first line as a header, which leaves features and barcodes one entry short of the dimensions of mat. Also assign a single column (a vector) as the row or column names, not the whole data frame. Note that features has two columns, one with the Ensembl gene ID and the other with the gene name. Which one you use is up to you.
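To see the header effect without downloading anything, here is a minimal sketch on toy files. The file names, gene IDs, and barcodes below are made up for illustration and are not taken from GSM4952363. It writes a tiny matrix plus features and barcodes files to a temporary directory, shows that the default header=TRUE drops a row, and checks the lengths before assigning names.

```r
library(Matrix)

# Toy stand-ins for the three 10x-style files (hypothetical values).
tmp <- tempdir()
mtx_file  <- file.path(tmp, "toy_matrix.mtx")
feat_file <- file.path(tmp, "toy_features.tsv")
bc_file   <- file.path(tmp, "toy_barcodes.tsv")

toy <- sparseMatrix(i = c(1, 2, 3, 1), j = c(1, 2, 3, 4),
                    x = c(5, 1, 2, 7), dims = c(3, 4))
writeMM(toy, mtx_file)
writeLines(c("ENSG01\tGeneA", "ENSG02\tGeneB", "ENSG03\tGeneC"), feat_file)
writeLines(c("AAAC-1", "AAAG-1", "AACT-1", "AAGG-1"), bc_file)

mat <- readMM(mtx_file)  # 3 genes x 4 cells

# With the default header=TRUE the first line becomes the column names,
# so the table is one row short of nrow(mat).
features_bad <- read.delim(feat_file)
nrow(features_bad)  # 2 here, while nrow(mat) is 3

features <- read.delim(feat_file, header = FALSE)
barcodes <- read.delim(bc_file, header = FALSE)

# Fail early if the lengths do not match the matrix dimensions.
stopifnot(nrow(features) == nrow(mat),
          nrow(barcodes) == ncol(mat))

# Assign plain character vectors, not data frames.
rownames(mat) <- as.character(features[, 1])
colnames(mat) <- as.character(barcodes[, 1])
```

The stopifnot() check is optional, but it turns a somewhat cryptic dimnames error into an obvious length mismatch.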


OK, thank you so much for your help.


Do you know how I might construct my single cell experiment from this stage?


Type ?SingleCellExperiment and learn single-cell analysis via https://bioconductor.org/books/release/OSCA/
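As a rough starting point only, a named sparse count matrix can be wrapped like this. This is a minimal sketch assuming the Bioconductor package SingleCellExperiment is installed. The toy matrix, gene IDs, symbols, and barcodes are hypothetical stand-ins for the real mat, features, and barcodes.

```r
library(Matrix)
library(SingleCellExperiment)

# Toy count matrix with dimnames already set (hypothetical values).
# For the real data, something like as(mat, "CsparseMatrix") would play this role.
toy_counts <- sparseMatrix(
  i = c(1, 2, 3, 1), j = c(1, 2, 3, 4), x = c(5, 1, 2, 7),
  dims = c(3, 4),
  dimnames = list(c("ENSG01", "ENSG02", "ENSG03"),
                  c("AAAC-1", "AAAG-1", "AACT-1", "AAGG-1"))
)

# Genes are rows and cells are columns; the assay is conventionally named "counts".
sce <- SingleCellExperiment(assays = list(counts = toy_counts))

# Keep the second features column (gene names) as per-gene metadata.
rowData(sce)$symbol <- c("GeneA", "GeneB", "GeneC")

sce
```

Quality control, normalization, and the later steps are covered in the OSCA book linked above.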


OK, thanks. Could you help me understand this first step? I've started with chapter 10 before chapter 4 for preprocessing.

Chapter 10 can be found here - http://bioconductor.org/books/3.14/OSCA.multisample/chimeric-mouse-embryo-10x-genomics.html#chimeric-mouse-embryo-10x-genomics

10.2 Data loading

library(MouseGastrulationData)
sce.chimera <- WTChimeraData(samples=5:10)
sce.chimera

I'm confused about what samples=5:10 means and what WTChimeraData is (i.e. expression data, gene names, etc.). Since I have already loaded my data and merged the three files into one, do I need to do this step?


I am not going to walk you through that book, sorry. Trial and error, that is how I learned it.


Is this a good workflow to follow? https://rpubs.com/mathetal/DEGs
