Raw counts of scRNA-seq
3.4 years ago
zizigolu ★ 4.3k

Hello all

I need a PBMC dataset from patients with any solid tumor.

I found GSE114725, which has 9 PBMC samples.

I looked at GSE114725_RAW.tar, GSE114725_rna_imputed.csv.gz and GSE114725_rna_raw.csv.gz.

They say that the processed data are in the supplementary files.

I looked at the count matrices inside these files, but I see numbers like 0.1 and 0.2.

I am wondering what kind of values these are.

Are they normalized?

Could somebody please give me an idea?

Thanks


Hi,

Did the values 0.1 and 0.2 appear in the table GSE114725_rna_raw.csv.gz?

I just checked the first 1000 entries of this table:

library("dplyr")  # provides the %>% pipe
df <- read.csv(file = "Downloads/GSE114725_rna_raw.csv", header = TRUE, dec = ".", nrows = 1000)
# Drop the first 5 columns, sum the counts per cell, and take the remainder modulo 1:
# a non-zero remainder means the row sum is not a whole number
(df[, -c(1:5)] %>% rowSums()) %% 1

I summed the total counts per cell for the first 1000 cells, and all the sums are integers (the remainder of the division by 1 is zero), which would be odd if there were decimals in the data.

Therefore, are you sure that these numbers come from this file?
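
A whole-number row sum is a strong hint but not strict proof, since decimals within a cell can in principle add up to an integer. Here is a minimal sketch of a per-entry check on a small made-up matrix (the values and the names `toy`, `row_sum_flag` and `entry_flag` are illustrative only, not taken from GSE114725):

```r
# Toy matrix (made-up values): rows are cells, columns are genes.
toy <- rbind(
  cell1 = c(0, 2, 5),      # integer counts
  cell2 = c(0.5, 0.5, 1),  # decimals whose sum is still a whole number
  cell3 = c(0.1, 0.2, 0)   # decimals with a fractional sum
)

# Row-sum check: TRUE where the per-cell total is not a whole number.
row_sum_flag <- rowSums(toy) %% 1 != 0

# Per-entry check: TRUE where a cell contains at least one non-integer value.
entry_flag <- rowSums(toy != round(toy)) > 0
```

On real data, the same per-entry comparison could be applied to the count columns of `df` after dropping the first 5 columns.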

António

3.4 years ago
Gordon Smyth ★ 7.6k

As the file names make clear, there are two files, one containing raw counts and the other containing imputed counts. The raw counts are integers, as can easily be checked:

> x <- read.csv("GSE114725_rna_raw.csv.gz")
> y <- as.matrix(x[,-(1:5)])
> y[1:5,1:5]
     A1BG A2M A4GALT AAAS AACS
[1,]    0   0      0    0    0
[2,]    0   0      0    0    0
[3,]    0   0      0    0    0
[4,]    0   0      0    0    0
[5,]    0   0      0    0    0
> max(y-round(y))
[1] 0
> min(y-round(y))
[1] 0

It is the imputed values that are not integers, and they are indeed often negative. The imputed values are explained at some length in the published Cell paper that describes this dataset. If you want to understand what they are in detail, you would need to read the paper.
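
To compare the two files side by side, something like the following sketch could be used. The helper `describe_values` is hypothetical, and the two small matrices are made-up stand-ins for the raw and imputed tables, not values from the dataset:

```r
# Hypothetical helper: summarise the range of a numeric matrix and the
# fraction of entries that are non-integer or negative.
describe_values <- function(m) {
  m <- as.matrix(m)
  c(
    min = min(m),
    max = max(m),
    frac_non_integer = mean(m != round(m)),
    frac_negative = mean(m < 0)
  )
}

# Made-up stand-ins for the raw and imputed matrices.
raw_like <- matrix(c(0, 3, 1, 0, 12, 0), nrow = 2)
imputed_like <- matrix(c(0.1, -0.2, 1.4, 0, 2.7, -0.05), nrow = 2)

raw_summary <- describe_values(raw_like)
imputed_summary <- describe_values(imputed_like)
```

With the real files, the matrix `y` built above (and its counterpart read from the imputed file) could be passed to the same helper.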
