Do the values 0.1 and 0.2 actually appear in the table GSE114725_rna_raw.csv.gz?
I just checked the first 1000 entries of this table:
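library("dplyr")
# Read only the first 1000 rows (cells) of the raw table
df <- read.csv(file = "Downloads/GSE114725_rna_raw.csv", header = TRUE, dec = ".", nrows = 1000)
# Drop the first 5 columns, sum the counts per cell (row), then take the remainder modulo 1:
# a remainder of 0 means the row sum is a whole number
(df[, -c(1:5)] %>% rowSums(.)) %% 1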
I summed the total counts per cell for the first 1000 cells, and all the sums are integers (the remainder of the division by 1 is zero), which would be odd if the data contained decimals.
So, are you sure that these numbers come from this file?
As the file names make clear, there are two files: one containing raw counts and the other containing imputed counts.
The raw counts are integers, as can be easily checked:
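The code for this check is not shown here. The following is only a minimal sketch of one way to do it, not necessarily the check originally run. The helper `all_whole` and the toy data frames are hypothetical names introduced for illustration, and the assumption that the first 5 columns are not counts is taken from the code posted earlier in this thread. Unlike a row-sum test, it looks at every entry, so decimals that happen to sum to a whole number are still detected.

```r
# Hypothetical helper (not from the original thread): returns TRUE if every
# entry after the first `skip` columns is a whole number.
all_whole <- function(df, skip = 5) {
  keep <- seq_len(ncol(df)) > skip            # logical mask for the count columns
  counts <- as.matrix(df[, keep, drop = FALSE])
  all(counts %% 1 == 0, na.rm = TRUE)         # remainder 0 means whole number
}

# Toy data standing in for the real table: 5 non-count columns + 2 gene columns
toy_raw <- data.frame(a = 1:3, b = 1:3, c = 1:3, d = 1:3, e = 1:3,
                      gene1 = c(0, 2, 5), gene2 = c(1, 0, 3))
# Same table with decimal (and negative) values in one gene column
toy_decimal <- transform(toy_raw, gene1 = c(-0.1, 0.2, 4.7))

all_whole(toy_raw)      # TRUE
all_whole(toy_decimal)  # FALSE

# On the real file (path assumed from the earlier post):
# df <- read.csv("Downloads/GSE114725_rna_raw.csv", nrows = 1000)
# all_whole(df)
```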
It is the imputed values that are not integers and indeed often negative. The imputed values are explained at some length in the published Cell paper that describes this dataset. If you want to understand what they are in detail you would obviously need to read the paper.