Hi,
I'm trying to obtain raw counts for RNA-seq expression data for breast cancer. I extracted the data from the TCGA portal for RNAseq V1 rather than V2, because the latter does not possess "true" raw counts (its values are non-integers), as pointed out elsewhere: http://seqanswers.com/forums/showthread.php?t=42911
I was also guided to the ICGC data portal in the hope of obtaining an already parsed table, so I downloaded its RNA-seq raw counts file as well (exp_seq.BRCA-US.tsv). However, when I double-checked whether both sites (TCGA/ICGC) agree on the raw counts for the same individual, I found that they do not. For example, in TCGA I find:
Gene TCGA-AN-A0FL-01A TCGA-AN-A0FT-01A
ACAP3 4832 2580
ACAT1 8202 1916
while in ICGC, the raw count values for the same samples are:
Gene TCGA-AN-A0FL-01A TCGA-AN-A0FT-01A
ACAP3 0 1148
ACAT1 0 896
Both sites (TCGA / ICGC) state that they provide raw counts for RNA-seq expression data. Am I misinterpreting something here, or is there an extra normalization step that is not shown?
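In case it helps to reproduce the comparison, here is a minimal sketch of the kind of check I mean. It only uses the four values per site quoted above, pasted in as text; the helper names (parse_counts, compare) are made up for illustration, and it assumes both tables have already been reshaped into the same gene-by-sample layout, which is not how the downloaded files necessarily look.

```python
# Values copied from the two tables in this post (gene x sample, whitespace separated).
TCGA_TEXT = """Gene TCGA-AN-A0FL-01A TCGA-AN-A0FT-01A
ACAP3 4832 2580
ACAT1 8202 1916
"""
ICGC_TEXT = """Gene TCGA-AN-A0FL-01A TCGA-AN-A0FT-01A
ACAP3 0 1148
ACAT1 0 896
"""


def parse_counts(text):
    """Parse a gene x sample table into {(gene, sample): count}."""
    rows = [line.split() for line in text.strip().splitlines()]
    samples = rows[0][1:]  # header row: skip the "Gene" label
    counts = {}
    for row in rows[1:]:
        gene = row[0]
        for sample, value in zip(samples, row[1:]):
            counts[(gene, sample)] = float(value)
    return counts


def compare(a, b):
    """List (gene, sample, a_value, b_value, ratio) where the two tables differ.

    The ratio is None when the second value is zero.
    """
    diffs = []
    for key in sorted(a.keys() & b.keys()):
        if a[key] != b[key]:
            ratio = a[key] / b[key] if b[key] else None
            diffs.append((key[0], key[1], a[key], b[key], ratio))
    return diffs


tcga = parse_counts(TCGA_TEXT)
icgc = parse_counts(ICGC_TEXT)

# Raw counts should be whole numbers; this flags tables that are not.
all_integers = all(v.is_integer() for v in list(tcga.values()) + list(icgc.values()))

mismatches = compare(tcga, icgc)
for gene, sample, t, i, ratio in mismatches:
    print(gene, sample, t, i, ratio)
```

For the values shown, all four gene/sample pairs differ, and the TCGA/ICGC ratio is not the same for the two genes in sample TCGA-AN-A0FT-01A, so a single per-sample scaling factor does not seem to account for the difference.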
I would appreciate any help, thanks!