Dear All,
I have been trying to download the ovarian cancer RNA-Seq data (V1 and V2) from the TCGA portal, setting the Data Type to both RNASeq and RNASeqV2 along with the categories TN (tumor, matched normal) and NT (normal, matched tumor). For other cancer types, this typically returns both TN and NT data (color-coded blue and yellow). For breast cancer, for example, this means there is expression data for the same patients from both the tumor tissue and the normal blood (not necessarily from the same patient; it may be from different tissues of the same patient). Those are properly matched pairs, but I cannot find the same for ovarian cancer: TCGA has neither RNA-Seq nor microarray data for ovarian cancer patients in both the TN and NT categories. I would like to know whether I can get such data from some other public database (if someone has an idea), or whether someone is willing to share data or point me to an expression matrix (raw read counts) from both normal and tumor patients (minimum 5). The normal does not have to be from the same patient. It is just a pilot experiment to look for correlation with my mutation data. If anyone can point me to a data repository, I will try to retrieve the ovarian expression data from there. Thanks for the help.
You could search GEO. I don't know how relevant it is to you but I found this. You can refine your search and see what comes up.
Yes, I have actually found a GEO microarray data set that should suffice for my needs, and I am currently going through the paper. It is this set, which has microarray data from both high- and low-grade serous ovarian tumors and also from normal ovarian surface epithelium. I will try this data first, and if it does not give me the output I am after I will try the data you suggested. Since I am interested in both tumor grades, I need comprehensive expression data covering high and low grade, with normal tissue as a reference.
@komal.rathi
I would like to know whether the series matrix file, which contains the probe intensities for all the samples I am using, is normalized or not. I have previously worked with a few microarray data sets where I used the .CEL files to generate the expression data set and then performed differential expression analysis. Have you ever worked with a series matrix file? If so, are these data normalized? According to blogs and the GEO website they are mostly normalized sets, so I should be able to run differential analysis on them directly, but I would still like to hear from people who have already handled GEO microarray series matrix files for differential expression analysis. I wrote a very simple script to find the DEGs, without using any package. I selected 6 vs 6 samples (normal vs tumor) from the matrix file (assuming they are normalized probe intensities), divided them into two conditions, and calculated the p-value, the adjusted p-value, and the log2(fold change) for each probe. I then selected DEGs with an adjusted p-value < 0.05 and a fold change of at least 2, up or down. If you have done this before, can you please let me know whether this approach is valid? Or do I have to normalize the series matrix file? Thanks
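For reference, the 6 vs 6 comparison described in this post could look roughly like the sketch below. The post does not say which test or multiple-testing correction was used, so Welch's t-test (`scipy.stats.ttest_ind`) and Benjamini-Hochberg adjustment are assumptions here, as are the function names `bh_adjust` and `simple_deg` and the simulated data. It also assumes the intensities are already on a log2 scale, so the log2 fold change is a difference of group means and a fold change of 2 corresponds to |log2FC| >= 1.

```python
import numpy as np
from scipy import stats


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working from the largest p-value downwards.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def simple_deg(normal, tumor, alpha=0.05, min_abs_log2fc=1.0):
    """normal, tumor: arrays of shape (probes, samples), assumed log2 scale."""
    normal = np.asarray(normal, dtype=float)
    tumor = np.asarray(tumor, dtype=float)
    # Welch's t-test per probe (an assumption; the post names no test).
    _, pvals = stats.ttest_ind(tumor, normal, axis=1, equal_var=False)
    padj = bh_adjust(pvals)
    # On log2 data, the log2 fold change is a difference of means.
    log2fc = tumor.mean(axis=1) - normal.mean(axis=1)
    is_deg = (padj < alpha) & (np.abs(log2fc) >= min_abs_log2fc)
    return log2fc, pvals, padj, is_deg


if __name__ == "__main__":
    # Simulated intensities only; not taken from any GEO series.
    rng = np.random.default_rng(0)
    normal = rng.normal(8.0, 0.3, size=(200, 6))
    tumor = rng.normal(8.0, 0.3, size=(200, 6))
    tumor[:10] += 3.0  # first 10 probes simulated as up-regulated
    log2fc, pvals, padj, is_deg = simple_deg(normal, tumor)
    print(int(is_deg.sum()), "probes pass both cut-offs")
```

If the matrix turns out to hold raw (non-log) intensities, the values would need a log2 transform before the mean difference can be read as a log2 fold change.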
The series matrix files contain normalized data, but you will need to read the paper and GEO metadata to determine how. For the set that you are using, raw .CEL files are available, so you could start with .CEL files just as well.
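One way to see what the submitters did is to pull the processing notes out of the series matrix file itself. The sketch below assumes the usual layout: tab-separated metadata lines starting with `!`, a `!Sample_data_processing` row, and the expression table between `!series_matrix_table_begin` and `!series_matrix_table_end`. The function name `read_series_matrix` and the demo text are made up for illustration, and empty cells are treated as missing values.

```python
import csv
import io


def read_series_matrix(text):
    """Return (processing notes, table header, table rows) from series matrix text."""
    processing = set()
    header, rows = None, []
    in_table = False
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        key = fields[0]
        if key == "!Sample_data_processing":
            processing.update(f for f in fields[1:] if f)
        elif key == "!series_matrix_table_begin":
            in_table = True
        elif key == "!series_matrix_table_end":
            in_table = False
        elif in_table:
            if header is None:
                header = fields  # ID_REF followed by sample accessions
            else:
                # Empty cells become NaN (assumption about missing values).
                values = [float(v) if v else float("nan") for v in fields[1:]]
                rows.append((key, values))
    return sorted(processing), header, rows


if __name__ == "__main__":
    # Hypothetical miniature file, not a real GEO series.
    demo = (
        '!Series_title\t"Hypothetical demo series"\n'
        '!Sample_data_processing\t"RMA, log2 scale"\t"RMA, log2 scale"\n'
        "!series_matrix_table_begin\n"
        '"ID_REF"\t"GSM_A"\t"GSM_B"\n'
        '"probe_1"\t7.25\t8.5\n'
        "!series_matrix_table_end\n"
    )
    notes, header, rows = read_series_matrix(demo)
    print(notes, header, rows)
```

The processing notes are free text written by the submitter, so they still need to be read alongside the paper.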
Try BioJupies.