7.7 years ago
richa_makhijani
I want to extract differentially expressed genes using LIMMA from RNA-seq data for three cancer types, viz. breast, lung and prostate. The data should include both tumor and normal samples. I have read some papers that used data from TCGA, but TCGA is now hosted on the Genomic Data Commons and not all of the data are open access: all BAM files are under controlled access. Also, LIMMA requires raw read counts for analysis. I am new to RNA-seq data and analysis. Can anyone tell me where I can get these data and what format they should be in? I have read that TPM- or FPKM-normalized values cannot be used as input to LIMMA.
If you want to have access, and it is controlled, you'll have to ask TCGA for access. There is no black market for illegally derived data, at least not that I am aware of.
With BAM files you can get raw counts, which is what you need for limma. You can use featureCounts for that.
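Once featureCounts has run, its tab-separated output has to be reduced to a plain gene-by-sample matrix of integers before it goes into limma. Here is a minimal sketch of that step, assuming the usual featureCounts layout (a leading comment line starting with `#`, then six annotation columns Geneid, Chr, Start, End, Strand, Length, followed by one count column per BAM file). The gene names, file names and counts below are made up for illustration, and `read_count_matrix` is a hypothetical helper, not part of any package.

```python
import csv
import io

# Made-up example in the featureCounts tabular layout (not real data).
EXAMPLE_OUTPUT = (
    "# Program:featureCounts; Command: (illustrative only)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\ttumor.bam\tnormal.bam\n"
    "GENE_A\tchr1\t100\t900\t+\t801\t250\t30\n"
    "GENE_B\tchr2\t500\t1500\t-\t1001\t0\t12\n"
)

# Assumed annotation columns: Geneid, Chr, Start, End, Strand, Length.
ANNOTATION_COLUMNS = 6


def read_count_matrix(handle):
    """Return (sample_names, {gene_id: [raw integer counts per sample]})."""
    # Drop comment lines, then parse the rest as tab-separated values.
    data_lines = (line for line in handle if not line.startswith("#"))
    rows = csv.reader(data_lines, delimiter="\t")
    header = next(rows)
    samples = header[ANNOTATION_COLUMNS:]
    counts = {}
    for row in rows:
        if not row:
            continue  # skip blank lines
        # Keep the counts as integers: these are raw counts, not TPM/FPKM.
        counts[row[0]] = [int(value) for value in row[ANNOTATION_COLUMNS:]]
    return samples, counts


samples, counts = read_count_matrix(io.StringIO(EXAMPLE_OUTPUT))
print(samples)
print(counts)
```

For a real run you would pass an open file handle instead of the `io.StringIO` example, then write the matrix out or load it in R for limma.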
Hello richa_makhijani!
It appears that your post has been cross-posted to another site: http://seqanswers.com/forums/showthread.php?t=75838
Please include cross-post links in future.
You should link the publication where they refer to this data. They may provide the RNA-seq reads. In that case you have to map them to get a BAM file, probably against the human genome version stated in the paper.
Do they provide any SRA accession number? (SRR...)
You can take a look at cBioPortal while you decide if you want to apply for access to controlled data.