How to get public cancer RNA-seq data?
9.7 years ago

Hi

I am a master's student in bioinformatics. My term project involves variant calling and structural variation analysis in lung adenocarcinoma, so I need the FASTQ files for those cases. I have access to my professor's PI account on eRA Commons (https://commons.era.nih.gov/), which permits this, and I have searched dbGaP and SRA. How do I get the FASTQ files for those cases?

RNA-seq • 5.8k views
7.6 years ago
craig ▴ 50

Searching Repositive for the term "cancer RNA-seq" (currently) returns 3,112 datasets.

You can view the results here (you don't even need an account): https://discover.repositive.io/datasets/search?query=cancer%20RNA-Seq

At a quick look, these 3,112 datasets come from 10 different databases, including SRA and dbGaP but also ArrayExpress, InsilicoDB, Methycancer, Xpressomics, Gene Expression Atlas, GEO, GigaDB and Horizon Discovery.

9.5 years ago
alolex ▴ 960

Your advisor will need to request access to the TCGA protected data via dbGaP, and list you as a collaborator. Once approved you will need to download the key file through dbGaP Authorized Access Login. Then go to CGHub and follow the instructions to download their GeneTorrent software (https://cghub.ucsc.edu). This is the ONLY way to access raw sequencing data, including FASTQ files. Once all this is done you can use the CGHub Browser to identify the cases you want and create a manifest XML file. This file can be passed to the GeneTorrent download program (gtdownload) and it will download the files you need. Hope this helps if you haven't already figured it out.


Thanks for your reply.

I already knew that, though; what I am looking for is more detailed, practical guidance.

I probably need to work through the manual and tutorials.

9.5 years ago

Look into the NCBI GEO database. It is full of cancer data.
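
As a rough illustration of searching GEO programmatically, here is a minimal sketch that builds a query for NCBI's E-utilities ESearch endpoint against the GEO DataSets database (`db=gds`). The function names and the example query string are my own assumptions, not something from this thread, and the search terms will need adjusting for a real project. Only the URL is built when the snippet runs; `search_geo` makes the actual network request and is left uncalled.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term, retmax=20):
    """Return an ESearch URL that queries GEO DataSets (db=gds) for `term`."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def search_geo(term, retmax=20):
    """Run the search and return the matching GEO DataSets UIDs (needs network)."""
    with urlopen(build_geo_search_url(term, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


# Illustrative query only (an assumption); refine the terms for your own cases.
QUERY = 'lung adenocarcinoma AND "Homo sapiens"[Organism]'
url = build_geo_search_url(QUERY, retmax=5)
print(url)

# To actually run the search (requires internet access):
# uids = search_geo(QUERY, retmax=5)
```

The returned UIDs can then be looked up with ESummary, or you can simply open the printed URL in a browser to check that the query matches what you expect.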

9.4 years ago
matt.newman ▴ 170

You may want to check out OncoLand: http://www.omicsoft.com/oncoland-service. This won't get you access to the FASTQ files, but it's a great tool for accessing Oncology data.

6.5 years ago
j.salys • 0

To get GEO data using Python, you can use the GEOparse package. For more details, check out this blog post: http://friendlypython.herokuapp.com/2018/6/analysis-geogene-expression-omnibus-data/
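
For a quick idea of what that looks like, here is a minimal sketch, assuming GEOparse is installed (`pip install GEOparse`). The helper names (`is_series_accession`, `load_series`, `summarize`) and the cache directory are hypothetical and only for illustration; the GEOparse call itself is `GEOparse.get_GEO`. The import sits inside `load_series` so the accession check can be used even without the package installed.

```python
import os
import re


def is_series_accession(accession):
    """Return True if the string looks like a GEO Series accession, e.g. GSE1563."""
    return re.fullmatch(r"GSE\d+", accession) is not None


def load_series(accession, destdir="./geo_cache"):
    """Download (or reuse a local copy of) a GEO Series and return the GSE object."""
    if not is_series_accession(accession):
        raise ValueError(f"not a GEO Series accession: {accession!r}")
    import GEOparse  # third-party package, imported lazily

    os.makedirs(destdir, exist_ok=True)  # hypothetical cache directory
    return GEOparse.get_GEO(geo=accession, destdir=destdir)


def summarize(gse):
    """Collect the series title and the number of samples (GSMs) it contains."""
    return {"title": gse.metadata.get("title"), "n_samples": len(gse.gsms)}


# Example usage (requires GEOparse and internet access):
# gse = load_series("GSE1563")
# print(summarize(gse))
```

Note that GEO series mostly carry processed expression data and sample metadata, so this will not by itself give you FASTQ files.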
