Finding FASTQ files for TCGA Data
5.1 years ago

Hey everyone. I'm building an RNA-Seq analysis pipeline, and I need some TCGA FASTQ files to work on. I see that the TCGA data are now at GDC, but the only FASTQ files they have are controlled access, and I won't be able to get access as I don't have an institution right now. I also saw that there are tools for converting BAM files to FASTQ, but I assume the result won't include reads that didn't map to the genome, and that I won't be able to see all the quality scores from the original FASTQ. I need both to test various hypotheses. Does anyone know how to get hold of some actual, original FASTQ files?

Thanks very much!

RNA-Seq tcga fastq • 3.7k views

As you already answered yourself, FASTQs are controlled access.


...so are BAMs in TCGA, ICGC, and dbGaP if the data come from human patients. The BAMs should include unmapped reads unless they were filtered out, which hopefully they were not, as these reads may be used for structural variant detection etc. Anyway, if you do not have access, there is unfortunately nothing you can do about it. The terms of use of these databases also do not permit sharing data with another user who is not part of the access application of the respective project. For building pipelines, maybe a different dataset would serve the same purpose? There are plenty of open-access datasets available in GEO and ENA.
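
To make the point about unmapped reads and quality scores concrete, here is a minimal Python sketch that works on SAM text records (for example the output of `samtools view`). It relies only on the SAM flag bits 0x4 (read unmapped) and 0x10 (sequence stored reverse-complemented). The helper names `parse_record`, `count_unmapped`, and `to_fastq` are made up for illustration, and the sketch ignores read pairing, secondary/supplementary alignments, and hard clipping, which a real converter such as `samtools fastq` has to deal with.

```python
UNMAPPED = 0x4   # SAM flag bit: the read is unmapped
REVERSE = 0x10   # SAM flag bit: SEQ/QUAL are stored for the reverse strand

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def parse_record(line):
    """Pull the fields needed here out of one tab-separated SAM record."""
    fields = line.rstrip("\n").split("\t")
    return {
        "name": fields[0],
        "flag": int(fields[1]),
        "seq": fields[9],
        "qual": fields[10],
    }


def count_unmapped(lines):
    """Count records with the 'unmapped' flag set, skipping '@' header lines."""
    return sum(
        bool(parse_record(line)["flag"] & UNMAPPED)
        for line in lines
        if not line.startswith("@")
    )


def to_fastq(record):
    """Rebuild a FASTQ entry, undoing the reverse-complement if flag 0x10 is set."""
    seq, qual = record["seq"], record["qual"]
    if record["flag"] & REVERSE:
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]
    return f"@{record['name']}\n{seq}\n+\n{qual}\n"
```

If `count_unmapped` returns zero on a BAM that was aligned from raw reads, the unmapped reads were probably filtered out, and a FASTQ rebuilt from it would be incomplete. The per-base quality string is carried in the record either way, unless it was stripped or recalibrated upstream.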

5.1 years ago

If you really need the FASTQ files in their original form, why not find another cancer dataset where they are available? Recount2 is a nice place to search for such datasets.
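
Once you have picked an open-access study, the original FASTQ files can usually be located through ENA, as mentioned in the comments above. The sketch below builds a query for the ENA Portal API file report and parses its tab-separated response into FASTQ links per run. The accession `PRJEB0000` is a placeholder, not a real study, and the function names are made up for illustration; please check the field names against the current ENA documentation before relying on them.

```python
import csv
import io
from urllib.parse import urlencode

ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build a file-report query URL for a study or run accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    }
    return f"{ENA_FILEREPORT}?{urlencode(params)}"


def parse_fastq_links(tsv_text):
    """Map each run accession to its list of FASTQ links.

    Assumes multiple files for one run (e.g. paired-end) are
    separated by ';' in the fastq_ftp column.
    """
    links = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        urls = [u for u in (row.get("fastq_ftp") or "").split(";") if u]
        links[row["run_accession"]] = urls
    return links


# Example use (needs network access; placeholder accession):
# from urllib.request import urlopen
# with urlopen(filereport_url("PRJEB0000")) as resp:
#     print(parse_fastq_links(resp.read().decode()))
```

Runs from a paired-end library will typically list two files, one per mate.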
