I've been trying to analyze a tumor/margin tissue pair but got very few meaningful somatic mutations. I'm not sure I picked the correct pipeline/parameters.
I wonder if there are any publications on similar samples where the authors provide the raw data, so that I can download it and run it through my pipeline. I could then compare my results to the published results.
Thank you. Field
To understand what "similar" is, you'd need to indicate what you're working on. Regardless, you will need to do the unpleasant work of searching PubMed for studies that have done WES in your tumor entity, and then download the data. Most of the time the data are available for download. If these are human data, they might be under controlled access at dbGaP or similar databases, so you'd need to apply for access. If you are in doubt about the pipeline, use something established, e.g. the GATK workflow for WES.
https://github.com/gatk-workflows/gatk4-exome-analysis-pipeline
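Once you have a published call set, the comparison itself can be fairly simple. Below is a minimal sketch, assuming both call sets are available as plain (uncompressed) VCF text and that matching on chromosome, position, REF and ALT is good enough. It does not normalize indels, so treat the numbers as a rough check only. The function names `load_variants` and `compare_call_sets` are hypothetical, and the records are toy data, not real variants.

```python
def load_variants(vcf_text):
    """Return a set of (chrom, pos, ref, alt) keys from VCF-formatted text."""
    variants = set()
    for line in vcf_text.splitlines():
        # Skip blank lines and header lines (both "##" meta and "#CHROM").
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Multi-allelic records list several ALT alleles separated by commas.
        for alt in alts.split(","):
            variants.add((chrom, pos, ref, alt))
    return variants


def compare_call_sets(mine, published):
    """Count shared and unique variants between two sets of variant keys."""
    shared = mine & published
    return {
        "shared": len(shared),
        "only_mine": len(mine - published),
        "only_published": len(published - mine),
        # Fraction of published variants that were also found in my calls.
        "recall": len(shared) / len(published) if published else 0.0,
    }


# Toy data (made up for illustration only).
MY_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "chr17\t100\t.\tG\tA\n"
    "chr17\t200\t.\tC\tT,G\n"
)
PUBLISHED_VCF = (
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "chr17\t100\t.\tG\tA\n"
    "chr13\t300\t.\tA\tC\n"
)

if __name__ == "__main__":
    summary = compare_call_sets(load_variants(MY_VCF), load_variants(PUBLISHED_VCF))
    print(summary)
```

A low recall against a published call set from the same raw data would point at the pipeline or its parameters, while a high recall with few calls overall would suggest the sample itself carries few detectable somatic mutations.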
Hi ATpoint,
Thank you so much for the reply. I've been working on a pair of breast cancer tissues.
Unfortunately I can't put a whole lot of time into literature searching. I did a limited search but found no studies providing raw data. Maybe the raw data are too large to download. Also, I emailed to ask how I can gain access to the NCI Genomic Data Commons and was told to follow these steps: https://gdc.cancer.gov/access-data/obtaining-access-controlled-data
I work at a startup company in Hangzhou, China. We currently don't have the manpower to prepare the materials for an NIH data access application and do the annual renewal. That's why I've been looking for a shortcut by asking around on the forum.
Thank you so much.
Field
Sorry, I forgot to thank you for recommending the GATK pipeline. As a beginner in bioinformatics, I basically struggle to choose between the few available programs for each step of data analysis. Papers do compare them but rarely give an absolute pick. I will definitely go over the GATK pipeline and see what I get.