Hi guys, I have CNV data from TCGA and a markers file, as shown below.
Seg file:
Sample Chromosome Start End Num_Probes Segment_Mean
TCGA.05.4249.01A 1 3218610 120527361 67456 -0.1725
TCGA.05.4249.01A 1 149881398 167526508 10663 0.5859
TCGA.05.4249.01A 1 167526675 167526823 2 -1.1518
TCGA.05.4249.01A 1 167526972 247813706 50571 0.5816
TCGA.05.4249.01A 2 484222 242476062 130861 0.0585
Markers file:
Probe.Name Chromosome Start
CN_473963 1 61735
CN_473964 1 61808
CN_473965 1 61823
CN_477984 1 62152
CN_473981 1 62920
CN_473982 1 62937
My goal is to run a GISTIC analysis with the GISTIC 2.0 module in GenePattern, but it always fails with this error: "GISTIC version 2.0.23 GISTIC 2.0 input error detected: 76606 segment start or end positions in '/opt/gpcloud/gp_home/users/genye/uploads/tmp/run8835511072592266907.tmp/seg.file/1/biguoshu1.txt' do not match any markers in '/opt/gpcloud/gp_home/users/genye/uploads/tmp/run6032895478607754030.tmp/markers.file/2/markersMatrix.txt'. First bad position is 10:24732567 at line 33."
I uploaded my files in .txt format and chose GISTIC version 6.15.28, with Human_hg19.mat as the refgene file. All other parameters were left at their defaults. Could anyone please tell me what the problem is and how to solve it? Thank you!
Hi, did you fix the problem? I have the same problem.
Please show the exact error message, a sample of your input data, and all commands that you have tried. Thank you.
Thank you for your reply!
input file:
marker file:
All other parameters were left at their defaults. Thank you!
You could try without the markers file, which is now possible with later versions of GISTIC. Also, just double-check that the formatting of your files is correct.
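One way to double-check the two files before uploading is to compare them locally. The sketch below is only an illustration, assuming both files are whitespace-delimited with a single header line and the column order shown in the question; the function names `load_markers` and `unmatched_positions` and the toy data are made up for this example and are not part of GISTIC or GenePattern. It lists every segment start or end position that has no marker at the same chromosome and position, which is the condition the error message describes.

```python
def load_markers(lines):
    """Return a set of (chromosome, position) pairs from markers-file lines.

    Assumes columns: Probe.Name, Chromosome, Start, with one header line.
    """
    markers = set()
    for i, line in enumerate(lines):
        fields = line.split()
        if i == 0 or len(fields) < 3:  # skip the header and blank/short lines
            continue
        markers.add((fields[1], int(fields[2])))
    return markers


def unmatched_positions(seg_lines, markers):
    """Return (line_number, chromosome, position) for each segment boundary
    that is absent from the marker set.

    Assumes columns: Sample, Chromosome, Start, End, Num_Probes, Segment_Mean.
    Chromosome labels are compared as plain strings, so "chr1" and "1"
    would count as different chromosomes.
    """
    bad = []
    for line_no, line in enumerate(seg_lines, start=1):
        fields = line.split()
        if line_no == 1 or len(fields) < 6:  # skip the header and short lines
            continue
        chrom, start, end = fields[1], int(fields[2]), int(fields[3])
        for pos in (start, end):
            if (chrom, pos) not in markers:
                bad.append((line_no, chrom, pos))
    return bad


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    marker_lines = [
        "Probe.Name Chromosome Start",
        "P1 1 100",
        "P2 1 200",
        "P3 1 300",
    ]
    seg_lines = [
        "Sample Chromosome Start End Num_Probes Segment_Mean",
        "S1 1 100 200 2 0.1",
        "S1 1 200 350 2 -0.3",
    ]
    for line_no, chrom, pos in unmatched_positions(seg_lines, load_markers(marker_lines)):
        print(f"line {line_no}: {chrom}:{pos} has no matching marker")
```

For real files, the lists of lines can be replaced with `open(path)` handles. If nearly every boundary is reported as unmatched, that would suggest the markers file comes from a different platform or genome build than the segments, rather than a formatting slip.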
Thank you for your help, Kevin! But another problem arose: I got many amplified/deleted regions. The plots are very noisy, with amplification/deletion occurring in almost every gene. Could you please tell me what the problem is and how to solve it? Thank you!
Hi, you are not giving me much information with which to begin to help. Please share, in detail, the data that you obtained and the code that you used to process it.
Sorry! The plot always looks like this: https://ibb.co/swrq1qD
My masked CNV segment data came from TCGA, and I performed the GISTIC analysis with the GISTIC 2.0 module in GenePattern. All other parameters were left at their defaults. The plots are very noisy, with amplification/deletion occurring in almost every gene.
You took data from the GDC? That data is segmented copy number data produced by DNAcopy, I believe. You then used that as input to GISTIC?
Could you take a look here to see how this matches up to what you have done? - A: How to extract the list of genes from TCGA CNV data
Follow up on this part of the error:
As it alludes to chromosome 10, perhaps one of your files is not sorted numerically and is instead sorted lexicographically.
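To illustrate the difference, here is a minimal sketch of a numeric chromosome sort, assuming the chromosome column holds labels such as "1", "10", "X" (optionally prefixed with "chr"). The helpers `chrom_key` and `sort_segments` are hypothetical names for this example; whether GISTIC actually requires a particular row order is not confirmed here.

```python
def chrom_key(chrom):
    """Sort key that puts numeric chromosomes in numeric order,
    followed by non-numeric ones (X, Y, ...) in alphabetical order."""
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    if name.isdigit():
        return (0, int(name), "")
    return (1, 0, name)


def sort_segments(rows):
    """Sort seg-file rows by sample, then chromosome (numerically), then start.

    Each row is a list of strings: [sample, chromosome, start, end, ...].
    """
    return sorted(rows, key=lambda r: (r[0], chrom_key(r[1]), int(r[2])))


if __name__ == "__main__":
    chroms = ["10", "2", "1", "X"]
    print(sorted(chroms))                 # plain string sort: "10" lands before "2"
    print(sorted(chroms, key=chrom_key))  # numeric sort: "2" lands before "10"
```

With a plain string sort, chromosome 10 is placed directly after chromosome 1, which is why a first bad position on chromosome 10 as early as line 33 hints at lexicographic ordering.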
Thank you for your help, Kevin! But after I sorted my files numerically, it still showed a similar result (... do not match any markers ...). Is it possible that the markers file I submitted doesn't fit the TCGA data, or that the online version of GISTIC2 doesn't work at all?
Perhaps you can contact the GISTIC team directly:
I think that I read somewhere, by the way, that the IDs have to be like this:
So, less the final part. Can you try?