Dear all,
I want to find the mutation types, copy number alterations, and methylation profile for each individual patient.
I am using TCGA data, and only the results of the mutation analysis give me the mutation type for each individual TCGA ID.
My question is whether there is any methodology that gives patient-by-patient results for copy number alteration or methylation analysis, or is this not applicable to this kind of data?
I look forward to your comments.
Kind regards
Nazanin
Hi Kevin! Yes, you're right. I have already performed the copy number alteration analysis. However, my supervisor has asked whether I can get the copy number alteration and methylation status for each individual patient, similar to the mutation analysis, where the results are available per patient. I explained to him that CNA and methylation analyses are based on comparing average values between cancer and normal samples (similar to DESeq2 results for RNA-Seq and miRNA-Seq data), so we cannot get results for each individual patient. Am I right? I appreciate your helpful comments, Kevin! Regards, Nazanin
Actually, it is the mutation data that you [probably] have that is derived by comparing each tumour to its matched normal sample. Can you confirm how you obtained the mutation data?
The copy number and methylation data are not like this, and, instead, represent each sample on its own, whether it be a normal or tumour sample.
You can infer the sample type from the barcode; see the answer "Meaning letters in TCGA sample barcode field".
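As a rough illustration of reading the sample type from a barcode, the sketch below assumes the standard TCGA barcode layout, in which the fourth hyphen-separated field begins with a two-digit sample type code (01 to 09 for tumour, 10 to 19 for normal, 20 to 29 for control). The function names are hypothetical and not part of any package mentioned in this thread.

```python
def tcga_sample_type(barcode: str) -> str:
    """Classify a TCGA barcode as tumour, normal, or control.

    Assumes the usual layout TCGA-TSS-Participant-SampleVial-..., where the
    first two characters of the fourth field are the sample type code.
    """
    fields = barcode.split("-")
    if len(fields) < 4 or len(fields[3]) < 2 or not fields[3][:2].isdigit():
        raise ValueError(f"barcode has no sample type field: {barcode!r}")
    code = int(fields[3][:2])
    if 1 <= code <= 9:
        return "tumour"
    if 10 <= code <= 19:
        return "normal"
    if 20 <= code <= 29:
        return "control"
    return "unknown"


def tcga_patient_id(barcode: str) -> str:
    """Return the patient-level part of the barcode (first three fields)."""
    return "-".join(barcode.split("-")[:3])
```

Grouping sample-level rows by `tcga_patient_id` is one way to line up copy number or methylation values with a per-patient mutation table.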
I used the TCGAbiolinks package to perform the mutation analysis, and the resulting table has a column with the TCGA code for each patient. However, it seems that in the CNA and methylation analyses the algorithms compare average values between two populations (cancer vs. normal), and therefore the results are not available for each individual TCGA sample.
I hope this clarifies what I mean.
Well, that is specific to TCGAbiolinks, in that case. The data that you need is available through the links that I posted in my original answer.
Thanks Kevin for your help.
Hi Kevin! I have around 100 patients without any mutations, and I want to see the pattern of their copy number alterations and methylation. Would it be correct to use the segment_mean and beta value columns for those patients, or do I need to do some preprocessing first? Nazanin
Hey, the segment mean is explained here in the GDC documentation:
Also, see:
So, a value of 0 implies copy number = 2 (diploid normal copy number).
There should be segment mean values for both tumour and normal samples in your datasets.
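To make the relationship concrete, here is a minimal sketch that assumes the segment mean is defined as log2(copy number / 2), which is consistent with a value of 0 corresponding to copy number 2. The function name is hypothetical.

```python
def segment_mean_to_copy_number(segment_mean: float) -> float:
    """Convert a segment mean to an estimated copy number.

    Assumes segment_mean = log2(copy_number / 2), so a segment mean of 0
    maps back to the diploid copy number of 2.
    """
    return 2.0 * 2.0 ** segment_mean


# Example values: 0 is diploid, +1 doubles the copy number, -1 halves it.
for value in (0.0, 1.0, -1.0):
    print(value, segment_mean_to_copy_number(value))
```

In real tumour samples the estimate is rarely an exact integer, because tumour purity and subclonal populations pull the segment mean toward 0.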
Hi Kevin!
If I understand correctly, to compare segment means in each individual sample, I need pairs of tumor and normal samples.
However, as you know, the number of normal samples is very small compared to the number of tumor samples, so I do not have tumor-normal pairs.
It seems, then, that I cannot check the copy number alterations in each individual patient. Am I correct?
Nazanin
Yes, for some of the TCGA cancers, the number of matched normal samples is low (or there are zero normal samples). You could create a 'panel of normals' from the available normal samples. It could comprise the mean of the segment means across all normal samples. This would be possible using GenomicRanges.
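The idea of averaging segment means across normal samples can be sketched in plain Python. Note that this is not the GenomicRanges approach: as a simplifying assumption, it swaps interval overlap for fixed-size genomic bins, so that every sample is summarised on the same coordinates before averaging. The function names, the input tuple layout (chrom, start, end, segment_mean) with 1-based inclusive coordinates, and the default bin size are all hypothetical choices for illustration.

```python
from collections import defaultdict


def binned_profile(segments, bin_size):
    """Length-weighted mean segment_mean per (chrom, bin index) for one sample."""
    weighted = defaultdict(float)
    covered = defaultdict(int)
    for chrom, start, end, seg_mean in segments:
        first_bin = (start - 1) // bin_size
        last_bin = (end - 1) // bin_size
        for b in range(first_bin, last_bin + 1):
            bin_start = b * bin_size + 1
            bin_end = (b + 1) * bin_size
            # Number of bases of this segment that fall inside bin b.
            overlap = min(end, bin_end) - max(start, bin_start) + 1
            weighted[(chrom, b)] += seg_mean * overlap
            covered[(chrom, b)] += overlap
    return {key: weighted[key] / covered[key] for key in weighted}


def panel_of_normals(normal_samples, bin_size=1_000_000, min_samples=1):
    """Average the binned profiles of all normal samples, bin by bin."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for segments in normal_samples:
        for key, value in binned_profile(segments, bin_size).items():
            sums[key] += value
            counts[key] += 1
    # Keep only bins that are covered by at least min_samples normals.
    return {k: sums[k] / counts[k] for k in sums if counts[k] >= min_samples}


def difference_from_panel(tumour_segments, panel, bin_size=1_000_000):
    """Tumour binned segment mean minus the panel value, for shared bins."""
    profile = binned_profile(tumour_segments, bin_size)
    return {k: v - panel[k] for k, v in profile.items() if k in panel}


# Toy data with a tiny bin size so the arithmetic is easy to follow.
normal_a = [("chr1", 1, 100, 0.0), ("chr1", 101, 200, 0.2)]
normal_b = [("chr1", 1, 200, 0.1)]
panel = panel_of_normals([normal_a, normal_b], bin_size=100)
tumour = [("chr1", 1, 200, 0.6)]
print(difference_from_panel(tumour, panel, bin_size=100))
```

The sketch deliberately leaves out any gain or loss threshold; how large a difference counts as a real alteration, and how small the bins should be, are analysis decisions that the code does not settle.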
Hi Kevin, do you mean that I can generate a representative normal sample using GenomicRanges and then compare it with each individual tumor sample of interest?
Yes, that is what I mean.
Thanks, Kevin. Just for clarification, should I use findOverlaps()?
Oh yes, that would be the one. I think that I have put examples online about using GenomicRanges, but there are other examples on the Bioconductor forum, too.
You will then have to find a way to determine the mean of the segment_mean across normal samples. Also, you have to decide by how many base pairs the CN regions should overlap for the purpose of merging them across all normal samples. Try to draw it out on a piece of paper, maybe, and then decide on an analysis strategy. For example:
Thank you so much, Kevin! Our team is indebted to you for all your help.
Hi Kevin!
I made a representative normal sample based on the overlaps of regions among several normal samples.
Now my question is: how can I compare the segment_mean values of each of my tumor samples with the segment_mean of the constructed normal sample?
Should I compare the segment_mean values manually?
If yes, what threshold should I apply to detect true gains and losses in each single comparison?
Kind regards
Nazanin