7.4 years ago
MAPK
I am trying to understand the impact of CNVs on allele frequency. Suppose I have two cohorts:
A. 100 germline samples and 10 unmatched cancer samples.
B. 100 cancer samples and 10 unmatched germline samples.
Assuming CNVs occurred in some of these cancer samples, I want to understand their impact on specific minor alleles (those that are very rare or very common) in both cohorts. My guess is that the frequency of very rare minor alleles (say MAF < 1%) will increase in cohort A, whereas in cohort B (given the same cancer subtype) the frequency of common minor alleles (say MAF > 10%) will increase. Can someone please correct me if this is not the case? Thanks for clarifying.
For me, it is very hard to understand the question.
1) Are you assuming you also have SNP data and the allele frequency you are talking about is the SNP MAF? Or something else?
2) Are you assuming that your CNVs are in cancer tissue?
3) Are you assuming your CNVs are preferentially copy losses?
Some more details would help.
Hi Fabio,
I am asking because I want to extract the SNPs that are not (or are least) affected by CNVs in the cohort (i.e. from the multigenome VCF file).
Not sure if I understood everything, but:
If you have a copy gain at a SNP in cancer, then this will probably lead to an increase of MAF in cohort B. If you have a copy loss in cancer, you will likely see a decrease of MAF in cohort B. I don't understand the link with the allele being rare or not.
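To make the copy gain / copy loss argument concrete, here is a minimal sketch that counts allele copies directly. It assumes the event affects the minor (alternate) allele in heterozygous samples; a gain or loss of the reference copy would shift the frequency the other way. The cohort of 10 samples and the function name `alt_allele_fraction` are made up for illustration, and a diploid genotype caller may not report copy numbers this way.

```python
def alt_allele_fraction(samples):
    """Fraction of alternate-allele copies across all samples.

    samples: list of (ref_copies, alt_copies) tuples, one per sample.
    """
    alt = sum(a for _, a in samples)
    total = sum(r + a for r, a in samples)
    return alt / total if total else 0.0


# Hypothetical cohort: 8 homozygous-reference and 2 heterozygous samples.
baseline = [(2, 0)] * 8 + [(1, 1)] * 2
# Copy gain of the alternate allele in the two heterozygous samples.
gain = [(2, 0)] * 8 + [(1, 2)] * 2
# Copy loss of the alternate allele in the two heterozygous samples.
loss = [(2, 0)] * 8 + [(1, 0)] * 2

baseline_af = alt_allele_fraction(baseline)  # 2 alt copies out of 20
gain_af = alt_allele_fraction(gain)          # 4 alt copies out of 22
loss_af = alt_allele_fraction(loss)          # 0 alt copies out of 18

print(baseline_af, gain_af, loss_af)
```

In this toy setup the direction of the shift depends only on which allele is gained or lost, not on whether the allele was rare or common to begin with.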
Do you have the CNV data?
No, I don't have CNV data. I think the best I can do to remove SNPs from the VCF file that potentially come from CNVs is to exclude indels and any non-biallelic SNPs. What do you think?
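The filter described above (drop indels and non-biallelic sites) could be sketched in plain Python as below. This only assumes the standard tab-separated VCF columns, with REF in column 4 and ALT in column 5; the function names and the example records are hypothetical. Note that it implements just that filter and does not detect CNVs itself.

```python
BASES = {"A", "C", "G", "T"}


def is_biallelic_snp(vcf_line):
    """True if the record has a single-base REF and exactly one single-base ALT."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    # Multi-allelic sites have a comma in ALT (e.g. "G,T"), indels have
    # REF or ALT longer than one base, so both fail the checks below.
    return ref in BASES and alt in BASES


def filter_biallelic_snps(lines):
    """Yield header lines unchanged and only biallelic SNP records."""
    for line in lines:
        if line.startswith("#") or is_biallelic_snp(line):
            yield line


# Hypothetical example records (not real data).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t.\tPASS\t.",     # biallelic SNP: kept
    "1\t200\t.\tA\tG,T\t.\tPASS\t.",   # multi-allelic: dropped
    "1\t300\t.\tAT\tA\t.\tPASS\t.",    # deletion: dropped
]
kept = list(filter_biallelic_snps(example))
print(len(kept))
```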