Question

Copy number variation and its impact on allele frequency

0

7.2 years ago

MAPK ★ 2.1k

I am trying to understand the impact of CNVs on allele frequency. Suppose I have two cohorts:

A. 100 germline samples and 10 unmatched cancer samples.
B. 100 cancer samples and 10 unmatched germline samples.

Assuming CNVs occurred in these cancer samples, I want to understand their impact on specific minor alleles (those that are very rare and those that are very common) in both cohorts. My expectation is that the frequency of rare minor alleles (say MAF < 1%) will increase in cohort A, whereas in cohort B (given the same cancer subtype) the frequency of common minor alleles (say MAF > 10%) will increase. Can someone please correct me if this is not the case? Thanks for clarifying.

CNV maf alle • 2.0k views

7.2 years ago by MAPK ★ 2.1k

1

For me, it is very hard to understand the question.

1) Are you assuming you also have SNP data and the allele frequency you are talking about is the SNP MAF? Or something else?

2) Are you assuming that your CNVs are in cancer tissue?

3) Are you assuming your CNVs are preferentially copy losses?

Some more details would help.

7.2 years ago by Fabio Marroni ★ 3.0k

0

Hi Fabio,

Yes, the MAF I am referring to is from SNP data (a multi-genome VCF file).
Yes, the CNVs are assumed to be in cancer.
I am not sure why you specifically mentioned copy losses; I am referring to copy number variation potentially altering the allele frequency, which I think should happen with gains only. If a region is lost, there would be no SNPs called for that region in the VCF file (given that no indels are included in the VCF file), would there?

The reason I am asking is that I want to extract the SNPs that are not (or are least) affected by CNVs in the cohort (i.e. from the multi-genome VCF file).

7.2 years ago by MAPK ★ 2.1k

0

Not sure if I understood everything, but:

If you have a copy gain at a SNP in cancer, this will probably increase the MAF in cohort B. If you have a copy loss in cancer, the MAF in cohort B will likely decrease. I don't understand the link with the allele being rare or not.
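
To make the gain/loss reasoning concrete, here is a minimal arithmetic sketch of how the expected fraction of reads carrying the alternate allele shifts at a heterozygous SNP when copies are gained or lost. It assumes a pure tumour sample and read counts proportional to copy number; the function name `expected_alt_fraction` is made up for this illustration. Note that the direction of the shift depends on which allele is gained or lost.

```python
from fractions import Fraction


def expected_alt_fraction(alt_copies: int, ref_copies: int) -> Fraction:
    """Expected fraction of reads carrying the alt allele.

    Assumption: pure sample, reads proportional to allele copy number.
    """
    total = alt_copies + ref_copies
    if total == 0:
        raise ValueError("region is fully deleted; no reads expected")
    return Fraction(alt_copies, total)


# (alt copies, ref copies) for a site that is heterozygous in the germline
scenarios = {
    "diploid heterozygote": (1, 1),
    "gain of the alt copy": (2, 1),
    "gain of the ref copy": (1, 2),
    "loss of the ref copy": (1, 0),
    "loss of the alt copy": (0, 1),
}

for label, (alt, ref) in scenarios.items():
    frac = expected_alt_fraction(alt, ref)
    print(f"{label}: alt read fraction = {float(frac):.2f}")
```

A genotype caller that assumes diploidy may call the 2:1 and 1:2 cases as heterozygous or homozygous depending on depth and thresholds, which is how a CNV can change the genotype-based allele frequency in a cohort VCF.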

Do you have the CNV data?

ADD REPLY • link 7.2 years ago by Fabio Marroni ★ 3.0k

0

No, I don't have CNV data. I think the best I can do to remove SNPs from the VCF file that potentially come from CNVs is to exclude indels and any non-biallelic SNPs. What do you think?
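
For reference, the filter described here (keep only biallelic SNPs, drop indels and multiallelic sites) could be sketched in plain Python as below. The helper names `is_biallelic_snp` and `filter_vcf` and the tiny in-memory VCF are made up for illustration. The sketch only implements the filter as stated; whether it actually removes CNV-affected sites is a separate question.

```python
import io

BASES = {"A", "C", "G", "T"}


def is_biallelic_snp(vcf_line: str) -> bool:
    """True if REF and ALT are each exactly one nucleotide (no commas, no indels)."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]  # VCF columns 4 and 5
    return ref.upper() in BASES and alt.upper() in BASES


def filter_vcf(lines):
    """Yield header lines unchanged and only those records that are biallelic SNPs."""
    for line in lines:
        if line.startswith("#") or is_biallelic_snp(line):
            yield line


# Hypothetical four-record VCF, used only to exercise the filter
example = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t.\tPASS\t.",    # biallelic SNP: kept
    "1\t200\t.\tA\tG,T\t.\tPASS\t.",  # multiallelic site: dropped
    "1\t300\t.\tAT\tA\t.\tPASS\t.",   # deletion: dropped
    "1\t400\t.\tC\tCTT\t.\tPASS\t.",  # insertion: dropped
]) + "\n"

kept = list(filter_vcf(io.StringIO(example)))
print("".join(kept), end="")
```

If bcftools is available, `bcftools view -m2 -M2 -v snps` applies the same kind of restriction directly to a VCF file.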

7.2 years ago by MAPK ★ 2.1k