How to calculate MAF for a variant in cases and controls
4.4 years ago
MAPK ★ 2.1k

I have a number of individuals in projects A to L carrying one particular variant (rsXXX). My cohorts contain only homozygous-reference and heterozygous genotypes; there are no homozygous-alternate individuals for this variant. I would like to calculate the MAF in cases and in controls, both per cohort and overall, but I am not very familiar with the calculation methods. Could someone please help me calculate this?

    cohort <- structure(list(
      Project = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"),
      Homo_Ref_Total_Individuals = c(836L, 1666L, 209L, 16L, 929L, 841L, 252L, 1493L, 568L, 44L, 190L, 2L),
      Homo_Ref_CASES = c(527, 993, 0, 0, 471, 226, 201, 1036, 0, 0, 0, 0),
      Homo_Ref_CONTROLS = c(191, 671, 209, 0, 295, 615, 17, 326, 161, 0, 94, 0),
      Hetero_Total_Individuals = c(5, 10, 2, 0, 12, 8, 6, 23, 1, 0, 1, 0),
      Hetero_CASES = c(2, 6, 0, 0, 5, 1, 4, 21, 0, 0, 0, 0),
      Hetero_CONTROLS = c(3, 4, 2, 0, 5, 7, 0, 2, 1, 0, 0, 0)
    ), class = "data.frame", row.names = c(NA, -12L))
MAF genetics variant • 1.5k views

You know you could print(cohort) and paste that as a table instead of using dput - easier to eyeball that way. It would look like this:

>print(cohort)

   Project Hom_Ref_Total_Individuals Hom_Ref_CASES Hom_Ref_CONTROLS Het_Total_Individuals Het_CASES Het_CONTROLS
1        A                       836           527              191                     5         2            3
2        B                      1666           993              671                    10         6            4
3        C                       209             0              209                     2         0            2
4        D                        16             0                0                     0         0            0
5        E                       929           471              295                    12         5            5
6        F                       841           226              615                     8         1            7
7        G                       252           201               17                     6         4            0
8        H                      1493          1036              326                    23        21            2
9        I                       568             0              161                     1         0            1
10       J                        44             0                0                     0         0            0
11       K                       190             0               94                     1         0            0
12       L                         2             0                0                     0         0            0

I think that you should first resolve why Hom_Ref_CASES + Hom_Ref_CONTROLS does not equal Hom_Ref_Total_Individuals. For example, how do you explain the data for D, which has 16 homozygous-reference individuals in total but zero cases and zero controls? Perhaps 'Hom_Ref_Total_Individuals' is an incorrect label.

Otherwise, once you resolve the discrepancy, the minor allele frequency (MAF) is exactly what the term says: the frequency of your less frequent [minor] allele, i.e., the number of copies of that allele divided by the total number of alleles genotyped. We usually quote this frequency separately for cases and controls (i.e., X% in cases; Y% in controls).
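As a rough illustration of that definition, here is a minimal R sketch that pools the counts from the table above across all projects. It assumes a diploid autosomal variant, so each individual contributes two alleles and each heterozygote carries one copy of the alternate allele. The helper `allele_freq` is a hypothetical name introduced here, not something from the original post. Strictly, it returns the alternate allele frequency, which is the MAF only while it stays below 0.5.

```r
# Genotype counts copied from the question's table (projects A to L).
hom_ref_cases    <- c(527, 993, 0, 0, 471, 226, 201, 1036, 0, 0, 0, 0)
hom_ref_controls <- c(191, 671, 209, 0, 295, 615, 17, 326, 161, 0, 94, 0)
het_cases        <- c(2, 6, 0, 0, 5, 1, 4, 21, 0, 0, 0, 0)
het_controls     <- c(3, 4, 2, 0, 5, 7, 0, 2, 1, 0, 0, 0)

# Hypothetical helper (assumption: diploid, autosomal variant).
# Alternate alleles: one per heterozygote, two per homozygous-alternate.
# Total alleles: two per genotyped individual.
allele_freq <- function(n_hom_ref, n_het, n_hom_alt = 0) {
  (n_het + 2 * n_hom_alt) / (2 * (n_hom_ref + n_het + n_hom_alt))
}

# Pool the counts over all projects, separately for cases and controls.
maf_cases    <- allele_freq(sum(hom_ref_cases), sum(het_cases))
maf_controls <- allele_freq(sum(hom_ref_controls), sum(het_controls))

cat(sprintf("MAF in cases:    %.4f%%\n", 100 * maf_cases))
cat(sprintf("MAF in controls: %.4f%%\n", 100 * maf_controls))
```

Pooling like this ignores differences between projects, so it is only a first look at the frequencies and not a substitute for a proper association test.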

Ultimately, your data, as presented, makes no sense.


Hi Kevin, the totals differ because Hom_Ref_Total = Hom_Ref_CASES (coded 2) + Hom_Ref_CONTROLS (coded 1) + unknown (coded -9). The same applies to Het_Total. So now the question is: do I have to sum the cases (Hom_Ref_CASES + Het_CASES) and make that my cohort, or should I just use (Hom_Ref_Total + Het_Total) as my cohort? I am really confused about how to calculate the MAF with this data.
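To make the two options concrete, the following R sketch computes a per-project frequency three ways: from cases only, from controls only, and from the totals (which also include the individuals with unknown status). It uses the column names from the `dput` output in the question and again assumes a diploid autosomal variant with no homozygous-alternate genotypes. `alt_freq` is a hypothetical helper name. Which denominator is appropriate depends on the analysis, so this is an illustration and not a recommendation.

```r
# Data frame rebuilt from the question so the sketch is self-contained.
cohort <- data.frame(
  Project = LETTERS[1:12],
  Homo_Ref_Total_Individuals = c(836, 1666, 209, 16, 929, 841, 252, 1493, 568, 44, 190, 2),
  Homo_Ref_CASES = c(527, 993, 0, 0, 471, 226, 201, 1036, 0, 0, 0, 0),
  Homo_Ref_CONTROLS = c(191, 671, 209, 0, 295, 615, 17, 326, 161, 0, 94, 0),
  Hetero_Total_Individuals = c(5, 10, 2, 0, 12, 8, 6, 23, 1, 0, 1, 0),
  Hetero_CASES = c(2, 6, 0, 0, 5, 1, 4, 21, 0, 0, 0, 0),
  Hetero_CONTROLS = c(3, 4, 2, 0, 5, 7, 0, 2, 1, 0, 0, 0)
)

# Hypothetical helper (assumption: no homozygous-alternate genotypes).
# Each heterozygote carries one alternate allele; each individual has two alleles.
# Returns NA where a group has no genotyped individuals.
alt_freq <- function(n_hom_ref, n_het) {
  n_alleles <- 2 * (n_hom_ref + n_het)
  ifelse(n_alleles > 0, n_het / n_alleles, NA_real_)
}

# Option 1: restrict to individuals with known case/control status.
cohort$MAF_cases    <- alt_freq(cohort$Homo_Ref_CASES, cohort$Hetero_CASES)
cohort$MAF_controls <- alt_freq(cohort$Homo_Ref_CONTROLS, cohort$Hetero_CONTROLS)

# Option 2: use the totals, which also count individuals of unknown status.
cohort$MAF_all <- alt_freq(cohort$Homo_Ref_Total_Individuals,
                           cohort$Hetero_Total_Individuals)

print(cohort[, c("Project", "MAF_cases", "MAF_controls", "MAF_all")])
```

Projects with no cases or no controls (for example D, J and L) come out as NA in the corresponding column instead of a misleading zero.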
