Question

What does BAF mean?

1

7.5 years ago

Brian Bushnell 20k

I recently encountered the term "BAF" in a post on Biostars. I've looked up the definition, which is "B-Allele Frequency" (the extended definition is no more enlightening; one allele is A and the other is B). To me, that has no meaning other than being "Allele Frequency" with perhaps implicit assumptions about ploidy that are not generalizable, or maybe assumptions about the major/minor allele counts (from sequencing depth), or perhaps population frequencies, or maybe the reference allele. Can anyone explain where BAF is a useful term, or when one should use "BAF" instead of "AF"?

BAF variant-calling • 32k views

ADD COMMENT • link updated 5.0 years ago by Fernado Perez-Villatoro ▴ 50 • written 7.5 years ago by Brian Bushnell 20k

1

It looks like there is contention about what B-Allele frequency means. If you know the origin of the term, please post here!

ADD REPLY • link 7.5 years ago by Brian Bushnell 20k

3

5.0 years ago

Fernado Perez-Villatoro ▴ 50

I think this is a good explanation: https://www.ogt.com/resources/literature/768_cytosure_interpret_software_tips_and_tricks

BAF is principally used in CNV arrays.

"The B-Allele Frequency is a normalized measure of the allelic intensity ratio of two alleles (A and B), such that a BAF of 1 or 0 indicates the complete absence of one of the two alleles (e.g. AA or BB), and a BAF of 0.5 indicates the equal presence of both alleles (e.g. AB)."

It is a useful measure when you are studying CNVs (and also LOH or SVs):

" detection of allelic imbalances such as those caused by duplications (e.g. AAB/BBA) or mosaic deletions in the sample. Such imbalances can be identified on a BAF plot by the presence of SNPs at frequencies between 0.5 and 0 or 1. For example, the theoretical BAF values of triploid regions (AAA, AAB, ABB or BBB) are 0, 0.33, 0.66 and 1 respectively."
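
The theoretical values in the quoted passage follow from simple counting: the expected BAF of a genotype is the number of B copies divided by the total copy number. Here is a minimal sketch of that arithmetic, assuming a pure (non-mosaic) sample with no noise; the function name `theoretical_bafs` is hypothetical and not taken from any array software.

```python
from fractions import Fraction


def theoretical_bafs(copy_number):
    """Map each genotype string (e.g. 'AAB') to its expected BAF.

    Assumes a pure sample: expected BAF = (number of B copies) / (copy number).
    """
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    bafs = {}
    for n_b in range(copy_number + 1):
        genotype = "A" * (copy_number - n_b) + "B" * n_b
        bafs[genotype] = Fraction(n_b, copy_number)
    return bafs


# Diploid and triploid regions, as in the quoted text.
for cn in (2, 3):
    for genotype, baf in theoretical_bafs(cn).items():
        print(f"CN={cn} {genotype}: {float(baf):.2f}")
```

For copy number 3 this gives 0, 1/3, 2/3 and 1; the 0.66 in the quote is 2/3 truncated rather than rounded.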

score 5 · Accepted Answer · 2017-05-26

5

7.5 years ago

Devon Ryan 104k

I've mostly seen BAF used in the context of SNP arrays, when there are two probes (an A probe and a B probe) covering a specific position. I think the A probe is generally the reference sequence (or was at the time the array was designed), so BAF and AF end up being the same, with the exception that SNP arrays are obviously limited in what they can detect.
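
To make the two-probe idea concrete, here is a deliberately simplified sketch that treats BAF as the B probe's share of the total signal. This is an assumption for illustration only: array software applies further normalization before reporting BAF, which is not reproduced here, and the name `naive_baf` is hypothetical.

```python
def naive_baf(a_intensity, b_intensity):
    """Fraction of total probe signal coming from the B probe.

    Simplification: real SNP-array pipelines normalize intensities
    before computing BAF; this only shows the basic ratio.
    Returns None when there is no signal at all.
    """
    if a_intensity < 0 or b_intensity < 0:
        raise ValueError("intensities must be non-negative")
    total = a_intensity + b_intensity
    if total == 0:
        return None
    return b_intensity / total


print(naive_baf(1000.0, 0.0))    # AA-like signal
print(naive_baf(500.0, 500.0))   # AB-like signal
print(naive_baf(0.0, 800.0))     # BB-like signal
```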

2

Actually, the definition of which allele is A and which is B is a bit more complex than simply calling the reference allele A. The A/B designation is sequence dependent and is not related to the population allele frequency (which it would be if the reference allele were always allele A). This has the benefit of keeping the B-allele frequencies balanced between 0 and 1 and making global BAF plots almost symmetric around 0.5, which helps with the analysis.

ADD REPLY • link 7.5 years ago by bernatgel ★ 3.4k

0

To clarify - are you saying that the term "B-Allele Frequency" was developed to describe this naming system developed by Illumina to describe one of their products, and was not used prior to Illumina's TOP/BOT nomenclature?

ADD REPLY • link 7.5 years ago by Brian Bushnell 20k

0

I don't know if this is the case. I can say that B-allele frequency (BAF) is different from variant allele frequency (VAF), and that I've always seen it refer to the portion of signal coming from one of the alleles in SNP-array data. I do not know if the terminology was coined for that, but I've never seen it refer to anything else. Anyone else with additional info?

ADD REPLY • link 7.5 years ago by bernatgel ★ 3.4k

1

My bet is on it being an invention of Mendel, Fisher, Morgan, Hardy, Weinberg, Wright or one of the many others who studied allelomorphs, linkage, and genetic inheritance between 1880 and 1930; they used A and B for alleles. The fruit fly was studied a lot between 1910 and 1930, and then came the development of X-rays and electrophoresis. The Hardy-Weinberg paper stipulates N(A) > N(B) > N(C). Comparisons of B to A were used under different conditions (e.g. between different breeding cows). The terms "B allele frequency" and "frequency of the B allele" get hits in Google Scholar from the 1960s onwards, so it seems unlikely that the term was coined recently.

ADD REPLY • link 4.3 years ago by Oliver Slay ▴ 60

0

Ah, that makes sense. Thanks!

ADD REPLY • link 7.5 years ago by Brian Bushnell 20k

3

http://cnvkit.readthedocs.io/en/stable/baf.html

I just noticed the description of BAF in CNVkit, which may be helpful for understanding.

The relevant passage reads:

In this context, the “B” allele is the non-reference allele observed in a germline heterozygous SNP, i.e. in the normal/control sample. Since the tumor cells’ DNA originally derived from normal cells’ DNA, most of these SNPs will also be present in the tumor sample. But due to allele-specific copy number alterations, loss of heterozygosity or allelic imbalance, the allelic frequency of these SNPs may be different in the tumor, and that’s evidence that one (or both) of the germline copies was gained or lost during tumor evolution. The shift in b-allele frequency is calculated relative to the expected heterozygous frequency 0.5, and minor allele frequencies are “mirrored” above and below 0.5 so that it does not matter which allele is considered the reference – the relative shift from 0.5 will be the same either way. (Multiple alternate alleles are not considered here.)
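
The "mirroring" described in that passage can be sketched in a few lines. This is only an illustration of the idea (distance from the expected heterozygous value 0.5), not CNVkit's actual code; the function names `baf_shift` and `mirrored_baf` are hypothetical.

```python
def baf_shift(baf):
    """Absolute shift of a B-allele frequency from the heterozygous value 0.5."""
    if not 0.0 <= baf <= 1.0:
        raise ValueError("baf must be between 0 and 1")
    return abs(baf - 0.5)


def mirrored_baf(baf):
    """Fold a BAF onto the range [0.5, 1] so allele labeling does not matter."""
    return 0.5 + baf_shift(baf)


# A SNP observed at 0.3 and one observed at 0.7 show the same imbalance:
# swapping which allele is called "B" turns one value into the other.
for observed in (0.3, 0.7, 0.5):
    print(observed, round(baf_shift(observed), 3), round(mirrored_baf(observed), 3))
```

Because only the distance from 0.5 is kept, the result is the same whichever allele is treated as the reference, which is the property the passage points out.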

ADD REPLY • link 7.0 years ago by cc ▴ 30