Heterozygosity during variant calling in a non-model organism such as Bubalus bubalis
6.8 years ago
prasundutta87 ▴ 670

Hi,

I am not from a population genetics background, but I am running a feasibility test on a few organisms toward a larger aim of determining allele-specific expression (ASE). I am performing variant calling on DNA-seq samples from 4 water buffaloes (Bubalus bubalis) to determine heterozygous sites in each sample using GATK 4.0.0.0. I also have RNA-seq data from the same 4 animals, which will allow me to take my ASE project forward.

How important is the heterozygosity (hets) value? I have read https://gatkforums.broadinstitute.org/gatk/discussion/8603/heterozygosity. Is the default human heterozygosity value of 0.001 appropriate for my non-model organism?

I also came across this website: https://gatkforums.broadinstitute.org/gatk/discussion/8603/heterozygosity

How can I determine heterozygosity for my species? Where should I look, and what should I look for, to find this value for my non-model organism?

A simple explanation would be appreciated, as it would help me understand how much weight this parameter carries.

next-gen sequencing DNA-seq SNP • 1.5k views

I do not know anything specific about Bubalus bubalis, but the first question is how you are performing the variant calls. I assume there is a reference genome? Is there more than one? It is also worth checking whether previous studies that generated Bubalus bubalis genomes reported any heterozygosity statistics. You may also want to look at heterozygosity in other members of the family Bovidae (e.g. cattle). I am not an expert, but I would assume heterozygosity is probably fairly similar among species within the same family.

I would guess there is a much larger body of evidence generated for cows as opposed to water buffalo.
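
As a rough starting point, per-animal heterozygosity can be estimated from a first-pass call set as the number of heterozygous SNP sites divided by the number of callable sites. The sketch below is only an illustration, not a GATK tool: the function name `het_rate_per_sample`, the toy VCF records, the sample names, and the `callable_sites` count are all assumptions made up for the example. In practice the callable-site count would have to come from your own coverage data, and the result is an approximation to compare against published values, not a definitive setting.

```python
def het_rate_per_sample(vcf_lines, callable_sites):
    """Return {sample: heterozygous biallelic SNP calls / callable_sites}.

    `callable_sites` is an assumed, user-supplied count of positions with
    enough coverage to call a genotype; it is not derived from the VCF.
    """
    samples = []
    het_counts = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start at column 10
            het_counts = {s: 0 for s in samples}
            continue
        ref, alt = fields[3], fields[4]
        # Keep only biallelic SNPs: single-base REF and single-base ALT.
        if len(ref) != 1 or len(alt) != 1 or alt in {".", "*"}:
            continue
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue
        gt_index = format_keys.index("GT")
        for sample, entry in zip(samples, fields[9:]):
            genotype = entry.split(":")[gt_index].replace("|", "/")
            alleles = genotype.split("/")
            # Count diploid, fully called genotypes with two different alleles.
            if len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]:
                het_counts[sample] += 1
    return {s: n / callable_sites for s, n in het_counts.items()}


# Toy records (hypothetical data, tab-separated as in a real VCF).
header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "buffalo1", "buffalo2"]
records = [
    ["chr1", "100", ".", "A", "G", "50", "PASS", ".", "GT:DP", "0/1:20", "1/1:18"],
    ["chr1", "250", ".", "C", "T", "50", "PASS", ".", "GT:DP", "0|1:22", "0/1:15"],
    ["chr1", "400", ".", "G", "GA", "50", "PASS", ".", "GT:DP", "0/1:30", "0/1:25"],
    ["chr1", "900", ".", "T", "C", "50", "PASS", ".", "GT:DP", "./.:0", "0/0:12"],
]
TOY_VCF = ["##fileformat=VCFv4.2", "\t".join(header)]
TOY_VCF += ["\t".join(r) for r in records]

rates = het_rate_per_sample(TOY_VCF, callable_sites=1000)
for sample, rate in rates.items():
    print(sample, rate)
```

The indel record and the missing genotype are deliberately skipped, so only SNP genotypes contribute. On real data, the per-animal rates could then be compared with the human default of 0.001 and with values reported for cattle.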


Hi Dylan, there is only one reference genome, and I am using it for variant calling. There is a recent paper: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0185220

Table 1 has some data on heterozygosity. Is that what I should look at? If so, how should I interpret it?

I can check cattle for such a value.


I am not entirely sure how to interpret that particular paper's SNP frequencies, but you may be able to compare it with this paper on human heterozygosity to get a good benchmark: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3586588/

There may also be no consensus in your field, in which case the best option may be to run your pipeline with a range of heterozygosity values and compare the results.
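
A minimal sketch of such a sweep, assuming GATK4's HaplotypeCaller is run once per sample and that its `--heterozygosity` argument is the parameter in question. The file names, the sample name, and the three values tried are hypothetical placeholders. The script only builds and prints the commands, it does not run them.

```python
import shlex


def build_commands(reference, bam, sample, het_values):
    """Build one HaplotypeCaller command per heterozygosity value."""
    commands = []
    for het in het_values:
        het_text = f"{het:g}"  # e.g. 0.001 -> "0.001"
        output = f"{sample}.het{het_text}.vcf.gz"  # hypothetical naming scheme
        commands.append([
            "gatk", "HaplotypeCaller",
            "-R", reference,
            "-I", bam,
            "-O", output,
            "--heterozygosity", het_text,
        ])
    return commands


# Hypothetical inputs: half, equal to, and double the human default of 0.001.
commands = build_commands(
    reference="buffalo_reference.fasta",
    bam="buffalo1.bam",
    sample="buffalo1",
    het_values=[0.0005, 0.001, 0.002],
)
for cmd in commands:
    print(shlex.join(cmd))
```

Comparing the heterozygous sites called at each setting (for example, how many sites change genotype between runs) would show how sensitive the calls are to this prior at your sequencing depth.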
