Plotting Allele Frequencies

1

5.3 years ago

kristina.mahan ▴ 170

I have UV-mutagenized microorganisms and screened them for improved phenotypes. Now I want to identify the causative variants. I've sequenced a cultivar (a mixed population with an improved phenotype) at 200x coverage and have VCFs with thousands of variants. I wanted to do GWAS and make some Manhattan plots (treating each sequencing read like an individual), but it doesn't seem like that will work here. Maybe I should just plot allele frequencies? Is that calculated as AF = AD (allele depth) / DP (read depth) from the VCFs? Or what is the best way to find the dominant variants in this mixed population? Any suggestions on how to move forward? Thanks!

variant-analysis • 2.2k views

updated 13 months ago by Ram 44k • written 5.3 years ago by kristina.mahan ▴ 170

1

If you convert the .vcf to .tsv with e.g. GATK VariantsToTable, you can add the AF field if it's missing, then simply sort by value in Excel and you will have the top variants.
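If you would rather skip the spreadsheet step, here is a minimal Python sketch of the same idea: compute AF = AD / DP per ALT allele and sort by it. It assumes the FORMAT column carries `AD` (comma-separated, reference depth first) and `DP`, and that only the first sample column matters. The function names, the sample name `pool1`, and the records are made up for illustration. Depending on the caller, the AD values may not sum to DP, so dividing by the sum of AD instead is a reasonable alternative.

```python
# Sketch only: AF = AD(alt) / DP per ALT allele, taken from the FORMAT fields
# of one sample. Assumes AD and DP are both present in the FORMAT column.

def allele_fractions(vcf_line, sample_index=0):
    """Return [(alt_allele, alt_depth / DP), ...] for one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    alts = fields[4].split(",")
    keys = fields[8].split(":")
    values = fields[9 + sample_index].split(":")
    sample = dict(zip(keys, values))
    ad = sample.get("AD", ".")
    dp = sample.get("DP", ".")
    # Skip records with missing depths or zero coverage.
    if "." in ad.split(",") or dp in (".", "0"):
        return []
    depths = [int(x) for x in ad.split(",")]  # depths[0] is the REF depth
    return [(alt, depth / int(dp)) for alt, depth in zip(alts, depths[1:])]


def rank_variants(vcf_lines):
    """Return (chrom, pos, alt, af) tuples sorted by af, highest first."""
    rows = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        chrom, pos = line.split("\t")[:2]
        for alt, af in allele_fractions(line):
            rows.append((chrom, int(pos), alt, af))
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Hypothetical records, not real data.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tpool1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD:DP\t0/1:150,50:200",
    "chr1\t250\t.\tC\tT\t50\tPASS\t.\tGT:AD:DP\t0/1:20,180:200",
    "chr2\t75\t.\tG\tA,T\t50\tPASS\t.\tGT:AD:DP\t1/2:100,60,40:200",
]

ranked = rank_variants(EXAMPLE_VCF)
for chrom, pos, alt, af in ranked:
    print(f"{chrom}\t{pos}\t{alt}\t{af:.3f}")
```

To run it on a real file, pass the open file handle instead of the list, e.g. `rank_variants(open("calls.vcf"))`, where `calls.vcf` is a placeholder name for an uncompressed VCF.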

5.3 years ago by steve ★ 3.5k

0

Usually when I plot allele frequency I do a histogram or box plot, but that will only show you the distribution of the allele frequencies, not the individual variants.
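As a rough illustration of the histogram idea, the sketch below bins a list of allele frequencies into ten equal-width bins and prints a text bar per bin. The values in `EXAMPLE_AFS` and the function name `af_histogram` are hypothetical. With real data you would feed in the AF column from your table, and a plotting library would give a nicer picture.

```python
# Sketch only: equal-width histogram of allele frequencies in [0, 1].

def af_histogram(afs, n_bins=10):
    """Count allele frequencies per bin; AF == 1.0 goes in the last bin."""
    counts = [0] * n_bins
    for af in afs:
        if not 0.0 <= af <= 1.0:
            raise ValueError(f"allele frequency out of range: {af}")
        index = min(int(af * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical allele frequencies, not real data.
EXAMPLE_AFS = [0.05, 0.12, 0.15, 0.48, 0.52, 0.95, 1.0]

counts = af_histogram(EXAMPLE_AFS)
for i, count in enumerate(counts):
    low = i / len(counts)
    high = (i + 1) / len(counts)
    print(f"{low:.1f}-{high:.1f} | {'#' * count}")
```

As noted above, this only shows the shape of the distribution. To see which variants sit in the high-frequency bins you still need the sorted per-variant table.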

5.3 years ago by steve ★ 3.5k
