Question

Allelic association tests for polyploid data

9.4 years ago

Russ ▴ 520

I would like to conduct an allelic association test using my Illumina-generated NGS data. Briefly, I conducted targeted resequencing of a 1 Mb region in 100 horses. DNA from 4-5 horses was pooled before indexing, giving 24 indexed pools of 4-5 horses each. Variant calling, etc., was done using an assumed ploidy of 8 or 10, as appropriate for the pool size.

My VCF files therefore look as though they came from polyploid organisms, which Plink v1.9 and vcftools do not support. However, the data are only formatted as polyploid; in reality the allele frequencies come from diploid organisms, so I figure there should be some way to manipulate the data to reflect that.

Ultimately, I'd like to determine the allele frequencies for each SNP for both my cases and controls, and get it into a format that plink would be happy with, but I'm not sure how. Is there a tool, or an efficient pipeline that anyone could suggest? My searches have not turned up much...
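
To make the goal concrete, here is a rough Python sketch of the counting I have in mind. It is only an illustration under several assumptions that may not hold for my data: the VCF is tab-delimited with one sample column per pool, sites are biallelic, each pool is made up entirely of cases or entirely of controls, and the function name `pooled_allele_counts` and the `pool_status` mapping are made up for this example.

```python
def pooled_allele_counts(vcf_lines, pool_status):
    """Sum ALT and total allele counts per SNP for cases and controls.

    vcf_lines   : iterable of VCF text lines (header and records)
    pool_status : dict mapping pool (sample) name -> "case" or "control"
                  (hypothetical; assumes no pool mixes cases and controls)
    Returns {(chrom, pos): {"case": [alt, total], "control": [alt, total]}}.
    """
    samples = []
    results = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # pool names start at column 10
            continue
        chrom, pos, alt = fields[0], fields[1], fields[4]
        if "," in alt:
            continue  # assumption: only biallelic sites are counted
        gt_index = fields[8].split(":").index("GT")
        counts = {"case": [0, 0], "control": [0, 0]}
        for name, entry in zip(samples, fields[9:]):
            status = pool_status.get(name)
            if status is None:
                continue  # pool without a known status is ignored
            genotype = entry.split(":")[gt_index].replace("|", "/")
            # A ploidy-8 call looks like 0/0/0/1/1/0/0/0; "." marks missing
            alleles = [a for a in genotype.split("/") if a != "."]
            counts[status][0] += sum(a == "1" for a in alleles)
            counts[status][1] += len(alleles)
        results[(chrom, int(pos))] = counts
    return results


header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tpoolA\tpoolB"
record = ("chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t"
          "0/0/0/1/1/0/0/0:80\t0/1/1/1/0/0/0/0/0/0:95")
example = pooled_allele_counts(
    [header, record], {"poolA": "case", "poolB": "control"}
)
for site, groups in example.items():
    for status, (alt_count, total) in groups.items():
        print(site, status, alt_count, total, alt_count / total)
```

Per-SNP ALT and total counts for cases and controls like these would form the 2x2 table an allelic test needs, but I don't know whether the called genotypes of a pool are reliable enough to be counted this way, or how to get such counts into plink.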

snp horse polyploid next-gen plink • 2.5k views

updated 23 months ago by Ram 44k • written 9.4 years ago by Russ ▴ 520

Why wouldn't you de-multiplex to get 100 different files, one for each horse, and map them as normal diploids?

9.4 years ago by vivekbhr ▴ 700

Either my original post wasn't clear, or I'm missing something. DNA from 4-5 horses was pooled, and then all of the DNA in that pool was indexed with a single barcode, for a total of 24 indexed groups, each containing several animals' DNA. I have no way of separating out an individual horse's reads (that I know of, at least!).

updated 23 months ago by Ram 44k • written 9.4 years ago by Russ ▴ 520