6.9 years ago
blacktomato27
Hi all, good afternoon.
I have SNP genotyping data for 200 plants, and I want to select a corset of the best representative lines (say 50) from these 200 lines based on the SNP genotyping data. Previously I did this in the PowerMarker software, but it is not working anymore. Is there any other software that can do this corset analysis, or are there any R packages available for it? Any help in this regard is highly appreciated. Thanks in advance.
How is it "not working" anymore?
Also, can you elaborate on 'corset'? The best hit for 'corset' in Google is the sexy attire (clothes).
Dear Kevin, good afternoon.
Thank you very much for your reply. The PowerMarker version I had has expired, and the software has not been updated in the last 10 years; I found this information at https://brcwebportal.cos.ncsu.edu/powermarker/. By "core set" I mean the following: suppose I want to sequence, say, 100 genotypes (genotypes with some high-PIC SNPs), but because of cost I need to reduce this number to, say, 40 genotypes (these genotypes are called the core set). I now want to select these 40 samples out of the 100 samples based on the SNP genotyping data. Basically, the 40 core-set genotypes should be the best representatives of the 100 genotypes. I hope I have explained it well. Thanks a lot for your help.
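To make the core-set idea concrete, here is a minimal sketch in plain Python. It is not what PowerMarker does; it is a simple greedy "farthest-first" selection that I am assuming as a stand-in. The function names (`manhattan`, `select_core_set`), the 0/1/2 genotype coding, and the choice of Manhattan distance are all assumptions for illustration.

```python
def manhattan(a, b):
    """Total allele-count difference between two samples (0/1/2 coding assumed)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_core_set(genotypes, k):
    """Greedy farthest-first choice of k representative samples.

    genotypes: list of per-sample genotype vectors, one value per SNP.
    Returns the indices of the chosen samples in selection order.
    """
    n = len(genotypes)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of samples")

    # Full pairwise distance matrix; fine for a few hundred samples.
    dist = [[manhattan(genotypes[i], genotypes[j]) for j in range(n)]
            for i in range(n)]

    # Start from the most central sample (smallest total distance to all others).
    totals = [sum(row) for row in dist]
    first = totals.index(min(totals))
    chosen = [first]

    # min_dist[i] = distance from sample i to its nearest chosen sample.
    min_dist = list(dist[first])

    while len(chosen) < k:
        candidates = [i for i in range(n) if i not in chosen]
        # Pick the sample that is currently least well represented.
        nxt = max(candidates, key=lambda i: min_dist[i])
        chosen.append(nxt)
        min_dist = [min(m, d) for m, d in zip(min_dist, dist[nxt])]

    return chosen


# Toy data: 5 samples x 2 SNPs (hypothetical values).
toy = [[0, 0], [0, 0], [2, 2], [2, 2], [1, 1]]
print(select_core_set(toy, 3))  # prints [4, 0, 2]
```

In the toy data the two duplicated genotypes are each represented once, which is the behaviour one would want from a core set. For real data, `select_core_set(genotypes, 40)` on 100 samples would be the analogous call.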
Hi, I believe that I am doing a similar thing here: Produce PCA bi-plot for 1000 Genomes Phase III in VCF format. In that tutorial, I reduce the millions of 1000 Genomes Phase III variants to a collection of just ~11,000 that can adequately represent the original data. I believe that the key part is 'Prune variants from each chromosome', where I essentially find collections of SNPs that are in linkage disequilibrium and reduce each collection to a single SNP that can summarise it, based on the variance inflation factor (VIF) and MAF (minor allele frequency).
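As a rough illustration of the pruning idea, here is a minimal sketch in plain Python. Note that it swaps in a pairwise r² threshold instead of the VIF used in the tutorial, and it is not PLINK's implementation; the function names, the `window` and `r2_max` parameters, and the 0/1/2 genotype coding are all assumptions.

```python
def maf(snp):
    """Minor allele frequency of one SNP coded as 0/1/2 alt-allele counts."""
    p = sum(snp) / (2 * len(snp))
    return min(p, 1 - p)


def r_squared(a, b):
    """Squared Pearson correlation between two SNP genotype vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated as unlinked
    return cov * cov / (var_a * var_b)


def ld_prune(snps, window=50, r2_max=0.5):
    """Greedy pruning of SNPs in high pairwise LD.

    snps: list of per-SNP genotype vectors, in chromosome order.
    window: only SNPs within this many positions of each other are compared.
    Returns the sorted indices of the SNPs that are kept.
    """
    # Visit SNPs from highest to lowest MAF so the more informative SNP
    # of a correlated pair is the one that survives.
    order = sorted(range(len(snps)), key=lambda i: maf(snps[i]), reverse=True)
    kept = []
    for i in order:
        if all(abs(i - j) > window or r_squared(snps[i], snps[j]) <= r2_max
               for j in kept):
            kept.append(i)
    return sorted(kept)


# Toy data: 3 SNPs x 6 samples; SNP 1 duplicates SNP 0 (hypothetical values).
toy_snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 1, 1, 0, 2],
]
print(ld_prune(toy_snps))  # prints [0, 2]
```

The duplicated SNP is dropped because it is perfectly correlated with one already kept, while the uncorrelated third SNP survives.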
I believe that this is generally referred to as haplotype-tagging SNPs. In fact, a quick search reveals a previous Biostars thread where this method (that I used) is specifically mentioned for this purpose: Tagging SNPs in PLINK. For reasons that I won't go into, the answer by chrchang523 should be highly regarded.
Let me know if that makes sense.