9.7 years ago
kumbarov
I would like to bulk process the 1000 Genomes Project (1000GP) Y-SNP data for some projects of mine. I've downloaded the ALL.chrY.phase3_integrated.20130502.genotypes.vcf file, and I would like to either extract the data for only some samples or remove some samples. What is the easiest way to do this? Even better, is there a readily available script for importing this sort of data into a database? I am new to this, so any advice on working with this sort of data is welcome.
Note, curl not necessary here:
Oh, and I forgot to paste the hyphen '-' after the command (it makes the command read from stdin).
bcftools -v
I am using the version that comes with Ubuntu 14.04. I've also downloaded and compiled the htslib and samtools source code, but I don't get a bcftools binary. The error I get is: [main] Unrecognized command.
The version of bcftools that comes with Ubuntu 14.04 is completely broken. I get segfaults all the time. I downloaded the sources for htslib, samtools and bcftools from GitHub and compiled them. The above command works perfectly with the upstream version of bcftools.
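If getting a working bcftools build is a hassle, keeping or dropping samples can also be sketched in plain Python, since a VCF is just tab-separated text with the sample names in the `#CHROM` header line. This is only a minimal sketch, not a replacement for bcftools: the function name `subset_vcf_samples`, the sample IDs, and the output file name in the usage comment are all made up for illustration, and it does not recompute INFO values such as allele counts after samples are removed.

```python
def subset_vcf_samples(lines, samples, exclude=False):
    """Yield VCF lines restricted to the given samples.

    With exclude=False only the listed samples are kept;
    with exclude=True the listed samples are dropped instead.
    (Hypothetical helper, not part of any library.)
    """
    wanted = set(samples)
    keep = None  # column indices to keep; set when the #CHROM header is seen
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines are passed through unchanged.
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # The first 9 columns (CHROM..FORMAT) are fixed; samples follow.
            keep = list(range(9)) + [
                i for i, name in enumerate(fields[9:], start=9)
                if (name in wanted) != exclude
            ]
        if keep is None:
            raise ValueError("data line found before the #CHROM header")
        yield "\t".join(fields[i] for i in keep)


# Example usage (sample IDs and output name are placeholders):
# with open("ALL.chrY.phase3_integrated.20130502.genotypes.vcf") as src, \
#         open("subset.vcf", "w") as dst:
#     for out_line in subset_vcf_samples(src, ["SAMPLE_A", "SAMPLE_B"]):
#         dst.write(out_line + "\n")
```

Because it streams line by line, this should cope with large files, but for anything beyond simple column selection the upstream bcftools is the safer choice.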