I have some datasets from which I'd like to extract the complete sequence data for some specific SNPs. Each directory contains the following files:
.dose.vcf.gz
.phased.vcf.gz
My questions are:
- Which of the files above are relevant for the extraction?
- From which of these file types can I extract the particular SNPs I need?
Well, it would help a lot if you explained from where you obtained your data. Also, why do you have RNA- and ChIP-seq as tags?
Sorry, the RNA-seq and ChIP-seq tags were not meant to be included; I'll remove them. The data are output from the programs SHAPEIT and Minimac3. Hope that helps a bit :)
Helps a bit, yes! Have you read the manuals to see to what these files relate?
You can immediately view the contents of these files by using BCFtools (bcftools view). If you learn more about BCFtools, then you can also find a way to extract your SNPs of interest.
Hi Kevin, thank you for the answer. I already have bcftools installed, but I really do not know where exactly to look in the data I posted to find all the sequence data for specific SNPs. If you have any idea, I would be very grateful. Thanks
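To make the bcftools suggestion concrete, here is a minimal sketch of how the extraction could look. It rests on assumptions that should be checked against the manuals: that the .dose.vcf.gz file is the Minimac3 imputation output (so it holds the imputed genotypes and dosages), and that the .phased.vcf.gz file is the SHAPEIT-phased input converted to VCF (so it would only hold the SNPs that were genotyped before imputation). The function names and the file names in the comments are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: keep only the records whose ID column matches a list.
#   $1 = input VCF, e.g. the imputed .dose.vcf.gz (assumption, see above)
#   $2 = text file with one SNP ID per line
#   $3 = output file (bgzip-compressed VCF)
extract_snps() {
    # -i 'ID=@file' selects records whose ID is listed in the file;
    # -Oz writes compressed VCF, -o names the output file.
    bcftools view -i "ID=@$2" -Oz -o "$3" "$1"
}

# Hypothetical helper: print one tab-separated line per SNP, followed by
# each sample's genotype (GT) and dosage (DS).
# Assumes the FORMAT column carries both GT and DS; bcftools reports an
# error if a requested tag is not present in the file.
snp_table() {
    bcftools query -f '%CHROM\t%POS\t%ID\t%REF\t%ALT[\t%GT:%DS]\n' "$1"
}

# Example usage (file names are placeholders):
#   extract_snps chr1.dose.vcf.gz snps.txt my_snps.vcf.gz
#   snp_table my_snps.vcf.gz
```

The IDs in the list have to match the ID column of the VCF exactly. Depending on the reference panel, imputed files may label variants by position rather than by rs number, so it is worth listing a few IDs first (for example with bcftools query -f '%ID\n' on the file, piped to head) before building the list.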