6.6 years ago
paraskevopou
Hi all! I have a large VCF file and I want to create a subset of it based on the #CHROM field, using a txt file that lists the #CHROM IDs of interest. I would like to keep the header lines and the VCF format. Any ideas on how to do that? Thanks a lot! :)
Thanks a lot for the comment. Actually, my VCF file contains SNPs called from transcriptomes, so the #CHROM field contains a bunch of different "genes", around 26,000 of them. From these I want to extract around 5,000 according to #CHROM. This is why I asked whether it can be done by providing a list, as a txt file, of the desired #CHROM names. This is what the prefixes in my #CHROM field look like. Moreover, the numbers in the headers are not constant but random.
You should be able to use a regions file with bcftools (see gringer's answer in the last link above).
Thanks a lot. The bcftools filter command with the -R <file.txt> option worked perfectly.
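For anyone who lands here without bcftools at hand, the same idea can be sketched in plain Python: keep every header line and keep only the records whose #CHROM value appears in the ID list. This is only a rough illustration, assuming an uncompressed, tab-delimited VCF and a txt file with one #CHROM ID per line; the function names and file names below are made up for the example and are not part of any tool.

```python
def read_id_list(lines):
    """Collect the non-empty, whitespace-stripped IDs from an iterable of lines."""
    return {line.strip() for line in lines if line.strip()}


def subset_vcf_by_chrom(vcf_lines, wanted):
    """Yield all header lines plus the records whose #CHROM is in `wanted`.

    Assumes plain-text VCF lines with tab-separated columns, where the
    first column is #CHROM. Matching is exact, not by prefix.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            # Meta-information lines (##...) and the #CHROM column header.
            yield line
            continue
        chrom = line.split("\t", 1)[0]
        if chrom in wanted:
            yield line


def subset_vcf_file(vcf_path, ids_path, out_path):
    """Hypothetical convenience wrapper that works on file paths."""
    with open(ids_path) as ids_handle:
        wanted = read_id_list(ids_handle)
    with open(vcf_path) as vcf_in, open(out_path, "w") as vcf_out:
        vcf_out.writelines(subset_vcf_by_chrom(vcf_in, wanted))
```

It could be called as subset_vcf_file("all.vcf", "ids.txt", "subset.vcf"), where all three file names are placeholders. Holding the IDs in a set keeps each lookup cheap, which matters with thousands of IDs, and the file is streamed line by line, so a large VCF does not need to fit in memory.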