3.0 years ago
michael.flower.14
I've read quite a few posts that suggest I can subset my VCF file for a region of interest using Tabix.
- Extract Sub-Set Of Regions From Vcf File
- Retrieve Subset Positions Vcf File
- Use tabix with a list of regions
However, when I try the commands below, the output file is empty:
VCF="/Volumes/Seagate Expansion Drive/temp/130iPSC_061118.snp.vcf.gz"
tabix -p vcf "$VCF"
tabix "$VCF" 15:31196055-31235311 > "$DIR"/vcf/sliced.vcf
tabix "$VCF" -R "$DIR"/source/regions.bed > "$DIR"/vcf/tabix.vcf
I've subsequently managed to get it working using vcftools, as follows:
vcftools --gzvcf "$VCF" --chr chr15 --from-bp 31196055 --to-bp 31235311 --recode --recode-INFO-all --out "$DIR"/vcf/sliced
But I'd still like to know how to do this with Tabix.
You are also selecting a region 0 bases long with 1:17375-17375. You may also need to use chr1 instead of 1.
chr15 worked for me, thanks!
The chromosome identifiers in your examples disagree: 15:31196055-31235311 in the example that doesn't work vs chr15 in the example that does work. Tabix selects regions just as in your example, as long as the chromosome name in the region matches the name used in the VCF.
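One way to avoid this kind of mismatch is to check the contig names stored in the index before querying; `tabix -l` prints them one per line. The sketch below is only an illustration: `fix_region` is a hypothetical helper (not part of tabix) that reads such a list on stdin and rewrites a region to use whichever of `15` or `chr15` is actually present. It assumes the only difference is the `chr` prefix.

```shell
# Hypothetical helper: rewrite a region so that its contig name matches one of
# the names read from stdin (one per line, as printed by `tabix -l`).
fix_region() {
    region=$1
    contig=${region%%:*}        # part before the first colon, e.g. "15"
    rest=${region#"$contig"}    # ":start-end", or empty if no coordinates given
    names=$(cat)                # contig names from stdin
    # Try the name as given, then with and without a "chr" prefix.
    for candidate in "$contig" "chr$contig" "${contig#chr}"; do
        if printf '%s\n' "$names" | grep -qxF -- "$candidate"; then
            printf '%s%s\n' "$candidate" "$rest"
            return 0
        fi
    done
    echo "no contig matching '$contig' in index" >&2
    return 1
}

# Usage, assuming $VCF is bgzipped and indexed as in the question:
#   region=$(tabix -l "$VCF" | fix_region 15:31196055-31235311)
#   tabix -h "$VCF" "$region" > sliced.vcf
```

The `-h` option in the usage lines makes tabix include the VCF header in the output, which most downstream tools expect. Options are placed before the file name there, since some platforms stop parsing options at the first non-option argument.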