Hello,
running a bioinformatics application with the --help parameter, or without any options at all, is always a good first step.
$ tabix
Version: 1.9
Usage: tabix [OPTIONS] [FILE] [REGION [...]]
Indexing Options:
-0, --zero-based coordinates are zero-based
-b, --begin INT column number for region start [4]
-c, --comment CHAR skip comment lines starting with CHAR [null]
-C, --csi generate CSI index for VCF (default is TBI)
-e, --end INT column number for region end (if no end, set INT to -b) [5]
-f, --force overwrite existing index without asking
-m, --min-shift INT set minimal interval size for CSI indices to 2^INT [14]
-p, --preset STR gff, bed, sam, vcf
-s, --sequence INT column number for sequence names (suppressed by -p) [1]
-S, --skip-lines INT skip first INT lines [0]
Querying and other options:
-h, --print-header print also the header lines
-H, --only-header print only the header lines
-l, --list-chroms list chromosome names
-r, --reheader FILE replace the header with the content of FILE
-R, --regions FILE restrict to regions listed in the file
-T, --targets FILE similar to -R but streams rather than index-jumps
REGION [...] in the usage line tells you that you can pass several regions at once. You can also supply a file that lists the regions of interest with -R.
So getting the variants on chr1 to chr3 is straightforward:
$ tabix -h input.vcf.gz chr1 chr2 chr3 > output.vcf
To get the remaining chromosomes, we can create a file that lists every chromosome except chr1 to chr3:
$ tabix -l input.vcf.gz | grep -vw "chr[123]" > regions.txt
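To see what the grep filter keeps, here is a small sketch that runs it on a made-up chromosome list instead of real `tabix -l` output (the names in `chroms` are an assumption for illustration only). The -w flag is what stops chr10 or chr11 from being thrown out along with chr1:

```shell
# Hypothetical chromosome list standing in for the output of `tabix -l input.vcf.gz`.
chroms=$(printf '%s\n' chr1 chr2 chr3 chr4 chr10 chr11 chrX)

# -v inverts the match, -w only accepts whole-word matches,
# so chr10 and chr11 survive even though they start with "chr1".
rest=$(printf '%s\n' "$chroms" | grep -vw "chr[123]")
printf '%s\n' "$rest"
```

With this list the filter prints chr4, chr10, chr11 and chrX, one per line.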
Then use this file with tabix:
$ tabix -h -R regions.txt input.vcf.gz > output2.vcf
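Since tabix accepts several REGION arguments, the filtered names can also be passed straight on the command line instead of through a file. As far as I know the tabix manual describes the -R file as BED or tab-delimited with position columns, so this variant may be the safer one if a names-only file gives you trouble. A minimal sketch, assuming a POSIX shell and the same input.vcf.gz; the helper name rest_of_chroms and the guard around the call are only there for illustration:

```shell
# Hypothetical helper: drop chr1, chr2 and chr3 from a list of names on stdin.
rest_of_chroms() {
    grep -vw "chr[123]"
}

# Only run tabix when it is installed and the (assumed) input file exists.
if command -v tabix >/dev/null 2>&1 && [ -f input.vcf.gz ]; then
    # $(...) is left unquoted on purpose so each name becomes its own REGION argument.
    tabix -h input.vcf.gz $(tabix -l input.vcf.gz | rest_of_chroms) > output2.vcf
fi
```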
fin swimmer
use
grep