freebayes produces gvcfs with shifted CHROM entries
0
0
Entering edit mode
3.5 years ago

Hi Everybody,

I ran freebayes on a bam-file to create a gvcf-file. There was no error during the SNP-call itself but when I tried to index the resulting file with tabix it gave me an error stating that there were unsorted positions in the file. In fact for some reason the very last record for each contig was moved onto the following contig as if the whole CHROM column was shifted by one position.

Here is an example of what it looks like:

CM009931.2      27754011        .       T       <*>     0       .       DP=1;END=27754012;MIN_DP=1
GQ:DP:MIN_DP:QR:RO:QA:AO        -0:0:0:-0:0:-0:0
CM009931.2      27754013        .       G       <*>     0       .       DP=1;END=27754014;MIN_DP=1
GQ:DP:MIN_DP:QR:RO:QA:AO        -0:0:0:-0:0:-0:0
CM009932.2      27754015        .       .       <*>     0       .       DP=0;END=27754200;MIN_DP=0
GQ:DP:MIN_DP:QR:RO:QA:AO        -0:0:0:-0:0:-0:0
CM009932.2      37      .       A       <*>     0       .       DP=1;END=38;MIN_DP=1    GQ:DP:MIN_DP:QR:
RO:QA:AO        -0:0:0:-0:0:-0:0

One can see based on the position values that the 3rd entry here should still be on Chrom CM009931, but for some reason here it is said to be on chrom CM009932. I have no idea how this could happen and would much appreciate any help on this issue.

This is the code I used to perform the SNP-call:

cat $BAMLST | parallel --no-notice -k -j $NSLOTS "echo {} && freebayes \
--fasta-reference $REF \
--ploidy 2 \
--report-genotype-likelihood-max \
--use-mapping-quality \
--genotype-qualities \
--use-best-n-alleles 20 \
--haplotype-length 3 \
--min-mapping-quality 40 \
--min-base-quality 30 \
--min-alternate-fraction $MINALTFRAC \
--min-alternate-total $MINALTN \
--min-coverage $MINCOV \
--use-reference-allele \
--gvcf \
--gvcf-chunk 1 \
--bam {} |\
bgzip -f -@ $NSLOTS -c /dev/stdin > $OUTPUTFOLDER/vcf-master/raw/{/.}.gvcf.gz \
&& tabix -fp vcf $OUTPUTFOLDER/vcf-master/raw/{/.}.gvcf.gz"

I used GNU parallel to run over a list of bamfiles in parallel, but in this case I had just one input-file to test it. I previously ran a SNP-call without the --gvcf and the --gvcf-chunk options on the same bam file without any problems, so the bamfile itself should be ok I guess.

gvcf SNP-call freebayes • 745 views
ADD COMMENT

Login before adding your answer.

Traffic: 1184 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6