8 months ago
ekirsch
Hello,
I am trying to run a ROH analysis using bcftools with the following code:
bcftools roh -G30 --AF-dflt 0.4 -O z -o ccgp_gatk_ALL.output.roh.gz GATK_bgzip_filtered_passerculus.vcf.gz
However, I am getting this error:
[E::vcf_parse_format] Number of columns at JAKOOL010000017.1:1159910 does not match the number of samples (126 vs 147)
Has anyone experienced this error before or have any suggestions for fixing it?
I have looked at the problematic line and can't see any clear differences from the other records. I also tried this command:
zgrep 'JAKOOL010000017.1' GATK_bgzip_filtered_passerculus.vcf.gz | grep '1159910'
and found that the number of fields in lines containing '1159910' was 156, which lines up with my expected number of fields, so it seems like the issue is not just a simple column mismatch.
Thank you in advance!
It's a problem with your upstream process. You'd better fix it there.
Where do you think this error is most likely to have occurred? For example, while making the original VCF file from the BAM files, or after that? The only thing I have done since creating the VCF is to filter it down to scaffolds larger than 1 Mb.
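One way to narrow down where the malformed record was introduced is to count the tab-separated fields of every record and compare them with the #CHROM header line, then run the same check on both the original and the filtered VCF. The sketch below assumes a gzip- or bgzip-compressed, tab-delimited VCF; the function name check_vcf_columns is hypothetical and not part of bcftools. With 147 samples, each record should have 9 fixed columns plus 147 sample columns, i.e. 156 fields. Note also that grep '1159910' matches that string anywhere on a line, so it may return records other than the one at position 1159910.

```shell
# Hypothetical helper: report records whose tab-separated field count
# differs from the #CHROM header line of a (b)gzip-compressed VCF.
check_vcf_columns() {
    gzip -dc "$1" | awk -F'\t' '
        /^##/     { next }                    # skip meta-information lines
        /^#CHROM/ { expected = NF; next }     # header defines the expected column count
        NF != expected {
            # NR is the line number in the decompressed file
            printf "line %d (%s:%s): %d fields, expected %d\n", NR, $1, $2, NF, expected
        }'
}
```

Running `check_vcf_columns GATK_bgzip_filtered_passerculus.vcf.gz` and then the same on the unfiltered VCF should show whether the short record already exists before the scaffold filtering step. If neither file reports a mismatch, the problem is probably not the tab-separated column count itself, and the record at JAKOOL010000017.1:1159910 is worth inspecting character by character.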