Error in bcftools view -bvcg command
1
0
Entering edit mode
6.0 years ago

I am using this pipeline to make a vcf file. but m having problem at step 4. bcftools view -bvcg my-raw.bcf > my-var.bcf

it says -b is not a valid parameter.??? what to do ?

  1. Convert SAM to BAM for sorting

    samtools view -S -b my.sam > my.bam

  2. Sort BAM for SNP calling

    samtools sort my.bam my-sorted

  3. Run 'mpileup' to generate VCF format

    samtools mpileup -g -f my.fasta my-sorted-1.bam my-sorted-2.bam my-sorted-n.bam > myraw.bcf

  4. Call SNPs...

    bcftools view my.var.bcf | vcfutils.pl varFilter - > my.var-final.vcf

samtools bcftools vcf SNP • 5.3k views
ADD COMMENT
0
Entering edit mode

Step 1: check if you have a recent bcftools version

ADD REPLY
1
Entering edit mode

Step 2: if it is the recent version, check if the parameter used are still valid by typing bcftools view --help

It looks like hafiz.talhamalik is following an outdated tutorial in variant calling using samtools/bcftools. In the recent versions this looks like this:

$ bcftools mpileup -Ou -f my.fasta my-sorted-1.bam my-sorted-2.bam my-sorted-n.bam  | bcftools call -mv -Ov > variants.vcf

fin swimmer

ADD REPLY
0
Entering edit mode

yes my tutorial is old but the command you gave is giving an empty vcf file.

ADD REPLY
0
Entering edit mode

Hello again,

what do you mean by "empty". Just the vcf header or really a blank file?

fin swimmer

ADD REPLY
0
Entering edit mode

completely blank one. no header, nothing

ADD REPLY
0
Entering edit mode

Have you tried running them separately??

bcftools mpileup --redo-BAQ --min-BQ 30 --per-sample-mF --targets DP,AD -f ref.fasta  input.sorted.bam  | bcftools call --multiallelic-caller --variants-only  >out.vcf

set the --ploidy if required in the variant calling step.

ADD REPLY
0
Entering edit mode

yes i tried them separately but still empty file.

ADD REPLY
1
Entering edit mode

Hi, have you checked the sorted bam file? The command you mentioned for sorting is not correct. You can either redirect the ouput using > or can use the -o option. Also index the refernece fasta file with samtools in the same directory.

samtools sort my.bam > my-sorted.bam
samtools sort my.bam -o my-sorted.bam
ADD REPLY
0
Entering edit mode

yeah I tried that. sorted bam file is fine.

ADD REPLY
0
Entering edit mode

Could you please post the output of these commands:

  • $ samtools --version
  • $ bcftools --version
  • $ samtools view -H my-sorted.bam
  • $ samtools flagstat my-sorted.bam
  • $ cat ref.fasta.fai

fin swimmer

ADD REPLY
0
Entering edit mode
samtools --version
samtools 1.8
Using htslib 1.8

bcftools --version
bcftools 1.8
Using htslib 1.8

samtools view -H my-sorted.bam
@HD     VN:1.3  SO:coordinate
@SQ     SN:NC_003198.1  LN:4809037
@PG     ID:bwa  PN:bwa  VN:0.7.12-r1039 CL:bwa mem -M -t 8 ../../../SPAdes-3.11.1-Linux//bin/spades_output/sequence.fasta GCF_900185485.1_BL60006_cds_from_genomic.fna

samtools flagstats my-sorted.bam
[main] unrecognized command 'flagstats'

cat ref.fasta.fai
NC_003198.1     4809037 90      70      71
ADD REPLY
1
Entering edit mode
@PG     ID:bwa  PN:bwa  VN:0.7.12-r1039 CL:bwa mem -M -t 8 ../../../SPAdes-3.11.1-Linux//bin/spades_output/sequence.fasta GCF_900185485.1_BL60006_cds_from_genomic.fna

This looks like you're trying to align a fasta sequence and not fastq. AFAIK bwa can do this. But I'm not sure if bcftools can call variants from this file, because the base quality values are missing and if sequence.fasta contains only one sequence you only have one read per position in the alignment file.

So could you please tell us more about what you are trying to do?

fin swimmer

ADD REPLY
0
Entering edit mode

I have downloaded all the fasta sequence of salmonella from ncbi and now i want to compare snp's from those fasta files with my sequenced data's vcf file.

ADD REPLY
0
Entering edit mode

So you have downloaded assembled genomes rather than separate reads?

You can do variant calling from those but it would have been absolutely important if you mention things like this in your first post.

ADD REPLY
0
Entering edit mode

yes assembled genomes.

ADD REPLY
0
Entering edit mode

You have wasted a lot of your and our time by not telling us that earlier. I have added an answer to this thread.

ADD REPLY
0
Entering edit mode

oho. M really sorry for that..!

ADD REPLY
0
Entering edit mode

argh, it should be samtools flagstat (without s). I've edited it.

ADD REPLY
0
Entering edit mode
samtools flagstat
4908 + 0 in total (QC-passed reads + QC-failed reads)
4 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
4761 + 0 mapped (97.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
ADD REPLY
5
Entering edit mode
6.0 years ago

For calling variants from assembled genomes, relative to a reference, you need an entirely different approach.

Take for example a look here: https://github.com/lh3/minimap2/blob/master/cookbook.md#asm-var Using minimap2 for alignment and paftools for variant calling.

ADD COMMENT
0
Entering edit mode

oh okz.. Thanks for this. let me go through this. so in a nutshell you are suggesting samtool's aprroach is not good for this kind of analysis ?

ADD REPLY
2
Entering edit mode

You are aligning an assembly, so the aligner you used before is probably not the best for this job. Furthermore, you will have a coverage of 1 on most locations in the genome, which is of course not enough to call variants using pileup as you would for fastq reads.

ADD REPLY

Login before adding your answer.

Traffic: 1650 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6