Hi everyone,
I am trying to make a consensus sequence from an aligned BAM file. This is my workflow:
bcftools mpileup -Ou -f reference_HG19.fa aligned.bam | bcftools call -mv -Ob -o calls.bcf
bcftools index calls.bcf
cat reference_HG19.fa | bcftools consensus calls.bcf > consensus.fa
However, the resulting consensus.fa file turned out to be identical to the reference_HG19.fa file. I looked at the calls.bcf file with bcftools view, and it seems there are no records below the following column header:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT aligned.bam
When I look at calls.bcf with samtools view I get the error "Aborted".
Does anyone have an idea how to solve this issue? Thank you very much!
Robert
Hello and welcome to Biostars, RobertUt,
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you.
Are you sure that you will have any variants in your aligned.bam?
AFAIK samtools view cannot read bcf files. bcftools view would be what you are looking for.
fin swimmer
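For anyone following along, here is a minimal sketch of how one might confirm whether calls.bcf holds any variant records at all. It assumes bcftools is on the PATH. count_records is a hypothetical helper written for this thread, not part of bcftools. If the count is 0, bcftools consensus has nothing to apply, and the output would be expected to match the reference.

```shell
# Count variant records (non-header, non-empty lines) in VCF text read from stdin.
# count_records is a hypothetical helper, not a bcftools command.
count_records() {
  awk '!/^#/ && NF { n++ } END { print n + 0 }'
}

# Usage, assuming bcftools is installed (left commented so the block runs without it):
#   bcftools view calls.bcf | count_records
```

A result of 0 would point at the calling step (or the input BAM) rather than at the consensus step.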
Dear fin,
Thanks for your suggestions.
I just checked the bam file with samtools tview and found plenty of variants.
By the way, the only warning/error I get is about ploidy and a chromosome, which is missing in my bam file but present in the reference fasta.
Any other suggestions?
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.
How did you produce your bam file? Please show us the exact command used.
The bam file wasn't generated in our own lab but was downloaded from:
https://www.ebi.ac.uk/ena/data/view/PRJEB3371
It's a whole genome sequencing file, generated with Illumina HiSeq 2000. I am also suspecting the bam file to be the problem, because the fasta worked fine in previous commands. Chromosome notation is consistent between bam and fasta.
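One way the consistency between BAM and FASTA could be double-checked is to compare contig names and lengths, not just the naming style. The sketch below is an illustration and assumes samtools is on the PATH. header_contigs and fai_contigs are hypothetical helper names, and bam.contigs and ref.contigs are made-up output file names.

```shell
# Extract "name<TAB>length" pairs from the @SQ lines of a SAM header on stdin.
# header_contigs is a hypothetical helper, not a samtools command.
header_contigs() {
  awk -F'\t' '$1 == "@SQ" {
    name = ""; len = ""
    for (i = 2; i <= NF; i++) {
      if ($i ~ /^SN:/) name = substr($i, 4)
      if ($i ~ /^LN:/) len = substr($i, 4)
    }
    print name "\t" len
  }' | LC_ALL=C sort
}

# Extract the same pairs from a FASTA index (.fai) on stdin:
# column 1 is the sequence name, column 2 its length.
fai_contigs() {
  cut -f1,2 | LC_ALL=C sort
}

# Usage, assuming samtools is installed (left commented so the block runs without it):
#   samtools faidx reference_HG19.fa
#   samtools view -H aligned.bam | header_contigs > bam.contigs
#   fai_contigs < reference_HG19.fa.fai > ref.contigs
#   diff bam.contigs ref.contigs && echo "contigs match"
```

Any line reported by diff is a contig whose name or length differs between the two files, which would suggest the BAM was aligned to a different build or a differently named reference.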
To which genome was the BAM aligned (check header)? Which genome ref are you using?
Just to humour me, could you try:
...and:
Hello,
you have to make sure that the bam files are coordinate sorted before doing variant calling. At least the first sample isn't.
fin swimmer
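A quick way to see what sort order a BAM declares is to read the SO tag on the @HD line of its header. This is only a sketch and assumes samtools is on the PATH. sort_order is a hypothetical helper. Note that the tag is just what the header claims, and it can be missing, in which case the helper prints "unknown".

```shell
# Print the sort order declared in a SAM header read from stdin.
# sort_order is a hypothetical helper, not a samtools command.
sort_order() {
  awk -F'\t' '$1 == "@HD" {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SO:/) { print substr($i, 4); found = 1 }
  }
  END { if (!found) print "unknown" }'
}

# Usage, assuming samtools is installed (left commented so the block runs without it):
#   samtools view -H aligned.bam | sort_order
# If the result is not "coordinate", re-sorting is one option (sorted.bam is a made-up name):
#   samtools sort -o sorted.bam aligned.bam
```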
Hi fin,
the bam file I'm using is coordinate sorted.
Robert