Hello there,
I am trying to convert a .bam file covering a genic region into .fasta format. From what I've been reading, I may have to convert it into a .vcf file first to see the variant calls, and then create a consensus using bcftools.
The pipeline seems very straightforward for the conversion to VCF: first I need to run samtools mpileup on the original .bam file, then run bcftools call to call the variants.
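As far as I understand it, the whole route from BAM to consensus FASTA would look roughly like the sketch below. It assumes a recent bcftools, where the pileup step is available as bcftools mpileup rather than through samtools mpileup -g. The function name bam_to_consensus and all file names are placeholders of mine, not anything prescribed by the tools.

```shell
# Sketch only: bam_to_consensus and all file names are hypothetical placeholders.
# Usage: bam_to_consensus ref.fa input.bam consensus.fa
bam_to_consensus() {
    ref=$1
    bam=$2
    out=$3
    vcf="$out.calls.vcf.gz"   # intermediate compressed VCF, written next to the output

    # Pile up reads against the reference and call variants
    # (-m: multiallelic caller, -v: variant sites only, -Oz: bgzip-compressed VCF).
    bcftools mpileup -f "$ref" "$bam" | bcftools call -mv -Oz -o "$vcf" || return 1

    # bcftools consensus needs an indexed, compressed VCF.
    bcftools index "$vcf" || return 1

    # Apply the called variants to the reference to get the consensus FASTA.
    bcftools consensus -f "$ref" "$vcf" > "$out"
}
```

Note that this writes the full reference sequence with the variants applied. To get only one region, the bcftools documentation pipes the output of samtools faidx ref.fa REGION into bcftools consensus instead of passing -f.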
Now, my problem comes when I run mpileup. I ran the following command:
samtools mpileup -g -f chr19ref.fa filename.bam > desiredfile.bcf
It did give me a desiredfile.bcf, but on the terminal it printed loads of lines saying:
[E::faidx_fetch_seq] The sequence "6" not found
This message filled the screen, and at the end I got a .bcf file which, when converted to .vcf, gave an empty document. I guess there's an issue with the conversion to .bcf, and that it has something to do with the error message above, but I can't work out what it is. Can someone help me with that?
Greetings
Hello everybody, I have already solved the problem I had. As Pierre writes in the lines below, my problem was that my .bam file named the chromosome "6", while the .fasta reference file I provided used the "chr6" notation, which triggered the error described in my original post. I tried the hack I was advised to use but, as far as I know, it is not possible to create a symbolic link when running in a virtual machine, which was my case. So I dug through some forums and found the following solution, which lets you write whatever string you want in the chromosome name field of the header. In this case I just added the prefix "chr", which was missing:
samtools view -H strand.bam | sed -e 's/SN:/SN:chr/' | samtools reheader - your_troubled_file_name.bam > output_file_name.bam
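Before rewriting the real file, the substitution can be tried on a bit of header text. The sketch below is my own illustration: the three-line header and the function names are made up, and I restrict the edit to @SQ lines so that an "SN:" appearing elsewhere in the header (for example inside a @PG command line) is left alone.

```shell
# Made-up header for illustration; a real one would come from `samtools view -H file.bam`.
sample_header() {
    printf '@HD\tVN:1.0\tSO:coordinate\n'
    printf '@SQ\tSN:6\tLN:1000\n'
    printf '@SQ\tSN:19\tLN:2000\n'
}

# Add the "chr" prefix to sequence names, touching only @SQ lines.
add_chr_prefix() {
    sed -e '/^@SQ/ s/SN:/SN:chr/'
}

# Print the rewritten header so it can be inspected before running reheader.
sample_header | add_chr_prefix
```

With a real file, the same sed expression would sit between samtools view -H and samtools reheader. Keep in mind that it adds the prefix blindly, so names that already start with "chr" would end up as "chrchr".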
I hope it helps people with similar problems.
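If you want to check whether you have the same mismatch before running mpileup, comparing the sequence names on both sides should be enough. A minimal sketch, with made-up sample files standing in for the reference FASTA and for the text output of samtools view -H; the function names are mine:

```shell
# Names in a FASTA: the first word after ">" on each header line.
fasta_names() {
    grep '^>' "$1" | sed -e 's/^>//' -e 's/[[:space:]].*$//' | sort
}

# Names in a SAM header (text, as printed by `samtools view -H`):
# the SN: field of each @SQ line.
header_names() {
    grep '^@SQ' "$1" | tr '\t' '\n' | sed -n 's/^SN://p' | sort
}

# Made-up sample inputs, written to a temporary directory.
tmp=$(mktemp -d)
printf '>chr6 example\nACGT\n' > "$tmp/ref.fa"
printf '@HD\tVN:1.0\n@SQ\tSN:6\tLN:4\n' > "$tmp/header.sam"

fasta_names "$tmp/ref.fa" > "$tmp/ref.names"
header_names "$tmp/header.sam" > "$tmp/bam.names"

# Names present in the BAM header but absent from the reference.
missing=$(comm -23 "$tmp/bam.names" "$tmp/ref.names")
echo "In BAM but not in reference: $missing"
```

For the sample files this reports "6", which is the same name the faidx error complained about in my case.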
Cheers.