Dear All,
I have aligned the reads to the reference genome using Bowtie2. The resulting SAM file was converted to BAM and sorted. I then tried to generate a consensus sequence from the sorted file using the following command:
samtools mpileup -uf ref.fa aligned_sorted.bam | bcftools call -c | vcfutils.pl vcf2fq > aligned.fasta
A fasta file is generated. However, viewing the tail of the file, I get the following:
HEHHHHHEHEEHEBBEBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBEEEEEEHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHH
HHHHHHHHHHHHHHHEEEHHHEHHCCC@CCC@CEEEEEEEEEEECEEEEEEEEEEHHHHH
EHHHHHHHHHHHHHHHHHHHHHHHEHHKKKKKKKKKKKHEEEHECEEEEEEEECCEEEEE
EEEEEEEEEEEEEEEEHHHHHHHEKHHKKHKNNNNNQQQQNQQQNQNQQQQQQQTQTTQQ
TQQTQQQQQQNQQHNTTTTTTTTTTTQQQQNQQQQQQQQQQQQQQQQNQQQQQQQQQTQQ
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQNNNNQQQQKKKKK
KKKKKKKHHHHHHHHHHHHHHHHEEEEEEEEEEEEEEEEEEEEEEEBBBBBBBBBBBBEE
EEEEEEEEEEEEEEEEEEEEEEEEEEEECCCCCCCCCFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFCCCCCCCC
What does this mean? I expected the file to contain only the characters 'ATGC'.
Thanks in advance.
Regards, sam
Is this base quality?
Your last command is vcfutils.pl vcf2fq, so you get a fastq file.
Thanks Bastien for your reply. But why is base quality printed in a consensus fasta file? I would like to use this fasta file for further analysis.
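The file name given after ">" does not change what vcf2fq writes: the output is FASTQ (header, sequence, "+", quality) even though it was saved as aligned.fasta, and the tail of a FASTQ record is its quality string. One way to get a plain FASTA is to strip the quality part afterwards. Below is a minimal sketch, assuming the FASTQ may be wrapped over several lines per record, as the output above appears to be. The function name fq2fa and the file names in the usage comment are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: read FASTQ (possibly multi-line) on stdin, write FASTA on stdout.
# Usage (file names are assumptions): fq2fa < aligned.fastq > consensus.fasta
fq2fa() {
  awk '
    BEGIN { state = 0 }             # 0 = expect header, 1 = in sequence, 2 = in quality
    state == 0 && /^@/ {            # header line starts a new record
      print ">" substr($0, 2)       # replace the leading "@" with ">"
      seqlen = 0; quallen = 0; state = 1
      next
    }
    state == 1 && /^\+/ {           # separator line: the sequence is complete
      state = (seqlen > 0) ? 2 : 0
      next
    }
    state == 1 {                    # sequence line: keep it and count its bases
      print
      seqlen += length($0)
      next
    }
    state == 2 {                    # quality line: drop it, stop once it covers the sequence
      quallen += length($0)
      if (quallen >= seqlen) state = 0
      next
    }
  '
}
```

The quality block is skipped by length rather than by looking for the next "@", because "@" is itself a valid quality character (it appears in the output above). If seqtk is installed, "seqtk seq -a aligned.fastq > consensus.fasta" should give an equivalent result.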