Hi,
I am trying to generate a consensus sequence from a BAM file obtained after mapping SRA reads to a reference genome.
I used the following commands:
bwa mem ref.fasta SRR_1.fastq SRR_2.fastq > bwa.sam
samtools view -b -F 4 bwa.sam > bwa_aligned.bam
samtools index bwa_aligned.bam
I am not sure how to generate the consensus sequence that I have in mind. In case I haven't explained this well, I made a diagram:
===========================================================>ref.fasta
- -- ---- ---- ----- --- --- - -- ------ -----
------ --- --- -------- --- -- ---- - --- -->SRR_reads_mapping
============== + ================ + ==================> consensus_sequence.fasta
Please let me know if you have any advice on this.
Cheers!!!
What do you mean by consensus sequence? The most frequent nucleotide at each position? The variants? There are plenty of tools that can report the bases observed at each position; try bcftools mpileup, for instance.
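For what it's worth, here is a rough sketch of one way to go from the BAM to a single FASTA with bcftools. The function name make_consensus and the intermediate file names are made up for illustration. Note that samtools index expects a coordinate-sorted BAM, so the sketch sorts first. Also, bcftools consensus applies the called variants to the reference, which approximates a per-position majority vote rather than implementing it exactly, and positions with no coverage keep the reference base unless you mask them.

```shell
# Hypothetical helper (not from the thread): reference + BAM -> consensus FASTA.
# Assumes samtools and bcftools are installed and on PATH.
make_consensus() {
    if [ "$#" -ne 3 ]; then
        echo "usage: make_consensus REF.fasta IN.bam OUT.fasta" >&2
        return 2
    fi
    ref=$1
    bam=$2
    out=$3
    prefix=${out%.*}   # intermediate files share the output's base name

    # Sort by coordinate; indexing and pileup need a sorted BAM.
    samtools sort -o "$prefix.sorted.bam" "$bam" &&
    samtools index "$prefix.sorted.bam" &&
    # Compute per-position genotype likelihoods against the reference.
    bcftools mpileup -Ou -f "$ref" -o "$prefix.pileup.bcf" "$prefix.sorted.bam" &&
    # Call variants (multiallelic caller, variant sites only), compressed VCF out.
    bcftools call -mv -Oz -o "$prefix.calls.vcf.gz" "$prefix.pileup.bcf" &&
    bcftools index "$prefix.calls.vcf.gz" &&
    # Apply the called variants to the reference to get the consensus.
    bcftools consensus -f "$ref" -o "$out" "$prefix.calls.vcf.gz"
}

# Example call with the file names from the question:
# make_consensus ref.fasta bwa_aligned.bam consensus_sequence.fasta
```

Recent samtools releases also include a samtools consensus subcommand that builds a consensus directly from the BAM, which may be closer to a plain "most frequent base" consensus; check whether your installed version has it.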
Thank you for your answers.
I meant obtaining the most frequent nucleotide at each position. From the mapped reads, I want to be able to obtain a consensus sequence in a single FASTA file. I am going to try bcftools mpileup. See my answer below, thanks.