consensus sequence from BAM file

7.1 years ago

Ric ▴ 440

Hi, I used samtools mpileup -uf chr01.fasta chr01.bam | bcftools call -c | vcfutils.pl vcf2fq > chr01-cns.fastq to build a consensus sequence from a BAM file. However, the output contains IUPAC ambiguity codes. How can I instead get the most frequent base at each position, rather than an IUPAC code?

Thank you in advance.
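
For illustration, here is a minimal sketch of what "the most frequent base at each position" could look like. It assumes the read bases covering each position have already been collected (for example from a pileup); that extraction step is not shown. The names `majority_base` and `majority_consensus` and the `tie_symbol` parameter are hypothetical and not part of samtools, bcftools, or vcfutils.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def majority_base(bases, tie_symbol="N"):
    """Return the most frequent base among the reads covering one position.

    `bases` is an iterable of single-character base calls, one per read.
    If there is no usable base, or if two or more bases share the highest
    count, `tie_symbol` is returned (an assumption of this sketch: the
    counts alone cannot decide such a position).
    """
    counts = Counter(b.upper() for b in bases if b.upper() in VALID_BASES)
    if not counts:
        return tie_symbol  # no usable coverage at this position
    ranked = counts.most_common()
    top_base, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return tie_symbol  # exact tie, e.g. 10 reads A and 10 reads C
    return top_base


def majority_consensus(columns, tie_symbol="N"):
    """Build a consensus string from per-position collections of bases."""
    return "".join(majority_base(col, tie_symbol) for col in columns)
```

A position with three A reads and one C read yields A under this rule. An exact tie yields the tie symbol, so the choice of what to do with ties stays explicit instead of being hidden in the code.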

samtools bcftools vcfutils • 3.0k views

ADD COMMENT • link 7.1 years ago by Ric ▴ 440

What about heterozygous variants?

ADD REPLY • link 7.1 years ago by WouterDeCoster 47k

I do not care about them.

ADD REPLY • link 7.1 years ago by Ric ▴ 440

So what do you do if a position has 10 reads A and 10 reads C?

ADD REPLY • link 7.1 years ago by WouterDeCoster 47k

Good question. What would error correction for PacBio do in this case?

ADD REPLY • link 7.1 years ago by Ric ▴ 440

I think WouterDeCoster's question was: what do you do at a heterozygous site with a 50/50 distribution?

ADD REPLY • link 7.1 years ago by Gabriel R. ★ 2.9k

I do not know what to do with a 50/50 distribution. What would error correction for PacBio do in this case?

ADD REPLY • link 7.1 years ago by Ric ▴ 440

Well, I hope that no error correction software "corrects" true het sites :-) Why can't you use a small script that picks an allele at random wherever there is an IUPAC ambiguity code?
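
A rough sketch of such a script, assuming the consensus sequence is available as a plain string. The function name `resolve_iupac` and the `rng` parameter are hypothetical; the table holds the standard two- and three-base IUPAC nucleotide codes, and N is deliberately left unchanged.

```python
import random

# Standard IUPAC nucleotide ambiguity codes and the bases they stand for.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
}


def resolve_iupac(seq, rng=None):
    """Replace each IUPAC ambiguity code in `seq` with one of its bases,
    chosen uniformly at random. A, C, G, T, N and any other character
    pass through unchanged, and upper/lower case is preserved.
    """
    rng = rng or random.Random()
    out = []
    for ch in seq:
        choices = IUPAC.get(ch.upper())
        if choices is None:
            out.append(ch)  # not an ambiguity code handled here
        else:
            pick = rng.choice(choices)
            out.append(pick.lower() if ch.islower() else pick)
    return "".join(out)
```

Passing a seeded generator, e.g. `resolve_iupac(seq, random.Random(42))`, makes the choices reproducible. For FASTQ input, this should be applied to the sequence lines only, since quality strings can contain the same letters.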

ADD REPLY • link 7.1 years ago by Gabriel R. ★ 2.9k

Hi, I ended up writing a script.

ADD REPLY • link 7.1 years ago by Ric ▴ 440

Is this mt or nuclear?

ADD REPLY • link 7.1 years ago by Gabriel R. ★ 2.9k

It is nuclear data.

ADD REPLY • link 7.1 years ago by Ric ▴ 440
