Question

Change sequence ID in fastq file generated by bcftools mpileup

0

Entering edit mode

2.1 years ago

antoine.fauchois92 ▴ 20

Hi everobody !

I'm currently work on a HHV8 genetic study and I face to an issue with my bcftools command. Indeed, I want to generate consensus sequences thanks bcftools mpileup command and bam files. However, all ID get the ID of the reference sequence. You can see bellow the code used to do this job:

for patient in patients:
    input_file = os.path.join("Queries_seq/bam_files", patient + ".bam")
    output_file = os.path.join("Queries_seq/consensus/fastq", patient + ".fastq")
    os.system(f"bcftools mpileup -f Ref_seq/HHV8_ref.fasta {input_file} | \
        bcftools call -c | \
        vcfutils.pl vcf2fq > {output_file}")

So, if I do the linux command bellow on a randomly chosen file, I get:

sed '1,1p' Queries_seq/consensus/fastq/1G_S15.fastq
@NC_009333.1

This is the ID of my reference sequence. Can you help me to fix it ?

Best regards

bcftools • 770 views

ADD COMMENT • link updated 22 months ago by Ram 45k • written 2.1 years ago by antoine.fauchois92 ▴ 20

2

Entering edit mode

Bcftools annotate has a function rename-chr https://samtools.github.io/bcftools/bcftools.html#annotate.

You can pipe that after the bcftools call. ... bcftools call -c | bcftools annotate --rename-chrs myrenamefile.txt | ...

ADD REPLY • link 2.1 years ago by barslmn ★ 2.4k