4 months ago
davidmaimoun
Hello,
I need to perform variant calling and build a phylogenetic tree for a bacterial strain.
These are the steps I run after generating the VCF files (inside a Nextflow process):
norm="./${sample}.norm"
filter="./${sample}.filter"
bcftools norm -f ${ref} \${vcf_output} -Ob -o \${norm}.bcf
bcftools filter -Oz -e "QUAL<40||DP<10||GT!='1/1'||TYPE='INDEL'" -o \${filter}.vcf.gz \${norm}.bcf
bcftools index \${filter}.vcf.gz
bcftools consensus -f ${ref} \${filter}.vcf.gz > consensus.fasta
After that, I concatenate all the consensus FASTA files into one multi-FASTA containing all my samples (35 samples).
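A minimal sketch of that concatenation step, for readers who want to reproduce it. The file naming (`<sample>.consensus.fasta`), the function name `combine_consensus`, and the output name `all_samples.fasta` are assumptions for illustration and are not taken from the pipeline above. Because `bcftools consensus` keeps the reference sequence names in its output, the sketch prefixes each header with the sample name so the 35 records stay distinguishable.

```shell
# Hypothetical layout: one consensus FASTA per sample, named <sample>.consensus.fasta.
# combine_consensus DIR OUT
#   Concatenates every DIR/*.consensus.fasta into OUT, prefixing each FASTA
#   header with the sample name taken from the file name.
#   Assumes sample names contain no characters special to sed (such as / or &).
combine_consensus() {
    dir=$1
    out=$2
    : > "$out"                          # create or empty the output file
    for fa in "$dir"/*.consensus.fasta; do
        [ -e "$fa" ] || continue        # nothing to do if the glob matched no files
        sample=$(basename "$fa" .consensus.fasta)
        # Turn ">refname" into ">sample_refname"; sequence lines pass through unchanged
        sed "s/^>/>${sample}_/" "$fa" >> "$out"
    done
}
```

It could be called as `combine_consensus . all_samples.fasta`, and `grep -c '^>' all_samples.fasta` then reports how many records ended up in the combined file.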
My problem is that when I try to run a multiple sequence alignment with MUSCLE, Clustal Omega, MAFFT, etc., each of these programs runs for hours (more than 5 h) without producing any output.
I also tried MEGA, and it crashes at some point. I suspect my multi-FASTA is too big for these tools (each sample is ~2 million bp).
I would be glad to get some help.
Thank you.