4 months ago
davidmaimoun
Hello,
I need to create a minimum spanning tree (MST) based on SNPs. I have already run bowtie2 against a reference and have the .sam, .bam, .sorted.bam, and .sorted.bam.bai files.
I don't know how to continue from here. I need to create a distance matrix (.tsv) for Grapetree to display the tree, but I don't have a clue how to do that.
I would be glad to get some help.
Thank you
I can only assume from your description that you're using a genomic dataset, but I'm unsure. Grapetree looks like a lot of tedious work to get running on a genomic dataset. You'd call SNPs, construct an allele profile for each gene, create consensus sequences and align them, and then make a metadata file.
If you don't need an MST specifically, I would recommend using something like RAxML for phylogenetic analyses. If you specifically need an MST, there appear to be functions in other software that I suspect have easier input requirements - see here.
Thank you very much for your help. Yes, it needs to be an MST. Until now, I was creating the MST after a gene-by-gene analysis with chewBBACA (the output was a distance matrix for Grapetree visualization). My team wants me to create the MST after SNP analysis instead. I assume I need to create the matrix after variant calling, something like this:
bcftools mpileup -f reference.fasta input.bam -Ou | bcftools call -mv -Oz -o variants.vcf.gz
bcftools filter -i 'QUAL>20' variants.vcf.gz -Oz -o filtered_variants.vcf.gz
I read somewhere that I also need to create a genomic matrix in order to build the distance matrix, and then, from the distance matrix, use the link you sent me to create the MST.
But this is very new to me, and I don't know if I am on the right path.
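To make the "genomic matrix to distance matrix" step concrete, here is a minimal Python sketch of what I understand it to mean. It assumes the variant calls for all samples have already been exported into one list of alleles per sample, in the same site order. The sample names, the `profiles` data, and the helper functions are hypothetical, and the TSV layout is only a guess that should be checked against what Grapetree actually accepts.

```python
import csv
import io
from itertools import combinations

# Symbols treated as "no call"; this set is an assumption, adjust to your data.
MISSING = {".", "N", "-"}

def snp_distance(a, b):
    """Count sites where both samples have a call and the calls differ."""
    return sum(
        1 for x, y in zip(a, b)
        if x not in MISSING and y not in MISSING and x != y
    )

def distance_matrix(profiles):
    """Return sorted sample names and a symmetric nested dict of SNP distances."""
    names = sorted(profiles)
    dist = {n: {m: 0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = snp_distance(profiles[n], profiles[m])
        dist[n][m] = dist[m][n] = d
    return names, dist

def to_tsv(names, dist):
    """Render the matrix as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample"] + names)
    for n in names:
        writer.writerow([n] + [dist[n][m] for m in names])
    return buf.getvalue()

# Hypothetical genotype matrix: one allele per SNP site, same site order per sample.
profiles = {
    "sampleA": ["A", "C", "G", "T", "A"],
    "sampleB": ["A", "T", "G", "T", "."],
    "sampleC": ["G", "T", "G", "C", "A"],
}

names, dist = distance_matrix(profiles)
print(to_tsv(names, dist), end="")
```

Sites with a missing call in either sample are skipped rather than counted as differences. That is one possible convention, not necessarily the one the MST tool expects.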