Hi! I am trying to make a dendrogram from a VCF file that contains SNP data for 20 samples. First, I tried SNPRelate, but it excluded all of the SNPs:
SNP pruning based on LD:
Excluding 0 SNP on non-autosomes
Excluding 78,187 SNPs (monomorphic: TRUE, MAF: 0.1, missing rate: 0.1)
Working space: 20 samples, 0 SNP
using 1 (CPU) core
sliding window: 500,000 basepairs, Inf SNPs
|LD| threshold: 1
method: composite
0 markers are selected in total.
Then I tried the SNPhylo package, but it is also based on SNPRelate and I get the same result. Does anyone know a quick way to get a dendrogram from a VCF file?
I have also tried converting the VCF to a FASTA file with GATK and running FastTree, but I get this error:
FastTree Version 2.1.10 Double precision (No SSE3)
Alignment: trial.fasta
Nucleotide distances: Jukes-Cantor Joins: balanced Support: SH-like 1000
Search: Normal +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1
TopHits: 1.00*sqrtN close=default refresh=0.80
ML Model: Jukes-Cantor, CAT approximation with 20 rate categories
Wrong number of characters for 1: expected 8477918 but have 55489 instead.
This sequence may be truncated, or another sequence may be too long.
Also, I don't mind using FASTA alignment files, but can anyone explain how I can get a tree from whole-genome sequence files (I have BAM files)?
Thanks
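On the FastTree error: FastTree expects an alignment, so every record in the FASTA file must have the same number of characters. The message above says one record has 55489 characters where 8477918 were expected. A quick way to see which records are out of line is to print the length of each one. The sketch below is only a diagnostic, written in plain Python; the function names are made up for illustration, and `trial.fasta` is the file name taken from the log above. One possible cause (a guess, not something the log confirms) is that the file holds one record per contig rather than one per sample.

```python
def fasta_lengths(lines):
    """Return {record name: total sequence length} for FASTA-formatted lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            parts = line[1:].split()
            name = parts[0] if parts else "record%d" % (len(lengths) + 1)
            lengths[name] = 0
        elif name is not None:
            # Sequence may be wrapped over several lines; add them up.
            lengths[name] += len(line)
    return lengths


def report(path):
    """Print each record's length and flag the file if lengths differ."""
    with open(path) as handle:
        lengths = fasta_lengths(handle)
    for name, n in lengths.items():
        print(name, n)
    if len(set(lengths.values())) > 1:
        print("Records differ in length, so this is not a usable alignment.")


# Hypothetical usage with the file named in the FastTree log:
# report("trial.fasta")
```

If the lengths differ, the FASTA needs to be rebuilt so that each sample contributes one sequence of equal length before FastTree will accept it.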
With a VCF file, you could load it into PLINK and perform IBS clustering there. The output can then be read into R and turned into a tree. https://www.cog-genomics.org/plink/1.9/strat
Is there a tutorial on how to read those files into R and generate a tree from them? Thanks
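Not a full tutorial, but here is a minimal sketch of the idea, assuming PLINK has written a square distance matrix plus a matching ID file (for example `plink.mdist` and `plink.mdist.id`, which is my understanding of what `--distance square 1-ibs` produces in PLINK 1.9; please check the file names against your own run). It is written in plain Python rather than R, the function names are made up for illustration, and the clustering is ordinary average linkage (UPGMA), which is one reasonable choice and not necessarily what the answer above had in mind.

```python
def read_square_matrix(matrix_lines, id_lines):
    """Parse a square distance matrix and its sample IDs (one ID line per row)."""
    # The ID file is assumed to hold "FID IID" per line; keep the last column.
    ids = [line.split()[-1] for line in id_lines if line.strip()]
    rows = [[float(x) for x in line.split()] for line in matrix_lines if line.strip()]
    if len(rows) != len(ids) or any(len(r) != len(ids) for r in rows):
        raise ValueError("matrix shape does not match the number of sample IDs")
    return ids, rows


def upgma_newick(ids, dist):
    """Average-linkage clustering; returns a Newick string without branch lengths."""
    # clusters maps a cluster id to (Newick fragment, number of samples in it).
    clusters = {i: (name, 1) for i, name in enumerate(ids)}
    # d holds pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(len(ids)) for j in range(i + 1, len(ids))}
    next_id = len(ids)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        (label_a, size_a), (label_b, size_b) = clusters.pop(a), clusters.pop(b)
        merged = {}
        for k in clusters:
            d_ak = d[(min(a, k), max(a, k))]
            d_bk = d[(min(b, k), max(b, k))]
            # Size-weighted mean keeps this equal to the mean over all sample pairs.
            merged[(k, next_id)] = (size_a * d_ak + size_b * d_bk) / (size_a + size_b)
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        d.update(merged)
        clusters[next_id] = ("(%s,%s)" % (label_a, label_b), size_a + size_b)
        next_id += 1
    (label, _size), = clusters.values()
    return label + ";"


# Hypothetical usage, assuming the PLINK output prefix was "plink":
# with open("plink.mdist") as m, open("plink.mdist.id") as i:
#     ids, dist = read_square_matrix(m, i)
# print(upgma_newick(ids, dist))
```

The Newick string can be pasted into any tree viewer. If you would rather stay in R, the same two steps are reading the matrix with `read.table`, then `plot(hclust(as.dist(m)))`.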