How to distinguish the haplotype network based on the sampling location?
2.8 years ago
amirandi1808 ▴ 20

I have 20 DNA sequences from 4 sampling locations, 5 individuals per location. I would like to make a haplotype network in RStudio. According to the analysis, I have 8 haplotypes. However, I cannot assign a different color to each location, especially when a haplotype is shared between locations.

Here is my script

library("ape")
library("pegas")

read.dna("FIKS YA.fasta", format="fasta") -> Naso
Naso

NasoHaps <- haplotype(Naso)
NasoHaps

NasoNet <- haploNet(NasoHaps)

pop <- rep(paste0("pop", 1:4), each = 5)
region <- rep(c("regA", "regB", "regC", "regD"), each = 5)
table(region, pop)

h <- haplotype(Naso)
h

d <- dist.dna(NasoHaps, "N")
nt <- rmst(d, quiet = TRUE)
nt

plot(nt)

This is the output figure

Could anyone help me distinguish the haplotypes by sampling location and give each location a different color? Please note: I find these instructions ambiguous:

pop <- rep(paste0("pop", 1:4), each = 5)

region <- rep(c("regA", "regB", "regC", "regD"), each = 5)
haplotype Ape network Pegas • 826 views
2.8 years ago
1311703846 • 0

Hi amirandi, I wrote some R code using your variable names to answer your question. First, we need a hap.pies table, which sets the location and colors for each pie.

> hap.pies <- with(  
> stack(setNames(attr(NasoHaps, "index"), seq_along(attr(NasoHaps, "index")))),  
> table(hap = as.numeric(as.character(ind)), pop = samples[values, "region"]) )

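Let me explain the code. hap.pies is the table that attaches the location (or any other grouping) to each haplotype: one row per haplotype, one column per region, each cell counting the sequences of that haplotype sampled in that region.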

> attr(NasoHaps, "index")

The object returned by haplotype(Naso) stores each haplotype together with the FASTA sequences assigned to it. attr(NasoHaps, "index") is a list with one element per haplotype, holding the row numbers of the sequences that carry it, for example haplotype 1: sequence 1, sequence 2. samples is a table that you supply with the region of each FASTA sequence, in the same order as the FASTA file. It contains two columns: sequence ID and region.
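To make this concrete, here is a minimal sketch of the whole workflow, from sequences to a network colored by region. It is not necessarily the only or the best way. Assumptions: the woodmouse alignment shipped with ape stands in for your FASTA file (20 sequences resampled from it), region is your vector of 4 locations x 5 individuals in FASTA order, and the four colors are arbitrary. Instead of building the table by hand, it uses pegas's haploFreq(), which returns the same kind of haplotype-by-region count table.

```r
library(ape)
library(pegas)

# Stand-in data (assumption): 20 sequences resampled from ape's woodmouse set.
# With real data this would be: Naso <- read.dna("FIKS YA.fasta", format = "fasta")
data(woodmouse)
set.seed(1)
Naso <- woodmouse[sample(15, size = 20, replace = TRUE), ]

# Region of each sequence, in the same order as the rows of Naso
region <- rep(c("regA", "regB", "regC", "regD"), each = 5)

NasoHaps <- haplotype(Naso)     # collapse sequences into haplotypes
NasoNet  <- haploNet(NasoHaps)  # build the haplotype network

# Rows = haplotypes, columns = regions, cells = number of sequences
hap.pies <- haploFreq(Naso, fac = region, haplo = NasoHaps)

# One arbitrary color per region (assumption: any 4 distinct colors will do)
cols <- c("tomato", "steelblue", "gold", "seagreen")

# Circle size follows haplotype frequency; pie slices follow the region counts
plot(NasoNet, size = attr(NasoNet, "freq"), pie = hap.pies, bg = cols)
legend("topleft", legend = colnames(hap.pies), fill = cols, bty = "n")
```

A haplotype shared between locations then shows up as a pie with more than one slice, while a private haplotype is a single solid color.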
