How to convert a genetic distance matrix to a data frame and rank genotypes by distance?
6.6 years ago

Hi all, I have a genetic distance matrix, computed from SNP genotyping data of 10 individuals, in matrix format. I want to convert it into a data frame with genotype 1 in column A, genotype 2 in column B, and the distance between them in column C, and also rank the pairs genotype-wise for easy comparison. I have attached an example data file for clarity. I tried the following code, but it does not give exactly what I expect:

d <- read.csv("ex.csv", head = TRUE, sep = ",")
f <- function(i) {
  c1 <- paste(colnames(d)[i],rownames(d)[i:nrow(d)])
  c2 <- d[i:nrow(d),i]
  return(cbind(c1,c2))
}
result=data.frame(do.call(rbind, lapply(1:10,function(i){
  cbind(paste0(colnames(d)[i],",",rownames(d)[i:nrow(d)]),
        d[i:nrow(d),i])})))
result
write.table(result,"results.csv")

Data:

https://www.dropbox.com/s/wvgxknwbnow2l5s/ex.xlsx?dl=0

I need your help in solving this problem; it would save me a lot of time when comparing one genotype against another. Any help in this regard will be highly appreciated. Thanks in advance. With regards,

R SNP • 2.9k views

TL;DR: if your file is a csv file, provide a csv file as an example.

Long comment: I really appreciate your effort to provide a small, reproducible example. However, the file you linked on Dropbox is an Excel file, not csv. When I save the first spreadsheet as csv, this is what I get:

,S1,S2,S3,S4,S5,S6,S7,S8,S9,S10
S1,0,2,3,4,5,6,7,8,9,10
S2,1,0,4,5,6,7,8,9,10,11
[...]

Reading it with your code ( d <- read.csv("ex.csv", head = TRUE, sep = "," ) ) doesn't work as expected, as it reads the row labels S1, S2, S3, etc. as values in a column named X instead of as row names:

> head( d, n = 2 )
   X S1 S2 S3 S4 S5 S6 S7 S8 S9 S10
1 S1  0  2  3  4  5  6  7  8  9  10
2 S2  1  0  4  5  6  7  8  9 10  11

So one needs to add row.names = 1 to the code:

d <- read.csv("ex.csv", head = TRUE, sep = ",", row.names = 1 )

Likewise, I don't know if the results spreadsheet is the result you get when running the code, or the intended result you want. You should make clear with "this is what I get ..." and "however, this is what I want ...".
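
In case it helps, here is a minimal sketch of one way to get a three-column data frame and a per-genotype ranking in base R. It assumes the matrix has been read with row.names = 1 as above; the column names genotype1, genotype2, distance and rank are my own choice, and the third row of the toy data is made up for illustration (only the first two rows come from your file). It may not match your intended result exactly.

```r
# Small stand-in for the distance matrix. Rows S1 and S2 are the first values
# from the csv shown above; row S3 is hypothetical, for illustration only.
csv_text <- ",S1,S2,S3
S1,0,2,3
S2,1,0,4
S3,2,3,0"

# row.names = 1 turns the first column into row names.
# With a real file this would be read.csv("ex.csv", header = TRUE, row.names = 1).
d <- read.csv(text = csv_text, header = TRUE, row.names = 1)
m <- as.matrix(d)

# as.table() + as.data.frame() gives one row per (row, column) pair.
long <- as.data.frame(as.table(m), stringsAsFactors = FALSE)
names(long) <- c("genotype1", "genotype2", "distance")

# Drop self-comparisons (the diagonal of the matrix).
long <- long[long$genotype1 != long$genotype2, ]

# Rank the partners of each genotype1: smallest distance gets rank 1.
long$rank <- ave(long$distance, long$genotype1,
                 FUN = function(x) rank(x, ties.method = "min"))

# Sort by genotype and rank for easy reading.
long <- long[order(long$genotype1, long$rank), ]
rownames(long) <- NULL
print(long)
```

This keeps both directions of every pair (S1 vs S2 and S2 vs S1), so each genotype gets its own ranked list. If the matrix is symmetric and you only want each pair once, you could keep only the cells selected by upper.tri(m) before ranking. The result can be saved with write.csv(long, "results.csv", row.names = FALSE).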


Dear H.Mon

Thanks a lot for your reply, and I am sorry for the inconvenience caused by the file format. I have added a new file in csv format, please have a look at it. The first sheet contains the data I have, and the results sheet is what I would like to get after running the code: https://www.dropbox.com/s/wo4tawgvhob476v/New%20Microsoft%20Excel%20Worksheet1.csv?dl=0

Once again, thanks a lot for your help. Regards
