Question

Error converting a column into row names in an R data frame

2.9 years ago

KABILAN ▴ 130

I have a dataset like the one below:

exp.data <- structure(list(Fasta.headers = c(">O76070ups|SYUG_HUMAN_UPS Gamma-synuclein (Chain 1-127) - Homo sapiens (Human)", 
">Q06830ups|PRDX1_HUMAN_UPS Peroxiredoxin 1 (Chain 2-199) - Homo sapiens (Human)", 
">P06396ups|GELS_HUMAN_UPS Gelsolin (Chain 28-782) - Homo sapiens (Human);>Q3SX14 TREMBL:Q3SX14 (Bos taurus) Similar to Gelsolin", 
">P02768-1 SWISS-PROT:P02768-1 Tax_Id=9606 Gene_Symbol=ALB Isoform 1 of Serum albumin precursor;>P02768ups|ALBU_HUMAN_UPS Serum albumin (Chain 26-609) - Homo sapiens (Human)", 
">P02741ups|CRP_HUMAN_UPS C-reactive protein (Chain 19-224) - Homo sapiens (Human)", 
">P16083ups|NQO2_HUMAN_UPS Ribosyldihydronicotinamide dehydrogenase [quinone] (Chain 2-231) - Homo sapiens (Human)", 
">P05413ups|FABPH_HUMAN_UPS Fatty acid-binding protein, heart (Chain 2-133) - Homo sapiens (Human)", 
">P10636-8ups|TAU_HUMAN_UPS Microtubule-associated protein tau {Isoform Tau-F (Tau-4)} (Chain 2-441) - Homo sapiens (Human)", 
">P02788ups|TRFL_HUMAN_UPS Lactotransferrin (Chain 20-710) - Homo sapiens (Human)", 
">P06732ups|KCRM_HUMAN_UPS Creatine kinase M-type (Chain 1-381) - Homo sapiens (Human)"
), A1 = c(28.8484762528371, 28.5593417132562, 29.8009889375404, 
30.236308349045, 26.8634920403497, 29.2127142763584, 27.8652758954981, 
29.6272988793104, 30.3481913968282, 29.2592834274184), A2 = c(28.6976154934535, 
28.5259670144823, 29.7664700243508, 30.1817239029611, 26.8135256143612, 
29.0836758932669, 27.7993403923308, 29.5523797986127, 30.2273068960395, 
29.1884400603861), A3 = c(28.6907247615967, 28.4075268367718, 
29.945806961862, 30.1906689352863, 26.8775178221577, 29.1529637057232, 
27.848922423631, 29.5442401945692, 30.178874360984, 29.1770925073523
), B1 = c(21.4759289585346, 21.8116726154379, 21.0287288705184, 
21.6755517309807, 22.3711955096869, 20.5319556862522, 20.7294049265441, 
21.2970281676679, 20.0495639422741, 19.6827358066659), B2 = c(21.2438429926591, 
21.5540102900844, 21.1130287737854, 21.2063689577213, 22.5489466679164, 
20.622315219241, 20.8206334182144, 20.5006547822998, 20.1378005068828, 
19.7693579893069), B3 = c(21.9123487685507, 22.2790808524578, 
21.1321045998028, 22.3747058136723, 22.3639090145369, 20.6142641267144, 
20.8125049009101, 22.3532438314135, 20.1299385722048, 19.7616398971008
)), row.names = c(NA, 10L), class = "data.frame")

I want to use the first column as the row names of the data frame. I have tried several approaches, such as:

cts <- exp.data
cts2 <- cts[,-1]
exp.data <- cbind (cts2, cts)
rownames(cts2) <- cts[,1]

rownames(exp.data) <- exp.data$Fasta.headers

and some other ways as well, but every time I get the same error:

Error in `.rowNamesDF<-`(x, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘>O00762ups|UBE2C_HUMAN_UPS Ubiquitin-conjugating enzyme E2 C (Chain 1-179, N-terminal His tag)- Homo sapiens (Human)’, ‘>O76070ups|SYUG_HUMAN_UPS Gamma-synuclein (Chain 1-127) - Homo sapiens (Human)’, ‘>P00167ups|CYB5_HUMAN_UPS Cytochrome b5 (Chain 1-134, N-terminal His tag) - Homo sapiens (Human)’, ‘>P00441ups|SODC_HUMAN_UPS Superoxide dismutase [Cu-Zn] (Chain 2-154) - Homo sapiens (Human)’, ‘>P00709ups|LALBA_HUMAN_UPS Alpha-lactalbumin (Chain 20-142) - Homo sapiens (Human)’, ‘>P00915ups|CAH1_HUMAN_UPS Carbonic anhydrase 1 (Chain 2-261) - Homo sapiens (Human)’, ‘>P00918ups|CAH2_HUMAN_UPS Carbonic anhydrase 2 (Chain 2-260) - Homo sapiens (Human)’, ‘>P01008ups|ANT3_HUMAN_UPS Antithrombin-III (Chain 33-464) - Homo sapiens (Human)’, ‘>P01031ups|CO5_HUMAN_UPS Complement C5 (C5a anaphylatoxin) (Chain 678-751) - Homo sapiens (Human)’, ‘>P01112ups|RASH_HUMAN_UPS GTPase HRas (Chain 1-189) - Homo sapiens (Human) [... truncated]

Could someone suggest a solution for this issue?

colname R rowname data-frame • 2.2k views

ADD COMMENT • link updated 2.9 years ago by LDT ▴ 340 • written 2.9 years ago by KABILAN ▴ 130

You cannot set duplicate row names on a data frame. You should identify why you have duplicate row names (in your example there is no error, because all Fasta.headers values are unique) and decide which information to keep.
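One way to find the offending headers is sketched below in base R. The toy data frame `df` and its values are made up for illustration; on the real data you would apply the same calls to the full table.

```r
# Toy data frame (hypothetical values) with one repeated header
df <- data.frame(
  Fasta.headers = c(">P1 protein one", ">P2 protein two", ">P1 protein one"),
  A1 = c(28.8, 28.5, 29.8),
  stringsAsFactors = FALSE
)

# Number of rows whose header already appeared in an earlier row
n_dup <- sum(duplicated(df$Fasta.headers))

# Every row that shares its header with at least one other row
dup_headers <- df$Fasta.headers[duplicated(df$Fasta.headers)]
dup_rows <- df[df$Fasta.headers %in% dup_headers, ]

# How often each header occurs
header_counts <- table(df$Fasta.headers)
```

Looking at `dup_rows` side by side usually shows whether the repeats are true duplicates that can be dropped or distinct measurements that need separate labels.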

ADD REPLY • link 2.9 years ago by Basti ★ 2.1k

Yes @Basti, I posted only 10 rows of my dataset. The whole dataset contains duplicate values in Fasta.headers. I found an answer to this issue and have posted it below.

ADD REPLY • link 2.9 years ago by KABILAN ▴ 130

score 0 · Answer 1 · 2022-07-28

2.9 years ago

KABILAN ▴ 130

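This code works for this problem:

# exp.data1 is assumed to be the full data frame, including the Fasta.headers column
exp.data <- exp.data1[, -1]
# make.names(unique = TRUE) appends .1, .2, ... to repeated headers
rownames(exp.data) <- make.names(exp.data1$Fasta.headers, unique = TRUE)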
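Note that `make.names()` does more than de-duplicate: it also turns each header into a syntactically valid R name, so characters such as `>`, `|` and spaces are replaced. If the original header text should be preserved, base R's `make.unique()` may be a better fit. A small comparison, using made-up headers:

```r
# Hypothetical headers; the first two are identical
headers <- c(">P1|A_HUMAN protein", ">P1|A_HUMAN protein", ">P2|B_HUMAN protein")

# make.names(): unique, but invalid characters are rewritten
valid_names <- make.names(headers, unique = TRUE)

# make.unique(): original text kept, repeats get a .1, .2, ... suffix
unique_names <- make.unique(headers)

print(valid_names)
print(unique_names)
```

Either result can be assigned with `rownames(exp.data) <- ...`. The choice depends on whether downstream code needs the untouched FASTA header.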

score 0 · Answer 2 · 2022-07-28

2.9 years ago

LDT ▴ 340

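You can also use the tidyverse:

library(tidyverse)  # loads tibble, which provides the row-name helpers

# Assign the result; otherwise exp.data stays unchanged
exp.data <- exp.data |>
  remove_rownames() |>
  column_to_rownames(var = "Fasta.headers")
head(exp.data)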
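As far as I can tell, `column_to_rownames()` sets ordinary data.frame row names, so it should stop with the same duplicate error on the full dataset. A minimal sketch of one way around that, assuming the tibble package is installed and using a made-up `toy` table, is to de-duplicate the column first:

```r
library(tibble)

# Hypothetical table with a repeated identifier
toy <- data.frame(
  id = c("a", "a", "b"),
  x = 1:3,
  stringsAsFactors = FALSE
)

# Make the identifiers unique before moving them into the row names
toy$id <- make.unique(toy$id)
res <- column_to_rownames(toy, var = "id")

rownames(res)
```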

I am not sure, though, whether you want to do this only for the first row or for all the rows?

Yes, @LDT, only for the first row.

ADD REPLY • link 2.9 years ago by KABILAN ▴ 130

...and should the names of the rest of the rows be empty, or the same as the name of the first row?

ADD REPLY • link 2.9 years ago by LDT ▴ 340

Not empty. The same single row name for all values.

ADD REPLY • link 2.9 years ago by KABILAN ▴ 130

R data.frames do not accept duplicate row names.
The only workaround that I have found is to convert your data.frame to a matrix and then set the row names. I guess this format will not work for you (?). If you post where you want to input the data, we can find another workaround.
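A minimal sketch of the matrix workaround, with a made-up data frame `df` standing in for the real table:

```r
# Hypothetical data with a repeated header
df <- data.frame(
  Fasta.headers = c(">P1 protein one", ">P1 protein one", ">P2 protein two"),
  A1 = c(28.8, 28.5, 29.8),
  B1 = c(21.4, 21.8, 21.0),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns so the matrix stays numeric
m <- as.matrix(df[, -1])

# Unlike data frames, matrices accept repeated row names
rownames(m) <- df$Fasta.headers

# Indexing by name returns only the first match, so select repeats logically
p1_rows <- m[rownames(m) == ">P1 protein one", , drop = FALSE]
```

This only suits tables whose remaining columns are all of one type, since a matrix cannot mix numeric and character columns.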

ADD REPLY • link 2.9 years ago by LDT ▴ 340