Hi,
I used RSEM to produce a TPM matrix for 100 RNA-seq reads (row names = genes, columns = cell numbers). I get the following error. Can anyone help me, please?
> all <-read.table(file="tpm_matrix.xls",header=T)
Error in read.table(file = "tpm_matrix.xls", header = T) :
duplicate 'row.names' are not allowed
I changed the format of row names from "gene10000_Ermap" to "Ermap".
If your file is actually a plain text file rather than a real MS Excel .xls file, you can use
cut -f 1 tpm_matrix.xls | sort | uniq -c | sort -k1,1nr
to find the duplicated row names. If there are only a few, you can curate them manually; if there are a lot, you can use e.g. awk to add a consecutive number to each name.
This helped, thank you so much!
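For anyone finding this later, the awk suggestion above could be sketched roughly like this. It assumes the file is tab-separated text with one header line and the gene names in the first column. The function name uniquify_rownames and the output file name are made up for illustration; they are not part of RSEM or R.

```shell
# Sketch only: make the first-column names unique by appending a running number.
# Assumes tab-separated input on stdin with a single header line.
# uniquify_rownames is a hypothetical helper name.
uniquify_rownames() {
    awk 'BEGIN { FS = OFS = "\t" }
         NR == 1 { print; next }        # pass the header line through unchanged
         {
             n = ++seen[$1]             # how often this name has occurred so far
             if (n > 1) $1 = $1 "_" n   # 2nd, 3rd, ... copy becomes name_2, name_3, ...
             print
         }'
}

# Possible usage (output file name is an assumption):
#   uniquify_rownames < tpm_matrix.xls > tpm_matrix_unique.txt
```

The first occurrence of each name is left as it is, so only the repeats get a suffix. If the matrix already contains a name such as "Ermap_2", a different separator would be needed to avoid a new clash.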