5.2 years ago
APJ
Hi,
I have a tibble which looks like
head(TPM_a0)
# A tibble: 6 x 3
depmap_id gene_name expression
<chr> <chr> <dbl>
1 ACH-000956 TSPAN6 2.65
2 ACH-000429 TSPAN6 3.85
3 ACH-000857 TSPAN6 5.63
4 ACH-000783 TSPAN6 2.25
5 ACH-000963 TSPAN6 5.11
6 ACH-000812 TSPAN6 4.81
I would like to convert it to a data frame in which each row is a gene_name and each column is a depmap_id.
I tried the spread() function in R:
TPM_a2 <- TPM_a0 %>% spread(depmap_id, expression)
But ended up with the following error. Any ideas?
Error: Each row of output must be identified by a unique combination of keys.
Keys are shared for 64932 rows:
* 917895, 1262407
* 509207, 566047
* 1202487, 1208311, 1230683
* 1044847, 1050811, 1052435, 1052519, 1208703, 1211419
* 202075, 869539
* 293075, 1460703
* 264907, 1588831
* 503411, 569127
* 1568195, 1618959
The error indicates that you have duplicate depmap_id entries for the same gene, i.e. the same combination of gene_name and depmap_id appears in more than one row. For example, gene TP53 may have several rows for depmap_id ACH-000840.
So when you try to spread your data frame, it does not know which value to put in the cell for gene TP53 and depmap_id ACH-000840. Each combination of keys (gene_name and depmap_id) needs to be unique. Check the rows at the row numbers listed in the error message to find out which combinations are not unique.
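One way to find the offending combinations and get past the error is sketched below. It assumes the dplyr and tidyr packages are loaded. The toy tibble and its expression values are made up for illustration, and collapsing duplicates by their mean is only an assumption; whether a mean, a maximum, or dropping rows is appropriate depends on why the duplicates exist in your data.

```r
library(dplyr)
library(tidyr)

# Toy data with one duplicated gene/sample pair (TP53, ACH-000840).
# The expression values are invented for this example.
toy <- tibble(
  depmap_id  = c("ACH-000840", "ACH-000840", "ACH-000956", "ACH-000956"),
  gene_name  = c("TP53", "TP53", "TP53", "TSPAN6"),
  expression = c(1.0, 3.0, 2.5, 2.65)
)

# 1. List every gene_name/depmap_id combination that occurs more than once.
dups <- toy %>%
  count(gene_name, depmap_id) %>%
  filter(n > 1)

# 2. Collapse duplicates to a single value per combination
#    (mean is an assumption here), then spread to wide format.
wide <- toy %>%
  group_by(gene_name, depmap_id) %>%
  summarise(expression = mean(expression), .groups = "drop") %>%
  spread(depmap_id, expression)
```

On your own data you would replace toy with TPM_a0. Inspecting dups first is worthwhile, because it shows whether the repeated rows carry identical values (in which case distinct() may be enough) or genuinely different measurements.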