I am trying to sum the bp length of each category per genomic region in R. This is the data frame:
col1 col2 col3 col4 col5 col6
chr2 33739 34739 exon SINE 69
chr2 111204 112204 exon SINE 78
chr2 508422 509422 exon L1 152
chr3 701525 702525 intron LINE 84
chr3 701525 702525 intron LINE 112
chr3 863200 864200 UTR LINE 32
I want to sum the length (col6) of each category in col5 per genomic region in col4. So there are two grouping columns, col4 and col5, and the function to apply is the sum of col6.
I have tried this code using the "dplyr" package:
sum = df1 %>%
group_by(col4, col5) %>% summarise(df1, sum(col6))
The error is:
Error in quickdf(.data[names(cols)]) : length(rows) == 1 is not TRUE
I have also tried grouping by only one column, but I get the same error.
As an alternative, can I use a combination of rowSums() and unique() for this? If yes, what would the code be?
with R
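A minimal sketch of the grouped sum, assuming the data frame is called df1 and has the six columns shown in the question. The first part uses base R's aggregate(), so no packages are needed. The second part is the dplyr equivalent, written with explicit dplyr:: prefixes. The prefixes are there because quickdf is a plyr function, so the error in the question may come from plyr::summarise masking dplyr::summarise when plyr is loaded after dplyr; that is a guess about the session, not something the post confirms. The names totals, totals_dplyr and total_bp are made up for illustration.

```r
# Sample data copied from the question.
df1 <- data.frame(
  col1 = c("chr2", "chr2", "chr2", "chr3", "chr3", "chr3"),
  col2 = c(33739, 111204, 508422, 701525, 701525, 863200),
  col3 = c(34739, 112204, 509422, 702525, 702525, 864200),
  col4 = c("exon", "exon", "exon", "intron", "intron", "UTR"),
  col5 = c("SINE", "SINE", "L1", "LINE", "LINE", "LINE"),
  col6 = c(69, 78, 152, 84, 112, 32),
  stringsAsFactors = FALSE
)

# Base R: sum col6 within each combination of col4 and col5.
totals <- aggregate(col6 ~ col4 + col5, data = df1, FUN = sum)
print(totals)

# dplyr equivalent, run only if dplyr is installed.
# The dplyr:: prefix avoids picking up plyr::summarise by accident.
# Note that summarise() takes only the summary expression here,
# not the data frame again as in the original attempt.
if (requireNamespace("dplyr", quietly = TRUE)) {
  totals_dplyr <- dplyr::summarise(
    dplyr::group_by(df1, col4, col5),
    total_bp = sum(col6)
  )
  print(totals_dplyr)
}
```

rowSums() adds values across the columns of a single row, so it does not fit this task, where values from several rows have to be added within a group.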
with datamash:
I have this error