How to separate a column into several columns in R
20 months ago
sata72 • 0

I have a column of data ("label"), shown below, that I want to split into several separate columns. In the example data, I want to separate the column "label" into "treat", "sex", "line" and "id". Can anyone help?

Thanks

  label          treat   sex    line    id
 sample10trf100  tr      f    100     10
 sample11trf100  tr      f    100     11
 sample12trf100  tr      f    100     12
 sample13trm104  tr      m    104     13
 sample14trf104  tr      f    104     14
 sample15ctf104  ct      f    104     15
 sample16ctf120  ct      f    120     16
 sample17ctm120  ct      m    120     17
 sample18ctf120  ct      f    120     18
 sample19ctm100  ct      m    100     19
 sample20ctf104  ct      f    104     20
R • 1.5k views
20 months ago
Joydeep • 0

You can use separate_wider_position() from the tidyr package if all your labels have the same width.

library(readr)
library(tidyr)

labels <- read_tsv("labels.txt")
# Field widths, in order: "sample" prefix, id, treatment, sex, line
pos <- c(samp = 6, id = 2, treat = 2, sex = 1, line = 3)
separate_wider_position(labels, cols = label, widths = pos)

This will give you the following 5 columns which you may rearrange any way you want:

# A tibble: 11 × 5
   samp   id    treat sex   line 
   <chr>  <chr> <chr> <chr> <chr>
 1 sample 10    tr    f     100  
 2 sample 11    tr    f     100  
 3 sample 12    tr    f     100  
 4 sample 13    tr    m     104  
 5 sample 14    tr    f     104  
 6 sample 15    ct    f     104  
 7 sample 16    ct    f     120  
 8 sample 17    ct    m     120  
 9 sample 18    ct    f     120  
10 sample 19    ct    m     100  
11 sample 20    ct    f     104  
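
If the fields are not always the same width (say, a three-digit id), a regex-based split may be more forgiving than fixed positions. The following is only a sketch: it assumes tidyr 1.3.0 or later, that every label starts with the literal text "sample", and that sex is coded as "f" or "m". The small `labels` tibble is made up here for illustration and stands in for the real data. The unnamed "sample" pattern is matched but dropped from the output, and `cols_remove = FALSE` keeps the original label column.

```r
library(tidyr)

# Illustrative stand-in for the real data (three of the labels from the question)
labels <- tibble::tibble(
  label = c("sample10trf100", "sample13trm104", "sample15ctf104")
)

out <- separate_wider_regex(
  labels,
  cols = label,
  patterns = c(
    "sample",            # unnamed: matched but not returned as a column
    id    = "[0-9]+",    # sample number, any number of digits
    treat = "[a-z]{2}",  # two-letter treatment code (tr / ct)
    sex   = "[fm]",      # single-letter sex code
    line  = "[0-9]+"     # line number
  ),
  cols_remove = FALSE    # keep the original label column as well
)

out
```

All new columns come back as character; wrap `id` and `line` in as.integer() if you need numbers.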
20 months ago
zx8754 12k

Looks like it is a fixed width file, use read.fwf:

read.fwf(file = textConnection("sample10trf100
sample11trf100
sample12trf100
sample13trm104
sample14trf104
sample15ctf104
sample16ctf120
sample17ctm120
sample18ctf120
sample19ctm100
sample20ctf104"),
widths = c(8,2,1,3),
col.names = c("sample", "treat", "sex", "line"))
#      sample treat sex line
# 1  sample10    tr   f  100
# 2  sample11    tr   f  100
# 3  sample12    tr   f  100
# 4  sample13    tr   m  104
# 5  sample14    tr   f  104
# 6  sample15    ct   f  104
# 7  sample16    ct   f  120
# 8  sample17    ct   m  120
# 9  sample18    ct   f  120
# 10 sample19    ct   m  100
# 11 sample20    ct   f  104

Great! Is it possible to have the label column alongside the generated columns in the output?


Yes, set the col.names argument; I have edited the post.
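
One possible way to get the original label next to the parsed fields with read.fwf is to parse the labels from a data frame column and bind the result back on. This is a rough sketch, not the only way to do it: the `labels` data frame is a made-up stand-in for the real data, and the widths assume the fixed layout from the question ("sample" + 2-digit id + 2-letter treatment + 1-letter sex + 3-digit line). A negative width tells read.fwf to skip that many characters.

```r
# Illustrative stand-in for the real data (three of the labels from the question)
labels <- data.frame(
  label = c("sample10trf100", "sample13trm104", "sample15ctf104"),
  stringsAsFactors = FALSE
)

# Parse the label strings as fixed-width records
parts <- read.fwf(
  file = textConnection(labels$label),
  widths = c(-6, 2, 2, 1, 3),  # -6 skips the "sample" prefix
  col.names = c("id", "treat", "sex", "line"),
  stringsAsFactors = FALSE
)

# Put the original label column alongside the generated columns
result <- cbind(labels, parts)
result
```

Here `id` and `line` are read as integers, which may or may not be what you want if they are really identifiers.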
