Question

Extract a subset of a table

0


8.9 years ago

Bioiris ▴ 10

Hi;

I have two tables: a large count table (A) and a second table with a list of genes (B). I want to extract from A the rows whose gene_id is present in the gene list (B).

table(A)
————————

gene_id       sample1  sample2  sample3 . ……..sample20
g1                23           0      1              3
g2                35           1      4              8
g3                4            6     70              0
g4                98          12      7              6
g5                60           2      8             12
.
.
.
g1907             13           0      0              0


table(B)
———————
gene_id
g1
g2
g3
g4
g5
.
.
.
g100

I need to get:

table(c)
————————

gene_id       sample1  sample2  sample3 . ……..sample20
g1                 23        0        1              3
g2                 35        1        4              8
g3                  4        6       70              0
g4                 98       12        7              6
g5                 60        2        8             12
.
.
.
g100                 1      100        5              0

If anyone can help me (I am working in R), thanks!

R • 1.9k views

ADD COMMENT • link updated 8.9 years ago by Devon Ryan 105k • written 8.9 years ago by Bioiris ▴ 10

score 3 · Answer 1 · 2016-09-07

3


8.9 years ago

Devon Ryan 105k

tableC = tableA[which(tableA$gene_id %in% tableB$gene_id), ]

ADD COMMENT • link 8.9 years ago by Devon Ryan 105k

0


Thank you, it works! But I lost all the counts: every gene count in all tissues == 0.

ADD REPLY • link 8.9 years ago by Bioiris ▴ 10

1


Could it be originating from the fact that gene_id columns might be factors and not characters? If that's the problem, the following might solve it.

tableA$gene_id = as.character(tableA$gene_id)

tableB$gene_id = as.character(tableB$gene_id)

tableC = tableA[which(tableA$gene_id %in% tableB$gene_id), ]

ADD REPLY • link 8.9 years ago by Noushin N ▴ 620
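
If the column types or stray whitespace in the IDs are suspected, a quick diagnostic along these lines might help. This is only a sketch on made-up toy data (the values, the trailing space in "g1 " and the helper names idsA, idsB and matched are assumptions for illustration, not the poster's real tables). It inspects the column types, normalises the IDs, and reports how many IDs actually match before subsetting.

```r
# Toy stand-ins for the two tables (hypothetical data, not the real counts).
tableA <- data.frame(
  gene_id = factor(c("g1", "g2", "g3", "g4")),  # deliberately a factor
  sample1 = c(23L, 35L, 4L, 98L),
  sample2 = c(0L, 1L, 6L, 12L)
)
# "g1 " carries a trailing space; "g9" is absent from tableA.
tableB <- data.frame(gene_id = c("g1 ", "g3", "g9"), stringsAsFactors = FALSE)

# Inspect the column types of both tables.
str(tableA)
str(tableB)

# Normalise the IDs: convert to character and strip surrounding whitespace.
idsA <- trimws(as.character(tableA$gene_id))
idsB <- trimws(as.character(tableB$gene_id))

# Logical vector marking the rows of tableA whose ID appears in tableB.
matched <- idsA %in% idsB
sum(matched)         # how many rows of tableA were found in tableB
setdiff(idsB, idsA)  # IDs listed in tableB that are missing from tableA

# Subset with the logical vector; the count columns are carried over unchanged.
tableC <- tableA[matched, ]
tableC
```

If sum(matched) is much smaller than the number of genes in table B, the IDs themselves differ between the two files, which is a separate issue from the counts being zero.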

0


Same problem: counts == 0.

ADD REPLY • link 8.9 years ago by Bioiris ▴ 10

0


Check whether that isn't in fact the correct result, then. The command is correct.

> tableA = data.frame(gene_id=as.character(sprintf("g%i", 1:10)), sample1=rnorm(10), sample2=rnorm(10))
> tableB = data.frame(gene_id=as.character(c("g2", "g4", "g10", "g1")))
> tableA[which(tableA$gene_id %in% tableB$gene_id), ]
   gene_id    sample1     sample2
1       g1 -1.7334490  0.09335744
2       g2  0.6558662 -0.93235797
4       g4 -1.4080928  0.78908129
10     g10 -0.1292306  1.59782719

ADD REPLY • link 8.9 years ago by Devon Ryan 105k
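
One way to check whether the zeros are genuinely the correct result is to look at the same genes in the original table before subsetting. The sketch below uses invented toy data in which the selected genes really do have zero counts; the variable names keep, totals and same_as_source are hypothetical.

```r
# Toy data (hypothetical): g2 and g4 have zero counts in every sample.
tableA <- data.frame(
  gene_id = c("g1", "g2", "g3", "g4"),
  sample1 = c(23, 0, 4, 0),
  sample2 = c(0, 0, 6, 0),
  stringsAsFactors = FALSE
)
tableB <- data.frame(gene_id = c("g2", "g4"), stringsAsFactors = FALSE)

# Same subsetting as in the answer, with the logical index kept for reuse.
keep <- tableA$gene_id %in% tableB$gene_id
tableC <- tableA[keep, ]

# Confirm the count columns are numeric rather than character or factor.
sapply(tableA[, -1], is.numeric)

# Row totals of the selected genes, computed on the ORIGINAL table.
totals <- rowSums(tableA[keep, -1])
totals

# Look each selected gene up again in tableA and compare the values.
same_as_source <- all(
  tableC$sample1 == tableA$sample1[match(tableC$gene_id, tableA$gene_id)]
)
same_as_source
```

If the totals computed on the original table are already zero, the subsetting has not lost anything and the listed genes simply have no counts in table A.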