Extract a subset of a table
8.2 years ago
Bioiris ▴ 10

Hi,

I have two tables: a large count table (A) and a list of genes (B). I want to extract from A the rows whose gene_id appears in the gene list (B).

Table A
-------

gene_id  sample1  sample2  sample3  ...  sample20
g1            23        0        1  ...         3
g2            35        1        4  ...         8
g3             4        6       70  ...         0
g4            98       12        7  ...         6
g5            60        2        8  ...        12
.
.
.
g1907         13        0        0  ...         0


Table B
-------
gene_id
g1
g2
g3
g4
g5
.
.
.
g100

I need to get:

Table C
-------

gene_id  sample1  sample2  sample3  ...  sample20
g1            23        0        1  ...         3
g2            35        1        4  ...         8
g3             4        6       70  ...         0
g4            98       12        7  ...         6
g5            60        2        8  ...        12
.
.
.
g100           1      100        5  ...         0

Can anyone help me? I am working in R. Thanks!

tableC = tableA[which(tableA$gene_id %in% tableB$gene_id), ]
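
To make the one-liner easier to follow, here is a minimal sketch on a toy version of the tables. The first two sample columns for g1 to g5 are taken from the question; the three-gene list in `tableB` and the name `tableC_merge` are made up for illustration. `%in%` returns a logical vector with one entry per row of `tableA`, so it can index the rows directly and `which()` is optional. `merge()` is shown only as an alternative that should select the same rows, sorted by `gene_id` by default.

```r
# Toy data: sample1/sample2 for g1..g5 as shown in the question.
tableA <- data.frame(
  gene_id = c("g1", "g2", "g3", "g4", "g5"),
  sample1 = c(23, 35, 4, 98, 60),
  sample2 = c(0, 1, 6, 12, 2),
  stringsAsFactors = FALSE
)
# Hypothetical gene list, chosen only for this example.
tableB <- data.frame(gene_id = c("g1", "g3", "g5"), stringsAsFactors = FALSE)

# TRUE for every row of tableA whose gene_id occurs in tableB.
keep <- tableA$gene_id %in% tableB$gene_id

# Logical indexing keeps the matching rows and all columns.
tableC <- tableA[keep, ]
print(tableC)

# Alternative: an inner join on gene_id.
tableC_merge <- merge(tableA, tableB, by = "gene_id")
print(tableC_merge)
```

Either way the count columns are copied unchanged from `tableA`; subsetting rows does not modify the values.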

Thank you, it works! But I lost all the counts: every gene now has a count of 0 in all tissues.


Could the problem be that the gene_id columns are factors rather than characters? If so, the following might solve it.

tableA$gene_id = as.character(tableA$gene_id)

tableB$gene_id = as.character(tableB$gene_id)

tableC = tableA[which(tableA$gene_id %in% tableB$gene_id), ]
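
A quick way to test this idea is to look at the column types before subsetting. The sketch below uses small made-up tables, not the poster's data. It inspects the types with `str()` and then converts the identifiers to character. The `trimws()` step is an extra assumption on my part: stray whitespace in the IDs is a common reason for identifiers not matching, but nothing in the thread confirms it applies here.

```r
# Toy data standing in for the real tables (values are illustrative).
tableA <- data.frame(
  gene_id = factor(c("g1", "g2", "g3", "g4")),
  sample1 = c(23, 35, 4, 98),
  sample2 = c(0, 1, 6, 12)
)
tableB <- data.frame(gene_id = factor(c("g2", "g4")))

# Inspect the column types: the sample columns should be numeric or
# integer, and gene_id shows up here as a factor.
str(tableA)
str(tableB)

# Convert the identifiers to character and strip surrounding whitespace
# (the whitespace step is an assumption, not something from the thread).
tableA$gene_id <- trimws(as.character(tableA$gene_id))
tableB$gene_id <- trimws(as.character(tableB$gene_id))

tableC <- tableA[tableA$gene_id %in% tableB$gene_id, ]
print(tableC)
```

If `str(tableA)` shows the sample columns as something other than numeric or integer, the problem is more likely in how the table was read in than in the subsetting step.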


Same problem: the counts are still 0.


Then check whether that is in fact the correct result. The command itself is correct:

> tableA = data.frame(gene_id=as.character(sprintf("g%i", 1:10)), sample1=rnorm(10), sample2=rnorm(10))
> tableB = data.frame(gene_id=as.character(c("g2", "g4", "g10", "g1")))
> tableA[which(tableA$gene_id %in% tableB$gene_id), ]
   gene_id    sample1     sample2
1       g1 -1.7334490  0.09335744
2       g2  0.6558662 -0.93235797
4       g4 -1.4080928  0.78908129
10     g10 -0.1292306  1.59782719
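
One possible way to check whether the zeros are real is to compare the subset against the original rows and count the zeros in the original table for the selected genes. This is only a sketch on simulated data; the names `keep`, `idx`, `same`, `zeros_per_gene` and `missing_ids` are made up for the example.

```r
set.seed(1)  # reproducible toy counts
tableA <- data.frame(
  gene_id = sprintf("g%i", 1:10),
  sample1 = rpois(10, lambda = 5),
  sample2 = rpois(10, lambda = 5),
  stringsAsFactors = FALSE
)
tableB <- data.frame(gene_id = c("g2", "g4", "g10", "g1"),
                     stringsAsFactors = FALSE)

keep <- tableA$gene_id %in% tableB$gene_id
tableC <- tableA[keep, ]

# Look the selected genes up independently with match() and confirm
# that the counts in tableC equal the counts in tableA.
idx <- match(tableC$gene_id, tableA$gene_id)
same <- all(tableC[, -1] == tableA[idx, -1])
print(same)

# Number of zero counts per selected gene in the ORIGINAL table.
# If these are all equal to the number of samples, the zeros are real.
zeros_per_gene <- rowSums(tableA[keep, -1] == 0)
names(zeros_per_gene) <- tableA$gene_id[keep]
print(zeros_per_gene)

# IDs in tableB that do not occur in tableA at all.
missing_ids <- setdiff(tableB$gene_id, tableA$gene_id)
print(missing_ids)
```

If `same` is TRUE and the selected genes already have only zeros in `tableA`, the subset is correct and the zeros come from the data rather than from the command.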