Merge common elements in R
21 months ago
sansan96 ▴ 130

Hello everyone,

I have a list of differentially expressed genes (list1) and a second list that contains the gene IDs and gene names (list2). I want to annotate the genes in list1 with the names from list2. How can I do this?

I would appreciate your help, thanks.

List1

# read.table() already returns a data frame, so no as.data.frame() call is needed
Diff_lgn_vs_WT <- read.table("lgn_vs_WT.txt", header = TRUE, sep = "\t")
head(Diff_lgn_vs_WT)

                X baseMean log2FoldChange     lfcSE      stat      pvalue        padj
1 Zm00001eb000090 83.74241      -1.073630 0.2899722 -3.702525 0.000213464 0.001506448
2 Zm00001eb000270 45.96920      -1.141293 0.2798026 -4.078921 0.000045200 0.000404480
3 Zm00001eb000280 10.95653      -1.537141 0.5425272 -2.833298 0.004607039 0.018931911
4 Zm00001eb000370 83.58504       1.739356 0.3392983  5.126333 0.000000295 0.000005260
5 Zm00001eb000540 41.87663      -1.064135 0.3750895 -2.837017 0.004553724 0.018768199
6 Zm00001eb000740 97.93517      -1.001488 0.2542629 -3.938789 0.000081900 0.000667588

List2

anot_MaizMine <- read.csv("anot_MaizMine_limpia.csv", header = TRUE, sep = ",")
head(anot_MaizMine)

                X                                              NAME
1 Zm00001eb000010           transcription termination factor MTERF2
2 Zm00001eb000020           E3 ubiquitin ligase complex SCF subunit
3 Zm00001eb000170                     MSC domain-containing protein
4 Zm00001eb000180                  Ribose-phosphate diphosphokinase
5 Zm00001eb000190 tRNA/rRNA methyltransferase (SpoU) family protein
6 Zm00001eb000200                                           Oleosin

Have you looked at the merge() function?


If I understood correctly, you want to merge list1 and list2 by gene. If so, make sure the gene identifiers are stored in a column that has the same name in both data frames (for example, "gene_name"), then run:

final_list <- merge(list1, list2)

By default, merge() joins the two data frames on the columns whose names they share.
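
To make the merge() suggestion concrete, here is a minimal sketch on toy data. The data frames below are made-up stand-ins for the two tables in the question (the fold-change values are invented); only the column names X and NAME come from the original post. Note that merge() drops unmatched rows unless all.x = TRUE is set, and that it sorts the result by the key column.

```r
# Toy stand-ins for the two tables in the question (numeric values are made up)
list1 <- data.frame(
  X = c("Zm00001eb000020", "Zm00001eb000090", "Zm00001eb000180"),
  log2FoldChange = c(-1.1, 1.7, -1.0),
  stringsAsFactors = FALSE
)
list2 <- data.frame(
  X = c("Zm00001eb000010", "Zm00001eb000020", "Zm00001eb000180"),
  NAME = c("transcription termination factor MTERF2",
           "E3 ubiquitin ligase complex SCF subunit",
           "Ribose-phosphate diphosphokinase"),
  stringsAsFactors = FALSE
)

# by = "X" names the key column explicitly.
# all.x = TRUE keeps every row of list1 and fills NAME with NA
# when the ID has no entry in list2.
final_list <- merge(list1, list2, by = "X", all.x = TRUE)
print(final_list)
```

Without all.x = TRUE, the gene that is missing from list2 would silently disappear from the result, which is worth checking for in a differential expression table.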


What I really want is to look up the IDs (column 1) from list1 in list2 and, where a match is found, return the name that list2 gives for that ID. list2 contains every gene ID with its description (NAME), whereas list1 contains only a small subset of genes whose names I don't know.


In that case, check out the match() function
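
A rough sketch of how match() could be used for this lookup, again on made-up toy data (the padj values and the three-row tables are invented for illustration; only the column names X and NAME are taken from the question). match() returns, for each ID in list1, its position in list2, or NA when the ID is absent.

```r
# Toy stand-ins for the two tables in the question (numeric values are made up)
list1 <- data.frame(
  X = c("Zm00001eb000180", "Zm00001eb000090", "Zm00001eb000020"),
  padj = c(0.001, 0.02, 0.0004),
  stringsAsFactors = FALSE
)
list2 <- data.frame(
  X = c("Zm00001eb000010", "Zm00001eb000020", "Zm00001eb000180"),
  NAME = c("transcription termination factor MTERF2",
           "E3 ubiquitin ligase complex SCF subunit",
           "Ribose-phosphate diphosphokinase"),
  stringsAsFactors = FALSE
)

# Position of each list1 ID within list2$X; NA where the ID is not found
idx <- match(list1$X, list2$X)

# Index list2$NAME with those positions to pull the names across.
# The row order of list1 is left unchanged.
list1$NAME <- list2$NAME[idx]
print(list1)
```

Unlike merge(), this keeps list1 in its original row order and only adds the one column you ask for.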


Thanks so much.

21 months ago
jv ★ 1.8k

dplyr join functions can help with this, e.g.,

Diff_lgn_vs_WT <- dplyr::left_join(Diff_lgn_vs_WT, anot_MaizMine, by = "X")

This will add the NAME column from anot_MaizMine to Diff_lgn_vs_WT wherever there is a match in column X. Rows without a match are kept, with NAME set to NA.

For more background on using join I like the following: https://statisticsglobe.com/r-dplyr-join-inner-left-right-full-semi-anti
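
As a small self-contained illustration of the join, here is a sketch on toy data. It assumes the dplyr package is installed; the data frames diff_genes and annotation are hypothetical stand-ins for Diff_lgn_vs_WT and anot_MaizMine, with invented values. The anti_join() call is an optional extra check, not part of the answer above, for listing IDs that have no annotation.

```r
library(dplyr)

# Hypothetical stand-ins for Diff_lgn_vs_WT and anot_MaizMine (values are made up)
diff_genes <- data.frame(
  X = c("Zm00001eb000180", "Zm00001eb000090", "Zm00001eb000020"),
  log2FoldChange = c(-1.0, 1.7, -1.1),
  stringsAsFactors = FALSE
)
annotation <- data.frame(
  X = c("Zm00001eb000010", "Zm00001eb000020", "Zm00001eb000180"),
  NAME = c("transcription termination factor MTERF2",
           "E3 ubiquitin ligase complex SCF subunit",
           "Ribose-phosphate diphosphokinase"),
  stringsAsFactors = FALSE
)

# Keep every row of diff_genes, in its original order, and add NAME where X matches
annotated <- left_join(diff_genes, annotation, by = "X")

# Rows of diff_genes whose ID does not appear in annotation at all
unmatched <- anti_join(diff_genes, annotation, by = "X")

print(annotated)
print(unmatched)
```

If the annotation table contained the same ID more than once, left_join() would return one row per match, so it may be worth checking for duplicated IDs first.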


Great, that did work for me. Thank you very much for your valuable help.
