How to create a Venn diagram from a data frame and get the list of common genes for each combination?
5.1 years ago
WUSCHEL ▴ 810

I have a data frame of differentially expressed genes (DEG). How can I create a Venn diagram from it and get the list of common genes expressed in each combination of genotypes?

Note: the data frame has NA for some genotypes in some genes. If all the genotypes are NA for a particular gene, that row should be ignored.

Gene        Genotype1    Genotype2    Genotype3    Genotype4
AT1G17400   NA           NA           NA           NA
AT1G09420   NA           0.000800188  0.000116452  0.004017191
AT1G50930   NA           NA           NA           NA
AT1G65960   NA           NA           NA           NA
AT1G09400   NA           NA           NA           NA
AT1G09415   NA           NA           NA           NA
AT1G74730   NA           NA           NA           NA
AT1G75100   0.001639398  0.001578892  6.92E-05     NA
AT1G75100   0.001639398  0.001578892  6.92E-05     NA
AT1G75240   NA           5.60E-05     0.000235329  0.000162115
AT1G14920   NA           NA           NA           NA
AT1G14920   NA           NA           NA           NA
AT1G65510   NA           NA           NA           NA
AT1G75250   NA           NA           NA           NA
AT1G54410   NA           0.000113869  1.25E-05     NA
RNA-Seq R • 9.4k views
5.1 years ago
dsull ★ 6.9k

Question is very vague. What do these numbers mean? What do you consider expressed? Is anything that is not NA considered expressed?

Take a look at the Venn function here: https://www.rdocumentation.org/packages/gplots/versions/3.0.1.1/topics/venn

It gives you an example on how to create Venn diagrams.

I'll go off the assumption that everything non-NA is considered expressed. Say you have everything stored in a dataframe named data. You need to supply lists of non-NA genes that belong to each of your four groups: Genotype1, Genotype2, Genotype3, Genotype4:

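library(gplots)  # provides venn()
# For each genotype, keep the genes with a non-NA value; unique() drops repeated gene IDs
Genotype1 <- unique(data[!is.na(data$Genotype1), "Gene"])
Genotype2 <- unique(data[!is.na(data$Genotype2), "Gene"])
Genotype3 <- unique(data[!is.na(data$Genotype3), "Gene"])
Genotype4 <- unique(data[!is.na(data$Genotype4), "Gene"])
# venn() takes a named list of sets, one per circle
input <- list(Genotype1=Genotype1, Genotype2=Genotype2, Genotype3=Genotype3, Genotype4=Genotype4)
venn(input)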
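Before plotting, it can help to sanity-check the sets with base R alone. The sketch below is only an illustration: the data frame is a made-up toy (gene IDs `g1` to `g4` and all values are hypothetical, not the poster's table), and it assumes, as above, that any non-NA value counts as expressed.

```r
# Toy data (illustrative values only, not the real table from the question).
data <- data.frame(
  Gene      = c("g1", "g2", "g3", "g4", "g4"),
  Genotype1 = c(NA, 0.01, 0.02, NA,   NA),
  Genotype2 = c(NA, 0.03, NA,   0.04, 0.04),
  Genotype3 = c(NA, 0.05, 0.06, 0.07, 0.07),
  Genotype4 = c(NA, 0.08, NA,   NA,   NA),
  stringsAsFactors = FALSE
)

# One vector of unique, non-NA genes per genotype column (column 1 is Gene).
input <- lapply(data[-1], function(col) unique(data$Gene[!is.na(col)]))

# Number of genes in each set; g1 is NA everywhere, so it lands in no set.
set_sizes <- lengths(input)

# Genes shared by all four genotypes (the centre of the Venn diagram).
common_all <- Reduce(intersect, input)

# Genes shared by two chosen genotypes.
common_2_3 <- intersect(input$Genotype2, input$Genotype3)

print(set_sizes)
print(common_all)
print(common_2_3)
```

Because a gene that is NA in every genotype never enters any of the sets, the all-NA rows are ignored without a separate filtering step.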

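The data preparation for venn could be shortened to a single call (note the ! so that the non-NA genes are kept, and unique() to drop repeated gene IDs):

input <- lapply(data[ -1 ], function(i) unique(data$Gene[ !is.na(i) ]))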

This is the answer I wanted :) Thank you, dsull. BTW, how can I find which gene IDs went into each common category, and how can I export them?


Well, if you want to know what genes belong to a certain category, say Genotype 1, you can simply print them out via:

print(Genotype1)

If you want to quickly see something like the genes that belong to Genotype1 and Genotype3 but don't belong to Genotype2 and Genotype4, you can again subset your dataframe:

data[!is.na(data$Genotype1) & !is.na(data$Genotype3) & is.na(data$Genotype2) & is.na(data$Genotype4),"Gene"]

Basically, the exclamation point is the negation symbol so !is.na means the genes that are not-NA whereas is.na means the genes that are NA. The ampersand (&) means AND. Look into subsetting dataframes in R for more details.
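To list every combination at once and export the result, the same is.na logic can be applied to all genotype columns together. This is a rough base-R sketch, not the only way to do it: the toy data frame, the `:` separator in the combination labels, and the output file name `venn_gene_lists.csv` are all assumptions made for illustration.

```r
# Toy data in the same layout as the question (values are illustrative).
data <- data.frame(
  Gene      = c("AT1G17400", "AT1G09420", "AT1G75100", "AT1G75240", "AT1G54410"),
  Genotype1 = c(NA, NA,      0.0016,  NA,      NA),
  Genotype2 = c(NA, 0.0008,  0.0016,  5.6e-05, 0.00011),
  Genotype3 = c(NA, 0.00012, 6.9e-05, 0.00024, 1.3e-05),
  Genotype4 = c(NA, 0.0040,  NA,      0.00016, NA),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a gene has a value (assumed to mean "expressed").
present <- !is.na(data[-1])

# Ignore genes that are NA in every genotype.
keep    <- rowSums(present) > 0
data    <- data[keep, ]
present <- present[keep, , drop = FALSE]

# Label each gene with the exact combination of genotypes it appears in.
combo <- apply(present, 1, function(p) paste(colnames(present)[p], collapse = ":"))

# One vector of unique gene IDs per combination (one Venn region each).
genes_by_combo <- lapply(split(data$Gene, combo), unique)
print(genes_by_combo)

# Export as a two-column table; the path is hypothetical, change it as needed.
out      <- unique(data.frame(Combination = combo, Gene = data$Gene))
out_file <- file.path(tempdir(), "venn_gene_lists.csv")
write.csv(out, out_file, row.names = FALSE)
```

Each element of `genes_by_combo` corresponds to one exclusive region of the diagram, for example the genes found in Genotype2 and Genotype3 only. As far as I know, gplots can also report these regions itself: the object returned by `venn(input)` should carry an `"intersections"` attribute, retrievable with `attr(venn(input, show.plot = FALSE), "intersections")`, but check that against your installed version.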


Thanks a heap, dsull :) Appreciate.
