Heatmap when having duplicated genes and samples
3.1 years ago
BioQueen ▴ 30

Hi! I have done a transcription factor inference analysis and I now want to display the result in a heatmap.

I have 3 columns: source, condition and score, where source holds the gene names, condition holds the samples and score is the enrichment score (ES). The problem is that I have an enrichment score for each gene in each sample. Say I have the ES for gene1, gene2 and gene3 in sample1; I also have the ES for gene1, gene2 and gene3 in sample2.

So my problem is that I can't use the genes as rownames and the samples as colnames, because each gene and each sample appears more than once. Does anyone know how I can make a heatmap that displays the enrichment score with samples on the "x-axis" and genes on the "y-axis"?

Thanks!

heatmap duplicated-genes-samples • 1.2k views

Can you post the first few lines of your table and also your expected output table?


I only have one table, and it is that table I want to use for the heatmap visualisation. Here is an example; I don't know how to format it in this comment box, but I will try to make it understandable.

source  condition  score
gene1   sample1    0.001
gene2   sample1    0.0003
gene3   sample1    -0.0004
gene1   sample2    0.003
gene2   sample2    0.0001
gene3   sample2    -0.0005

Does this make it clearer?

3.1 years ago
kashiff007 ★ 1.9k

I guess you are using R. Load the R package data.table:

library(data.table)

Your data would look like:

>  df
        source  condition   score
1       gene1   sample1     0.4299397
2       gene2   sample1     0.4299397
3       gene3   sample1     0.4299397
4       gene1   sample2     0.2531551
5       gene2   sample2     0.2531551
6       gene3   sample2     0.2531551

Now use this command to "unmelt" your df, i.e. cast it from long to wide format:

dcast(df, source ~ condition, value.var = c("score")) 

The output would look like:

  source   sample1   sample2
1  gene1 0.4299397 0.2531551
2  gene2 0.4299397 0.2531551
3  gene3 0.4299397 0.2531551
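
To get from the wide table to the heatmap asked about in the question, something along these lines might work. This is only a sketch: the toy values are copied from the example in the comments above, the names `wide` and `mat` are made up for illustration, and base R `heatmap()` is just one option among several plotting functions.

```r
library(data.table)

# Toy long-format table using the example values from the question
df <- data.table(
  source    = c("gene1", "gene2", "gene3", "gene1", "gene2", "gene3"),
  condition = c("sample1", "sample1", "sample1", "sample2", "sample2", "sample2"),
  score     = c(0.001, 0.0003, -0.0004, 0.003, 0.0001, -0.0005)
)

# Long -> wide: one row per gene, one column per sample
wide <- dcast(df, source ~ condition, value.var = "score")

# Numeric matrix with the gene names moved into the rownames
mat <- as.matrix(wide, rownames = "source")

# Genes on the y-axis, samples on the x-axis;
# scale = "none" plots the raw scores instead of row-scaled values
heatmap(mat, scale = "none", margins = c(8, 8))
```

Note that if the same gene/sample pair occurs more than once in the long table, `dcast` falls back to counting the entries unless you pass an aggregation function such as `fun.aggregate = mean`.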

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work.



Thanks! That worked :)
