Manhattan plot for TWAS results
2
0
Entering edit mode
10 months ago
rheab1230 ▴ 140

Hello everyone,

I am using PrediXcan to do generate gene prediction model and use them to do association test between cad whole blood GWAS summary statistics and predicted gene expression matrix to identify genes that are associated with the phenotype.

My association file look like this:enter image description here

How can I generate a Manhattan plot for these genes and show that some gene which are above the cut off line are significantly associated with the phenotype.

Basically a plot like this:

enter image description here

dput(file[1:2,1:5])
structure(list(gene = c("ENSG00000150938.5", "ENSG00000171055.10"
), gene_name = c("CRIM1", "FEZ2"), zscore = c(4.1906976198774, 
-3.97075348660666), effect_size = c(0.738149909514208, -0.451272838952875
), pvalue = c(2.78098076298391e-05, 7.1645681794841e-05)), row.names = 1:2, class = "data.frame")

I tried to use qqman package for this but it requires SNPs information which I don't have in my data.

manhattan(file)

In manhattan(file) :
  No SNP column found. OK unless you're trying to highlight.

enter image description here

TWAS GWAS manhattan-plot • 1.1k views
ADD COMMENT
0
Entering edit mode

Not sure about R since I'm not a big fan of it.

Python is super great and flexible for building your own visualizations.

See if these 2 resources would be helpful.

  1. https://python-graph-gallery.com/manhattan-plot-with-matplotlib/
  2. https://github.com/ShujiaHuang/qmplot
ADD REPLY
3
Entering edit mode
10 months ago
dthorbur ★ 2.6k

The RGraphGallery is a pretty good website for templates of common plots. A manhattan plot is just a specific type of scatter plot, which you can generate with the R package ggplot2 or just use base. Generally, I would say learn how to use ggplot2 for all plots since you'll have more control than package specific functions (which are often wrappers for ggplot2 functions anyway).

The key difference between a GWAS and a TWAS is the latter only has transcript data rather than SNPs. So you could use a gff file or something similar to get gene coordinates on the x (i.e., start site or middle of gene) and the p-value on the y, though it would look rather sparse in comparison to a GWAS manhattan.

ADD COMMENT
2
Entering edit mode
10 months ago
Corentin ▴ 610

Hi,

The manhattan() function from qqman requires you to specify column names if they do not correspond to the default names (in your case the "SNP" column needs to be replaced by "gene"):

manhattan(file, SNP = "gene")

However, the function will also need to know where to plot the genes: you will need the "CHR" and "BP" columns as well. You can get them with a biomart query (cf: https://stackoverflow.com/questions/45928683/get-gene-location-from-gene-symbol-and-id)

However, as dthorbur mentioned, you can always plot it manually.

Another point to keep in mind, is that the default "suggestive" and "significant" p-value thresholds in the manhattan() function were set with SNPs in mind. They are designed to correct for multiple testing. So, depending on how many transcripts you are testing in your TWAS you might want to change these thresholds. I never did TWAS myself, maybe you can check other TWAS papers to check how they set their thresholds.

ADD COMMENT
0
Entering edit mode

Thank you , I am able to plot it

ADD REPLY

Login before adding your answer.

Traffic: 2106 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6