Manhattan Plot
10 months ago
Emily • 0

Hello,

I am trying to figure out why my Manhattan plot has many horizontal lines. My data include 36 individuals from 3 populations, with ~80,000 SNPs in total. To create the plot I used these commands in RStudio:

library(pcadapt)
vcf.path <- "merge_filtered chrom.vcf"
meta.path <- "meta4.csv"
genos <- read.pcadapt(vcf.path, type = "vcf")
x <- pcadapt(input = genos, K = 2)
plot(x, option = "manhattan", plt.pkg = "ggplot", snp.info = TRUE)

I also made another Manhattan plot using qqman, and the same horizontal banding occurred.

The only explanation I can think of is too much genetic similarity among the individuals.

I've seen other plots that have a somewhat similar pattern, but not as severe as mine.

qqman pcadapt manhattan GWAS • 1.0k views

I am new to Manhattan plots too, so someone more capable should be able to answer this, but the x-axis typically represents genomic position (e.g. chromosomes 1-22 for human data).


Moving this to a comment since it does not answer the question asked in the original post, which is about "banding" seen along X-axis.


Something seems a little odd here. What is the distribution of your p-values? -log10(0.05) is 1.3 and -log10(0.001) is 3, so from the picture it looks like the majority of your SNPs are significant, though it's hard to tell at this scale.
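A quick way to look at that distribution is sketched below. It runs on simulated stand-in values; the vector name `pvals` is made up for illustration, and with pcadapt it would presumably be filled from `x$pvalues`, which contains NA for SNPs that were filtered out.

```r
# Stand-in p-values: a handful of repeated values mimics the banding.
# With pcadapt, `pvals <- x$pvalues` would replace this simulated vector.
set.seed(1)
pvals <- sample(c(1e-8, 1e-5, 1e-3, 0.05, 0.5), size = 1000, replace = TRUE)
pvals[sample(1000, 20)] <- NA   # mimic SNPs dropped by a MAF filter

ok <- !is.na(pvals)
print(summary(pvals[ok]))

# Few distinct values relative to the number of SNPs means horizontal bands.
n_distinct <- length(unique(pvals[ok]))
cat("distinct p-values:", n_distinct, "out of", sum(ok), "SNPs\n")

# Share of SNPs below two common thresholds
frac_sig <- c(p05 = mean(pvals[ok] < 0.05), p001 = mean(pvals[ok] < 0.001))
print(frac_sig)

# Histogram on the same scale as the Manhattan plot's y-axis
hist(-log10(pvals[ok]), breaks = 50,
     xlab = "-log10(p)", main = "P-value distribution")
```

If the number of distinct values is tiny compared with the number of SNPs, the bands are a property of the p-values themselves rather than of the plotting package.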

I also don't understand what your x-axis is meant to represent. Did you number the SNPs with MAF > 0.05 from 1 to ~80,000? How are they ordered? I agree with Shane: although it is off topic, Manhattan plots typically have genomic position on the x-axis.
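If the x-axis is currently just the SNP index, one option is to pull chromosome and position from the VCF and plot against those. The sketch below uses a tiny made-up VCF and made-up p-values so it runs on its own. It assumes the p-value vector lines up record by record with the VCF, which should be checked, since a converter may drop or reorder markers.

```r
# Tiny stand-in VCF written to a temp file so the sketch is self-contained.
# In practice `vcf_file` would be the path to the real VCF.
vcf_file <- tempfile(fileext = ".vcf")
writeLines(c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "1\t1500\tsnp1\tA\tG\t.\tPASS\t.",
  "1\t300\tsnp2\tC\tT\t.\tPASS\t.",
  "2\t900\tsnp3\tG\tA\t.\tPASS\t.",
  "2\t100\tsnp4\tT\tC\t.\tPASS\t."
), vcf_file)

# Header lines start with '#', so treating '#' as a comment skips them.
vcf <- read.table(vcf_file, comment.char = "#", sep = "\t",
                  stringsAsFactors = FALSE)

# Hypothetical p-values, ASSUMED to be in the same order as the VCF records.
pvals <- c(0.2, 1e-6, 0.04, 0.7)
stopifnot(length(pvals) == nrow(vcf))

# Column names chosen to match qqman's defaults (CHR, BP, P, SNP).
df <- data.frame(CHR = vcf$V1, BP = vcf$V2, SNP = vcf$V3, P = pvals)
df <- df[!is.na(df$P), ]
df <- df[order(df$CHR, df$BP), ]

# Cumulative position: shift each chromosome past the ones before it.
chr_len <- tapply(df$BP, df$CHR, max)
offset  <- c(0, cumsum(as.numeric(chr_len))[-length(chr_len)])
names(offset) <- names(chr_len)
df$x <- df$BP + offset[as.character(df$CHR)]

plot(df$x, -log10(df$P), col = df$CHR, pch = 20,
     xlab = "Genomic position (cumulative bp)", ylab = "-log10(p)")
```

A data frame shaped like `df` can also be passed to `qqman::manhattan(df, chr = "CHR", bp = "BP", p = "P", snp = "SNP")`, provided the chromosome column is numeric.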

My guess is that the banding has something to do with rounding at such small numbers, or with how the p-values are calculated (i.e., with this number of data points a p-value of 10^-60 may be attainable whereas 10^-61 is not, because the test statistic can only take certain values). I've seen similar structure in Manhattan-style plots of lots of genomic data.
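Both numerical effects are easy to reproduce. The sketch below uses made-up p-values and a chi-square tail probability chosen purely for illustration (not necessarily the statistic pcadapt uses). It shows how rounding collapses many values onto a few levels, and how extremely small p-values underflow to zero unless they are computed on the log scale.

```r
set.seed(42)
p <- runif(5000)^4                 # made-up continuous p-values, skewed toward 0

# Rounding (e.g. when p-values are written to a file with few digits)
# collapses many SNPs onto the same value, which shows up as horizontal bands.
p_rounded <- signif(p, 1)
print(c(before = length(unique(p)), after = length(unique(p_rounded))))

# Separately, p-values below the smallest representable double become exactly 0,
# so -log10(p) is Inf and those points cannot be placed on the plot.
tiny   <- pchisq(2000, df = 2, lower.tail = FALSE)   # underflows to 0
stable <- -pchisq(2000, df = 2, lower.tail = FALSE, log.p = TRUE) / log(10)
print(c(direct = -log10(tiny), log_scale = stable))
```

Comparing `length(unique(...))` before and after any export or import step is a cheap way to see whether rounding was introduced somewhere along the way.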


Those horizontal lines basically mean that your significant p-values take only a few distinct values. With a small sample size and a discrete variable such as a binary outcome, the test statistic can only take a limited number of values, so many SNPs end up with exactly the same p-value.
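A small simulation makes this concrete. It is only an illustration under assumed settings (two groups of 12 individuals, Fisher's exact test on allele counts), not the statistic pcadapt computes, and every name and number in it is made up.

```r
set.seed(7)
n_snps <- 2000
n_per_group <- 12          # individuals per group, made up for illustration

# Alternate-allele counts out of 2 * n_per_group chromosomes in each group
alt_a <- rbinom(n_snps, size = 2 * n_per_group, prob = 0.2)
alt_b <- rbinom(n_snps, size = 2 * n_per_group, prob = 0.2)

# Fisher's exact test per SNP on the 2x2 table of allele counts
pvals <- mapply(function(a, b) {
  tab <- matrix(c(a, 2 * n_per_group - a, b, 2 * n_per_group - b), nrow = 2)
  fisher.test(tab)$p.value
}, alt_a, alt_b)

# signif() guards against floating-point noise when counting distinct values
n_distinct <- length(unique(signif(pvals, 10)))
cat(n_distinct, "distinct p-values across", n_snps, "SNPs\n")

# Plotted against SNP index, the repeated values appear as horizontal bands.
plot(-log10(pvals), pch = 20, xlab = "SNP index", ylab = "-log10(p)")
```

Only a limited set of count tables is possible with so few chromosomes, so the number of distinct p-values stays far below the number of SNPs regardless of how many markers are tested.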

There might be other issues in your data as well, such as inflation of the test statistics.
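Inflation can be checked with a genomic inflation factor and a QQ plot. The sketch below uses simulated p-values with a known inflation of 1.5 and assumes 1-df chi-square statistics. Note that pcadapt, as far as I recall, already rescales its statistics and reports the factor it used as `x$gif`, so that value is worth inspecting too.

```r
set.seed(3)
# Made-up p-values from inflated 1-df chi-square statistics (true lambda = 1.5)
chisq_obs <- 1.5 * rchisq(10000, df = 1)
pvals <- pchisq(chisq_obs, df = 1, lower.tail = FALSE)

# Genomic inflation factor: median observed statistic over its expected median
chisq_from_p <- qchisq(pvals, df = 1, lower.tail = FALSE)
lambda <- median(chisq_from_p, na.rm = TRUE) / qchisq(0.5, df = 1)
cat("lambda =", round(lambda, 3), "\n")

# QQ plot of observed against expected -log10(p) under the null
obs  <- sort(-log10(pvals))
expd <- sort(-log10(ppoints(length(pvals))))
plot(expd, obs, pch = 20,
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
abline(0, 1, col = "red")
```

A lambda close to 1 suggests little inflation, while values well above 1 point to confounding such as unaccounted population structure.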
