I have performed a splicing QTL (sQTL) analysis using LeafCutter and FastQTL and obtained the following output:
Phenotype_id Number_of_variants_tested MLE_Shape1 MLE_shape2 Dummy variant_id Distance nominal_p-value slope first_permutation_p-value Second_permutation_p-value
1:3913:3996:clu_7241_NA 46 0.779368 10.5604 475.195 snp_1_2321 -1593 2.55028E-06 -0.704681 0.00789921 0.00345113
Now I would like to plot the SNPs along with the intron coordinates to generate something like Figure 2A.
To do so, I have subset the above file to get the intron coordinates and SNP positions, and my input file now looks like this:
chr Intron_start Intron_end SNP_position Pvalue
1 3913 3996 2321 0.00345113
1 3913 4001 4313 0.0116419
1 7447 7564 7464 0.0160019
1 7450 7564 7465 0.0348276
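For reference, a minimal sketch of how such a subset could be derived in R. It assumes the phenotype ID has the form chr:start:end:cluster and the variant ID has the form snp_chr_position, as in the single output row shown above; the data frame and the perm_p column name are hypothetical and only serve the illustration.

```r
# One row copied from the FastQTL output shown above.
# The object name and the perm_p column name are assumptions.
fastqtl <- data.frame(
  Phenotype_id = "1:3913:3996:clu_7241_NA",
  variant_id   = "snp_1_2321",
  perm_p       = 0.00345113,  # second permutation p-value
  stringsAsFactors = FALSE
)

# Split "chr:start:end:cluster" into a character matrix, one row per intron.
pheno <- do.call(rbind, strsplit(fastqtl$Phenotype_id, ":", fixed = TRUE))

# Keep only the text after the last underscore of "snp_<chr>_<pos>".
snp_pos <- as.integer(sub("^.*_", "", fastqtl$variant_id))

introns <- data.frame(
  chr          = pheno[, 1],
  Intron_start = as.integer(pheno[, 2]),
  Intron_end   = as.integer(pheno[, 3]),
  SNP_position = snp_pos,
  Pvalue       = fastqtl$perm_p,
  stringsAsFactors = FALSE
)

print(introns)
```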
I am able to generate a plot of Intron_start against SNP_position with this code:
library(ggplot2)
# Read the tab-separated input file (it has a header row).
Trans <- read.delim("combine_benj_1000_final_5", header=TRUE, sep="\t")
theme_set(theme_bw()) # pre-set the bw theme.
# Scatterplot of intron start against SNP position, coloured by p-value
g <- ggplot(Trans, aes(SNP_position, Intron_start))
g + geom_point(aes(col=Pvalue)) +
  labs(y="Gene Position",
       x="SNP Position")
Generated Plot
However, I am not sure how to use both intron coordinates (start and end) when plotting against SNP_position. For an intron or gene, I think we need to specify both its start and its end location in order to relate it to the SNP position.
Secondly, how can I plot the results by chromosome?
Here is the input trial dataset.
Any help will be highly appreciated.
As I understand your question: I have performed a trans-eQTL analysis for small RNAs based on gene-associated SNPs, with each chromosome shown in the plot. I am providing the code; I hope you find it helpful.
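For the intron data in the question, one possible approach can be sketched as follows. This is an illustration only, not the eQTL code mentioned above. It assumes ggplot2 and the column names from the question's input file: each intron is drawn as a vertical segment from Intron_start to Intron_end at its SNP_position, and facet_wrap() gives one panel per chromosome. The data frame is built from the four example rows in the question, so only chromosome 1 appears here.

```r
library(ggplot2)

# The four example rows from the question's input file.
Trans <- data.frame(
  chr          = c(1, 1, 1, 1),
  Intron_start = c(3913, 3913, 7447, 7450),
  Intron_end   = c(3996, 4001, 7564, 7564),
  SNP_position = c(2321, 4313, 7464, 7465),
  Pvalue       = c(0.00345113, 0.0116419, 0.0160019, 0.0348276)
)

p <- ggplot(Trans, aes(x = SNP_position, colour = Pvalue)) +
  # One vertical segment per intron, spanning its start to its end.
  geom_segment(aes(xend = SNP_position, y = Intron_start, yend = Intron_end)) +
  # Mark the intron start so that very short introns remain visible.
  geom_point(aes(y = Intron_start)) +
  # One panel per chromosome; free scales because coordinates differ.
  facet_wrap(~ chr, scales = "free") +
  labs(x = "SNP position", y = "Intron position (start to end)") +
  theme_bw()

print(p)
```

With the full dataset read through read.delim(), the same plotting code should apply unchanged as long as the columns keep these names.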
Read about cut; it will make your code about 50% shorter.
Can you please also provide me with the input dataset?
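It is not stated which cut is meant. Assuming the suggestion refers to base R's cut() function (it could equally mean the Unix cut utility), a minimal sketch of how it replaces a chain of if/else tests when binning values is shown below. The p-values are the four from the question's example input; the break points and labels are arbitrary choices for illustration.

```r
# P-values from the question's example input.
pvals <- c(0.00345113, 0.0116419, 0.0160019, 0.0348276)

# cut() assigns each value to an interval in one call.
# By default intervals are closed on the right: (0, 0.01], (0.01, 0.05], (0.05, 1].
bins <- cut(
  pvals,
  breaks = c(0, 0.01, 0.05, 1),
  labels = c("p <= 0.01", "0.01 < p <= 0.05", "p > 0.05")
)

# Count how many p-values fall into each bin.
print(table(bins))
```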