Hi all,
I am trying to obtain a similar graph as shown in the figure with my data. My data includes Fst values for each SNP based the populations (pop1 vs pop2 etc) from admixture analysis. I would appreciate any help.
Hi all,
I am trying to obtain a similar graph as shown in the figure with my data. My data includes Fst values for each SNP based the populations (pop1 vs pop2 etc) from admixture analysis. I would appreciate any help.
this is not a complete answer, but it may help to empower you to create your own graphs.
i try to keep track of a lot of different visualization related tools (https://cmdcolin.github.io/awesome-genome-visualization/), but indeed, just hand-coding the solution in R is a good approach. I get my mind blown by ggplot2 capabilities
this blogpost was very enlightening for me for creating plots in R with proper e.g. chromosome labels along the bottom, and plotting data according to a 'cumulative' bp position
https://danielroelfs.com/blog/how-i-create-manhattan-plots-using-ggplot/
here is a quick approach at doing something for your problem
first generate 5 random regions (10Mbp long to make them easily visible) with bedtools random on command line, you'll have real data
bedtools random -g hg19.fa.gz.fai -n 5 -l 10000000 | sort -k1,1 -k2,2n > out1.bed
bedtools random -g hg19.fa.gz.fai -n 5 -l 10000000 | sort -k1,1 -k2,2n > out2.bed
bedtools random -g hg19.fa.gz.fai -n 5 -l 10000000 | sort -k1,1 -k2,2n > out3.bed
library(ggplot2)
library(tidyverse)
chrom_sizes <- read.table("hg19.chrom.sizes")
x1 <- read.table("out1.bed")
x2 <- read.table("out2.bed")
x3 <- read.table("out3.bed")
colnames(chrom_sizes) <- c("chr", "size")
colnames(x1) <- c("chr", "start", "end", "name", "score", "strand")
colnames(x2) <- c("chr", "start", "end", "name", "score", "strand")
colnames(x3) <- c("chr", "start", "end", "name", "score", "strand")
# sometimes read.table interprets chromosome names as integers, sometimes
# characters, force to characters
x1$chr <- as.character(x1$chr)
x2$chr <- as.character(x2$chr)
x3$chr <- as.character(x3$chr)
x1$score <- runif(n = nrow(x1), min = 1, max = 100)
x2$score <- runif(n = nrow(x1), min = 1, max = 100)
x3$score <- runif(n = nrow(x1), min = 1, max = 100)
data_cum <- chrom_sizes %>%
group_by(chr) %>%
summarise(max_bp = max(size)) %>%
arrange(-max_bp) %>%
mutate(bp_add = lag(cumsum(as.numeric(max_bp)), default = 0))
process <- function(df) {
df %>%
inner_join(data_cum, by = "chr") %>%
mutate(bp_start_cum = start + bp_add) %>%
mutate(bp_end_cum = end + bp_add)
}
data1 <- process(x1)
data2 <- process(x2)
data3 <- process(x3)
data1$id <- "id1"
data2$id <- "id2"
data3$id <- "id3"
df <- rbind(data1, data2, data3)
print(df)
axis_set <- data_cum %>%
group_by(chr) %>%
mutate(center = bp_add)
p <- ggplot(data = df, aes(color = score, xmin = bp_start_cum, xmax = bp_end_cum, ymin = 0, ymax = 1)) +
geom_rect() +
scale_x_continuous(label = axis_set$chr, breaks = axis_set$center) +
facet_grid(rows = vars(id)) +
theme(
axis.text.x = element_text(angle = 60, size = 8, vjust = 0.5),
axis.ticks.y = element_blank(),
axis.text.y = element_blank()
)
ggsave("out.png", p, width = 11, height = 3)
generates a figure like this
as you can see it's not really done, and not identical to yours, but might be a starting point. updated slightly with facet_grid after posting originally
example todos
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.