6.2 years ago
dthaper
Hello,
I have conducted ChIP-seq for proteins that are part of a complex.
Peak1 - AR
Peak2 - EZH2
Peak3 - pEZH2
Following the ChIPseeker analysis pipeline (PMID:28416945), I got to the point where it produces a Venn diagram showing the common peaks between the 3 ChIP-seq experiments.
I want to extract the IDs of the genes in the overlapping regions. This is what I have so far:
files <- list(peak1 = "AR.bed", peak2 = "EZH2.bed", peak3 = "pEZH2.bed")
peakAnnoList <- lapply(files, annotatePeak, TxDb=txdb,
tssRegion=c(-5000, 5000), verbose=FALSE)
genes = lapply(peakAnnoList, function(i) as.data.frame(i)$geneId)
vennplot(genes)
Is there a way to export the "genes" data as an Excel or CSV file? It shows up as a "Large list" in the environment. I tried the following commands, and they didn't work because the lengths of the 3 lists aren't the same.
df4 <- data.frame(as.data.frame(genes))
write.table(df4, "overlap.xls", quote=FALSE, row.names=TRUE, sep="\t")
Thanks in advance for any help!
No need to call data.frame twice; this should do:
df4 <- data.frame(genes)
Please provide example output of head(df4) or str(df4). Just adding ".xls" doesn't make it an Excel file; it will still be written as a text file. I am guessing this should work:
write.csv(df4, "overlap.csv")
Thanks for the correction on the double call! Unfortunately it gives the same error:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 13472, 38622, 39500
I'm unable to assign the contents of "genes" to df4.
I see, we are getting the error because the lengths of the sets are different. Maybe try this:
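A minimal sketch of one way around the differing lengths, assuming the goal is one column per peak set: pad the shorter vectors with NA so data.frame() accepts them. The toy `genes` list, its placeholder IDs, and the output file name are assumptions for illustration; in the real session `genes` already exists from the lapply() call above.

```r
# Toy stand-in for the `genes` list from the question (arbitrary placeholder IDs).
genes <- list(peak1 = c("367", "2146", "7157", "2146"),
              peak2 = c("2146", "7157", "1026", "4609"),
              peak3 = c("2146", "4609"))

# One gene ID is reported per annotated peak, so IDs can repeat; keep unique ones.
genes_unique <- lapply(genes, unique)

# Pad every vector with NA up to the length of the longest one.
max_len <- max(lengths(genes_unique))
padded <- lapply(genes_unique, function(x) c(x, rep(NA, max_len - length(x))))

# All columns now have the same number of rows, so data.frame() no longer fails.
df4 <- data.frame(padded, stringsAsFactors = FALSE)

# Write a real CSV; NA padding is written as empty cells.
write.csv(df4, "genes_per_peakset.csv", row.names = FALSE, na = "")
```

The rows of this table are not aligned between columns; each column is simply the list of gene IDs for one peak set.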
Or, to get the list of overlapping genes, try:
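A hedged sketch, assuming "overlapped" means gene IDs present in all three sets: Reduce() applies intersect() across the list. The toy `genes` list and the output file name are placeholders for illustration; use the real `genes` object from the session instead.

```r
# Toy stand-in for the `genes` list from the question (arbitrary placeholder IDs).
genes <- list(peak1 = c("367", "2146", "7157"),
              peak2 = c("2146", "7157", "1026", "4609"),
              peak3 = c("2146", "4609"))

# Gene IDs shared by all three peak sets (intersect() also drops duplicates).
common <- Reduce(intersect, genes)

# Pairwise overlaps work the same way, e.g. peak1 vs peak2.
common_1_2 <- intersect(genes$peak1, genes$peak2)

# Export the three-way overlap as a one-column CSV.
write.csv(data.frame(geneId = common), "overlap.csv", row.names = FALSE)
```

Note that this intersects gene IDs, not genomic coordinates, so two peaks count as overlapping here whenever they are annotated to the same gene.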
If these are not what you need, please provide expected output.