Hi everybody, I've tried to make some TSS profiles as follows:
Normalized the BAM files using bamCoverage (output: NGS974Norm, in bedGraph format)
Ran ChIPseeker in Bioconductor:
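library(rtracklayer)  # provides import.bed
library(ChIPseeker)   # provides annotatePeak
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
reads <- import.bed(con="NGS974Norm")
print(reads)
cov <- coverage(reads)
peakAnno <- annotatePeak(reads, tssRegion=c(-500, 2000), TxDb=TxDb.Hsapiens.UCSC.hg19.knownGene, annoDb="org.Hs.eg.db")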
But when it finishes, it reports the following error: Error in rsqlite_send_query(conn@ptr, statement) : out of memory
Any ideas?
How big is your dataset? How much RAM do you have on your computer? Have you tried running annotatePeak on only a few genes and not the full annotation (TxDb.Hsapiens.UCSC.hg19.knownGene)?
I don't think this is the problem, because I have 31.1 GiB of memory and a Core i7-6700 CPU @ 3.40GHz × 8.
I need to run it with all the TSSs of all Ensembl protein-coding transcripts. Any ideas?
When exactly do you get the error?
After importing the bed file, or when running annotatePeak? If it is the latter, at which step does it crash? Loading the peak file? Calculating distances?
It is just after "adding gene annotation...". Could it be the file?
That's weird... You can always test the code on a smaller dataset (only one chromosome of the bedGraph file, or only one gene of the annotation, for instance), but honestly, I don't know what is going on.
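To make the "smaller dataset" test concrete, here is a minimal sketch in base R that keeps only one chromosome of a bedGraph-like table. The helper name subset_chromosome, the column names, and the toy values are made up for illustration; the commented file names at the end are assumptions as well.

```r
# Hypothetical helper: keep only the rows of a bedGraph-like table that
# belong to one chromosome, so the annotation step can be tried on less data.
subset_chromosome <- function(bedgraph, chrom = "chr1") {
  bedgraph[bedgraph$chrom == chrom, , drop = FALSE]
}

# Toy data standing in for the real file (values are made up).
toy <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(0L, 10000L, 0L),
  end   = c(10000L, 10050L, 5000L),
  score = c(0, 55, 12),
  stringsAsFactors = FALSE
)

small <- subset_chromosome(toy, "chr1")
print(nrow(small))  # number of intervals kept

# With the real file, something along these lines (file names are assumptions):
# full <- read.table("NGS974Norm",
#                    col.names = c("chrom", "start", "end", "score"))
# write.table(subset_chromosome(full), "NGS974Norm.chr1.bedgraph",
#             sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
```

If the reduced file runs through annotatePeak without the error, that would point towards the size of the input rather than its format.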
I've solved the problem: ChIPseeker needs a file with 6 columns, not the 4 columns that bedGraph files have.
Can you send me a sample file (e.g. the first 20 rows of your bedGraph file) for testing?
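For anyone who wants to try the 6-column idea, here is a minimal sketch in base R of padding a 4-column bedGraph table to 6 columns. The helper name bedgraph_to_six_columns and the toy values are made up for illustration, and the choice of a running index plus a "." placeholder as the two extra columns is an assumption, not something ChIPseeker is documented to require.

```r
# Hypothetical helper: pad a 4-column bedGraph table to 6 columns by adding
# a running index and a "." placeholder for the unknown strand.
bedgraph_to_six_columns <- function(bedgraph) {
  stopifnot(ncol(bedgraph) == 4)
  out <- bedgraph
  out$index  <- seq_len(nrow(out))   # running row number
  out$strand <- rep(".", nrow(out))  # coverage bins carry no strand
  out
}

# Toy data standing in for the real file (values are made up).
toy <- data.frame(
  chrom = c("chr1", "chr1"),
  start = c(0L, 10000L),
  end   = c(10000L, 10050L),
  score = c(0, 55),
  stringsAsFactors = FALSE
)

six <- bedgraph_to_six_columns(toy)
print(six)
```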
Hi, yes, it is this:
"chr1" 0 10000 0 1 "."
"chr1" 10000 10050 55 2 "."
"chr1" 10050 10100 91 3 "."
"chr1" 10100 10150 70 4 "."
"chr1" 10150 10200 71 5 "."
"chr1" 10200 10250 68 6 "."
"chr1" 10250 10300 51 7 "."
"chr1" 10450 10500 14 11 "."
It would be better to send an attached file instead of pasting file content.
I don't think it's an issue with the number of columns. I tested it, and it actually works fine with only 4 columns.
A thought that comes to mind: just because your computer has 32 GB of RAM does not mean that all 32 GB are available to R. So it might be useful to go back to the original error message and investigate. The error message also mentions RSQLite, so maybe that module ran out of memory? As mentioned before, I'd try it with less data to see if that helps, and also monitor the memory usage when you start the process. Sorry, I cannot be more specific than that. But "all protein-coding transcripts in Ensembl"... that sounds like something that could push any machine to its limits.
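As a rough way to monitor memory from inside R, here is a minimal sketch using only base functions (object.size and gc). The helper name report_memory and the toy vector are made up for illustration; on the real data one would call it on the imported reads object before running annotatePeak.

```r
# Hypothetical helper: report the size of one object and the total memory
# currently used by the R session, both in megabytes.
report_memory <- function(obj, label = "object") {
  size_mb <- as.numeric(object.size(obj)) / 1024^2
  mem <- gc()               # matrix; the second column is used memory in Mb
  used_mb <- sum(mem[, 2])  # Ncells + Vcells
  message(sprintf("%s: %.2f MB; R session is using about %.1f MB",
                  label, size_mb, used_mb))
  invisible(c(object_mb = size_mb, used_mb = used_mb))
}

# Toy object standing in for the imported reads.
toy <- numeric(1e6)
res <- report_memory(toy, "toy vector")
```

Calling this before and after each step shows where memory use jumps. It only sees R's own allocations, so watching the process in a system monitor at the same time is still worthwhile.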