Our lab recently performed Methyl DNA Immunoprecipitation followed by Illumina. There are two experimental conditions, and each condition had the following preps (in 3 replicates):
- Input
- Methyl-cytosine
- Hydroxy-methyl-cytosine
So I have 36 fastq files, which I aligned to hg19 using bowtie.
I then moved onto the MEDIPS package to try and calculate some differential methylation, etc etc.
I chose a subset of the files to begin with, so now I only have 12.
Condition A-Methylation, Condition A-Input and Condition B-Methylation, Condition B-Input. I have attempted to reproduce the MEDIPS vignettes, but I continue to encounter the following error:
Error in 1:max_signal_index : argument of length 0
Here is the complete log:
A_input_1="/data/projects/MeDIP/results/hg19/A-Input-rep3.sorted.bam"
A_input_2="/data/projects/MeDIP/results/hg19/A-Input-rep2.sorted.bam"
A_input_3="/data/projects/MeDIP/results/hg19/A-Input-rep1.sorted.bam"
A_meth_1="/data/projects/MeDIP/results/hg19/A-Met-rep3.sorted.bam"
A_meth_2="/data/projects/MeDIP/results/hg19/A-Met-rep2.sorted.bam"
A_meth_3="/data/projects/MeDIP/results/hg19/A-Met-rep1.sorted.bam"
B_input_1="/data/projects/MeDIP/results/hg19/B-Input-rep3.sorted.bam"
B_input_2="/data/projects/MeDIP/results/hg19/B-Input-rep2.sorted.bam"
B_input_3="/data/projects/MeDIP/results/hg19/B-Input-rep1.sorted.bam"
B_meth_1="/data/projects/MeDIP/results/hg19/B-Met-rep3.sorted.bam"
B_meth_2="/data/projects/MeDIP/results/hg19/B-Met-rep2.sorted.bam"
B_meth_3="/data/projects/MeDIP/results/hg19/B-Met-rep1.sorted.bam"
library("MEDIPS")
library("BSgenome")
library(BSgenome.Hsapiens.UCSC.hg19)
BSgenome="BSgenome.Hsapiens.UCSC.hg19"
uniq=TRUE
extend=100
shift=0
ws=100
A_input = MEDIPS.createSet(file=A_input_1, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws)
Reading bam alignment A-Input-rep3.sorted.bam
Total number of imported short reads: 12623340
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 12325986
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
A_input = c(A_input, MEDIPS.createSet(file=A_input_2, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws))
Reading bam alignment A-Input-rep2.sorted.bam
Total number of imported short reads: 13169744
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 12881682
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
A_input = c(A_input, MEDIPS.createSet(file=A_input_3, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws))
Reading bam alignment A-Input-rep1.sorted.bam
Total number of imported short reads: 20094483
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 19571774
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
A_meth = MEDIPS.createSet(file=A_meth_1, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws)
Reading bam alignment A-Met-rep3.sorted.bam
Total number of imported short reads: 2299270
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 2272099
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
A_meth = c(A_meth, MEDIPS.createSet(file=A_meth_2, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws))
Reading bam alignment A-Met-rep2.sorted.bam
Total number of imported short reads: 107206
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 106813
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
A_meth = c(A_meth, MEDIPS.createSet(file=A_meth_3, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws))
Reading bam alignment A-Met-rep1.sorted.bam
Total number of imported short reads: 18513790
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 18007716
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
B_input = MEDIPS.createSet(file=B_input_1, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws)
Reading bam alignment B-Input-rep3.sorted.bam
Total number of imported short reads: 12297794
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 11982680
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
B_input = c(B_input, MEDIPS.createSet(file=B_input_2, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws))
Reading bam alignment B-Input-rep2.sorted.bam
Total number of imported short reads: 12670679
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 12366829
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
B_input = c(B_input, MEDIPS.createSet(file=B_input_3, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws))
Reading bam alignment B-Input-rep1.sorted.bam
Total number of imported short reads: 20020833
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 19456121
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
B_meth = MEDIPS.createSet(file=B_meth_1, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws)
Reading bam alignment B-Met-rep3.sorted.bam
Total number of imported short reads: 19227859
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 18619291
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
B_meth = c(B_meth, MEDIPS.createSet(file=B_meth_2, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws))
Reading bam alignment B-Met-rep2.sorted.bam
Total number of imported short reads: 5287187
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 5201859
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
B_meth = c(B_meth, MEDIPS.createSet(file=B_meth_3, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws))
Reading bam alignment B-Met-rep1.sorted.bam
Total number of imported short reads: 20421851
Extending reads...
Creating GRange Object...
Extract unique regions...
Number of unique short reads: 19798624
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage for genome wide windows...
# create a coupling set based on ANY file
CS = MEDIPS.couplingVector(pattern="CG",refObj=A_meth[[1]])
Get genomic sequence pattern positions...
Searching in chr1 ...[ 2284470 ] found.
Searching in chr2 ...[ 2164335 ] found.
Searching in chr3 ...[ 1623646 ] found.
Searching in chr4 ...[ 1473930 ] found.
Searching in chr5 ...[ 1506454 ] found.
Searching in chr6 ...[ 1475569 ] found.
Searching in chr7 ...[ 1568666 ] found.
Searching in chr8 ...[ 1309135 ] found.
Searching in chr9 ...[ 1226821 ] found.
Searching in chr10 ...[ 1351291 ] found.
Searching in chr11 ...[ 1289987 ] found.
Searching in chr12 ...[ 1277218 ] found.
Searching in chr13 ...[ 803708 ] found.
Searching in chr14 ...[ 859779 ] found.
Searching in chr15 ...[ 873464 ] found.
Searching in chr16 ...[ 1097776 ] found.
Searching in chr17 ...[ 1155600 ] found.
Searching in chr18 ...[ 677214 ] found.
Searching in chr19 ...[ 1057376 ] found.
Searching in chr20 ...[ 717722 ] found.
Searching in chr21 ...[ 380444 ] found.
Searching in chr22 ...[ 578097 ] found.
Searching in chrM ...[ 439 ] found.
Searching in chrX ...[ 1246401 ] found.
Searching in chrY ...[ 217906 ] found.
Number of identified CG pattern: 28217448
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Counting the number of CG's in each window...
meth = MEDIPS.meth(MSet1=A_meth, MSet2=B_meth, CSet=CS, ISet1=A_input, ISet2=B_input, diff.method="ttest", MeDIP=TRUE, type="rms")
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Preprocessing MEDIPS SET 1 in MSet1...
Calculating calibration curve...
Performing linear regression...
Intercept: -0.0486489688317665
Slope: 0.0946696446852834
Calculating relative methylation score...
Estimate methylation...
Preprocessing MEDIPS SET 2 in MSet1...
Calculating calibration curve...
Performing linear regression...
Error in 1:max_signal_index : argument of length 0
I am on a workstation with 32 GB of RAM, plenty of hard drive space and processing power so I do not believe that it is a hardware issue.
Seems that this error was reported before:
http://comments.gmane.org/gmane.science.biology.informatics.conductor/52319
Solution might be:
MeDIP=FALSE
Thanks for commenting. I also found that post, but from further reading it appears that the
MeDIP=TRUE
setting is required for Methyl DIP sequencing experiments:From: https://support.bioconductor.org/p/68549/
I think I should probably contact the package maintainers.
Thanks,
Kyle