I have used Bismark for aligning RRBS reads and now I should run methylKit on bismarks' output, like CpG and bedGraph and .Cov. How can I use these files instead of bam file.
Methylkit manual said it accepts bismark bam file however, I could not handle it. It gives error when I used bismark sorted bam files, and methRead function does not work too.
Does anybody know pipeline for working with bismark output files? I don't mean bam/sam.
Two step solution: Use coverage2cytosine to convert .cov file to an intermediate file. Then a one-liner awk command or some other script to convert the intermediate file to a methRead input file.
The reason for this two-step process is that .cov lacks strand information.
cov file format: chr pos pos %meth #meth #unmeth
intermediate file format: chr pos strand(+/-) #meth #unmeth na na
methRead input file format: id chr pos strand(F/R) [#meth+#unmeth] %meth %unmeth
Where # is (integer) count. The intermediate file contains CG sites not covered by any read and you should discard those in your script.
I have created a file as methylKit input file; it likes the methylKit's test and control samples:
chrBase chr base strand coverage freqC freqT
1 chr21.9764539 chr21 9764539 R 12 25.00 75.00
2 chr21.9764513 chr21 9764513 R 12 0.00 100.00
3 chr21.9820622 chr21 9820622 F 13 0.00 100.00
4 chr21.9837545 chr21 9837545 F 11 0.00 100.00
5 chr21.9849022 chr21 9849022 F 124 72.58 27.42
But when I used methRead to read the file, I got errors from this step:
Exception: Argument 'file' should be a single value not 7 values.
at #17. getVector.Arguments(static, s, length = length, .name = .name)
- getVector.Arguments() is in environment 'R.utils'
at #16. getVector(static, s, length = length, .name = .name)
- getVector() is in environment 'R.utils'
Error: Argument 'file' should be a single value not 7 values.
Can you help to solve this problem? Thank you very much!
Which files did you use the for
methRead
function?If it gives you an error, can you post the command?