Hi, just FYI: I've also posted a variation of this question on SEQanswers, and I'll update both posts with any responses.
I've been analysing some WGBS libraries for a couple of different insect species, using Bismark for alignment. For context, methylation levels in most insects are low to begin with, and non-CG methylation is often considered to be noise.
I have both directional and non-directional libraries for each species; the latter were aligned using the non-directional parameter in Bismark. I've noticed that the non-directional libraries consistently show higher levels of non-CG methylation (as estimated by the splitting report Bismark generates during methylation extraction). Two examples from this report follow, from the same species and both from control groups:
Directional:
Final Cytosine Methylation Report
=================================
Total number of C's analysed: 1441325271
Total methylated C's in CpG context: 10947321
Total methylated C's in CHG context: 1518883
Total methylated C's in CHH context: 7907753
Total C to T conversions in CpG context: 260501288
Total C to T conversions in CHG context: 203098923
Total C to T conversions in CHH context: 957351103
C methylated in CpG context: 4.0%
C methylated in CHG context: 0.7%
C methylated in CHH context: 0.8%
Non-directional:
Final Cytosine Methylation Report
=================================
Total number of C's analysed: 1671579979
Total methylated C's in CpG context: 15071917
Total methylated C's in CHG context: 3393242
Total methylated C's in CHH context: 17650398
Total C to T conversions in CpG context: 360463445
Total C to T conversions in CHG context: 264082345
Total C to T conversions in CHH context: 1010918632
C methylated in CpG context: 4.0%
C methylated in CHG context: 1.3%
C methylated in CHH context: 1.7%
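As a sanity check, the percentages in both reports can be reproduced from the raw counts. This is only a minimal sketch: it assumes each percentage is methylated calls divided by methylated calls plus C to T conversions in the same context, which is consistent with the numbers above. The names `reports`, `percent_methylated` and `summarise` are my own and not part of Bismark.

```python
# Counts copied from the two splitting reports above:
# (methylated C's, C to T conversions) per sequence context.
reports = {
    "directional": {
        "CpG": (10947321, 260501288),
        "CHG": (1518883, 203098923),
        "CHH": (7907753, 957351103),
    },
    "non-directional": {
        "CpG": (15071917, 360463445),
        "CHG": (3393242, 264082345),
        "CHH": (17650398, 1010918632),
    },
}


def percent_methylated(methylated, converted):
    """Assumed formula: methylated / (methylated + converted) * 100."""
    return 100.0 * methylated / (methylated + converted)


def summarise(reports):
    """Return unrounded percent methylation per library type and context."""
    return {
        library: {
            context: percent_methylated(meth, conv)
            for context, (meth, conv) in contexts.items()
        }
        for library, contexts in reports.items()
    }


levels = summarise(reports)

for library, contexts in levels.items():
    for context, pct in contexts.items():
        print(f"{library:16s} {context}: {pct:.1f}%")

# Ratio of non-directional to directional methylation in each context.
for context in ("CpG", "CHG", "CHH"):
    ratio = levels["non-directional"][context] / levels["directional"][context]
    print(f"{context} non-directional / directional: {ratio:.2f}")
```

Looking at the ratio rather than the rounded percentages makes the difference easier to see: CpG stays essentially unchanged between the two library types, while the CHG and CHH levels are clearly higher in the non-directional library.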
This pattern holds in all samples and in both species I'm studying. Before I conclude that it is a quirk of these specific datasets, is anyone aware of a reason, related to the nature of non-directional libraries, why non-CG methylation might be reported at a higher level?
It could be:
Thanks for your suggestions, Mark. Point 3 is the kind of thing I was thinking might be the reason, as CpG context is unaffected, reported at similar levels in both directional and non-directional libraries - I assume such issues as 1 and 2 would affect all contexts. I'll try your suggestion of using alternative aligners - thanks again.