Hi,
I ran into a problem when running Bismark + DMAP. The data I used is the [Test dataset 1 for DMAP](http://biochem.otago.ac.nz/assets/software/test_MDS_data_large.tar.gz) from http://biochem.otago.ac.nz/research/databases-software/
The commands I used are:
bismark_genome_preparation ./
bismark -q -n 1 -l 40 ~/projects/Methylation/BS_seq/reference1/ ../MDS_chr1_maps.fastq
bismark_methylation_extractor -s --bedGraph --counts --buffer_size 10G --cytosine_report --genome_folder ~/projects/Methylation/BS_seq/reference1/ MDS_chr1_maps.fastq_bismark.sam
diffmeth -e 40,220 -g ~/projects/Methylation/BS_seq/reference1/Homo_sapiens.GRCh37.75.dna.chromosome. -R MDS_chr1_maps.fastq_bismark.sam |awk -f ~/software/meth_progs_dist/src/getcpgpcmeth.awk >diffmeth.single.MDS.list
This gives me two files: MDS_chr1_maps.fastq_bismark.CpG_report.txt and diffmeth.single.MDS.list.
The former contains:
1 10469 + 0 0 CG CGC
1 10470 - 0 0 CG CGA
1 10471 + 0 0 CG CGG
1 10472 - 0 0 CG CGC
1 10484 + 0 0 CG CGG
1 10485 - 0 0 CG CGG
1 10489 + 0 0 CG CGC
1 10490 - 0 0 CG CGG
1 10493 + 0 0 CG CGC
1 10494 - 0 0 CG CGG
1 10497 + 43 8 CG CGG
1 10498 - 0 0 CG CGG
1 10525 + 47 5 CG CGC
1 10526 - 58 11 CG CGG
1 10542 + 48 4 CG CGA
1 10543 - 63 7 CG CGG
1 10563 + 40 12 CG CGC
1 10564 - 64 6 CG CGT
1 10571 + 46 5 CG CGC
1 10572 - 64 5 CG CGG
1 10577 + 0 0 CG CGC
1 10578 - 44 25 CG CGA
1 10579 + 0 0 CG CGG
1 10580 - 39 30 CG CGC
1 10589 + 0 0 CG CGG
1 10590 - 63 5 CG CGG
and the latter:
1 10497 84.31
1 10525 86.78
1 10542 90.98
1 10563 85.25
1 10571 91.67
1 10577 63.77
1 10579 56.52
1 10589 92.65
1 10609 -
1 10617 -
1 10620 -
1 10631 -
1 10633 -
1 10636 -
1 10638 -
1 15720 -
1 15749 -
1 15769 -
1 15789 -
1 15834 -
1 15849 94.74
1 15865 94.74
1 15882 100.00
1 15912 94.74
1 17562 100.00
I don't know how to explain the differences between them, for example at position 10577. Have I done something wrong, or is this a real bug in the software?
Moreover, I think the latter only shows cytosines on the + strand. If that is right, how can I get the information for the - strand? If not, how can I get the strand information directly? I read the description of the options but couldn't find anything that helps.
Thanks for your help.
You're piping things through an awk script before writing that file. What's in the awk script? Perhaps that's causing weird results (and anyway, it's hard to know what the 3rd column even means without more information).
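That said, the rows pasted above are at least consistent with one reading: the third column looks like a percent methylation that pools the + and - strand counts of each CpG and reports it at the + strand position. This is only inferred from the numbers in the post, not confirmed from the awk script or the DMAP documentation. A minimal Python sketch of that check, using rows copied from the CpG report (the function name `pooled_percent` is made up for illustration, and treating "-" as "no counts" is an assumption):

```python
# Rows copied from the CpG report in the question:
# (chromosome, position, strand, methylated count, unmethylated count)
cpg_report = [
    ("1", 10497, "+", 43, 8),
    ("1", 10498, "-", 0, 0),
    ("1", 10525, "+", 47, 5),
    ("1", 10526, "-", 58, 11),
    ("1", 10577, "+", 0, 0),
    ("1", 10578, "-", 44, 25),
    ("1", 10579, "+", 0, 0),
    ("1", 10580, "-", 39, 30),
    ("1", 10589, "+", 0, 0),
    ("1", 10590, "-", 63, 5),
]


def pooled_percent(rows):
    """Pool + and - strand counts per CpG, keyed by the + strand position.

    Assumption: the - strand cytosine of a CpG sits at (+ position) + 1,
    which matches the coordinates in the report above.
    """
    totals = {}
    for chrom, pos, strand, meth, unmeth in rows:
        key = (chrom, pos if strand == "+" else pos - 1)
        m, u = totals.get(key, (0, 0))
        totals[key] = (m + meth, u + unmeth)

    result = {}
    for key, (m, u) in totals.items():
        # Assumption: a CpG with no counts on either strand is shown as "-".
        result[key] = "-" if m + u == 0 else f"{100 * m / (m + u):.2f}"
    return result


for (chrom, pos), pct in sorted(pooled_percent(cpg_report).items()):
    print(chrom, pos, pct)
```

If this reading is right, the value at 10577 in the list comes entirely from the - strand counts at 10578 (44 methylated, 25 unmethylated), and the per-strand breakdown is what the CpG report already gives you.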