Question

Bedmethyl file format

11 months ago

njornet ▴ 20

I want to study methylation in some samples sequenced with nanopore. I performed basecalling with the 5mCG_5hmCG model and then generated the bedMethyl file with modkit. I looked at the ENCODE description of the format, but there are things I don't understand, and my file contains columns that the ENCODE description does not cover. These are the first few rows of the file:

NC_000001.11 10220 10221 h 1 + 10220 10221 255,0,0 1 0.00 0 0 1 0 0 0 0

NC_000001.11 10220 10221 m 1 + 10220 10221 255,0,0 1 100.00 1 0 0 0 0 0 0

NC_000001.11 10232 10233 h 1 + 10232 10233 255,0,0 1 0.00 0 0 1 0 0 0 0

NC_000001.11 10232 10233 m 1 + 10232 10233 255,0,0 1 100.00 1 0 0 0 0 0 0

NC_000001.11 10468 10469 h 1 + 10468 10469 255,0,0 1 0.00 0 1 0 0 0 0 0

NC_000001.11 10468 10469 m 1 + 10468 10469 255,0,0 1 0.00 0 1 0 0 0 0 0

NC_000001.11 10469 10470 h 1 - 10469 10470 255,0,0 1 0.00 0 0 1 0 0 0 0

NC_000001.11 10469 10470 m 1 - 10469 10470 255,0,0 1 100.00 1 0 0 0 0 0 0

I assume that in the fourth column (name of item) m means 5mCG and h means 5hmCG, but I want to confirm this. As you can see, every position appears twice, once for h and once for m. From what I understand, looking at column 11 (percentage of reads that show methylation at this position in the genome), the file reports every position for both modifications, and instead of recording whether it is 5mCG or 5hmCG, it puts 100% on the one that is present. But after the percentage there are some extra columns of 0s and 1s, and I have no idea what they are.

If someone could explain in more detail what these mean, it would be very helpful. Also, once I have the bedMethyl file, is there an existing tool that makes it easy to analyse? Or do I have to extract the regions I'm interested in and check whether they are methylated?

Thank you.

P.S. Sorry for the weird formatting of the modkit output; I can't get the lines to have just one line break.

bedmethyl methylation • 2.6k views

ADD COMMENT • link updated 5 weeks ago by lavigne • 0 • written 11 months ago by njornet ▴ 20

Have you looked at this page that describes the file format: ~~https://nanoporetech.github.io/modkit/intro_bedmethyl.html~~

This link no longer works. See my comment below for new links.

ADD REPLY • link 5 weeks ago by GenoMax 150k

It looks like the GitHub link no longer works. I've looked at the modkit GitHub page directly, but I'm not sure I can find the same documentation. Do you happen to remember what this page contained previously?

ADD REPLY • link 5 weeks ago by lavigne • 0

Bedmethyl format is described here: https://nanoporetech.github.io/modkit/intro_pileup.html#description-of-bedmethyl-output

Also described at Encode site: https://www.encodeproject.org/data-standards/wgbs/
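
For readers who cannot open the links, here is a minimal Python sketch of how the 18 columns in the rows from the question could be read. The field names follow my understanding of the modkit pileup documentation (Nvalid_cov, Nmod, Ncanonical, Nother_mod, Ndelete, Nfail, Ndiff, Nnocall). The `BedMethylRow` class and `parse_bedmethyl_line` function are hypothetical helpers, not part of modkit, so check the names against the documentation for your modkit version.

```python
from typing import NamedTuple


class BedMethylRow(NamedTuple):
    """One bedMethyl record (assumed 18-column modkit pileup layout)."""
    chrom: str
    start: int               # column 2, 0-based start
    end: int                 # column 3, exclusive end
    mod_code: str            # column 4, "m" = 5mC, "h" = 5hmC
    score: int               # column 5
    strand: str              # column 6
    n_valid_cov: int         # column 10, valid calls at this position
    percent_modified: float  # column 11, 100 * n_mod / n_valid_cov
    n_mod: int               # column 12, calls for this row's mod_code
    n_canonical: int         # column 13, calls for the unmodified base
    n_other_mod: int         # column 14, calls for a different modification
    n_delete: int            # column 15, reads with a deletion here
    n_fail: int              # column 16, calls below the confidence threshold
    n_diff: int              # column 17, reads with a different base here
    n_nocall: int            # column 18, reads with no modification call


def parse_bedmethyl_line(line: str) -> BedMethylRow:
    """Split one whitespace-delimited bedMethyl line into named fields."""
    f = line.split()
    if len(f) != 18:
        raise ValueError(f"expected 18 columns, got {len(f)}")
    # Columns 7-9 (thick start, thick end, RGB colour) only exist for
    # BED compatibility, so they are skipped here.
    return BedMethylRow(
        f[0], int(f[1]), int(f[2]), f[3], int(f[4]), f[5],
        int(f[9]), float(f[10]),
        *(int(x) for x in f[11:18]),
    )


# The first two rows from the question.
example_h = parse_bedmethyl_line(
    "NC_000001.11 10220 10221 h 1 + 10220 10221 255,0,0 1 0.00 0 0 1 0 0 0 0"
)
example_m = parse_bedmethyl_line(
    "NC_000001.11 10220 10221 m 1 + 10220 10221 255,0,0 1 100.00 1 0 0 0 0 0 0"
)
print(example_h)
print(example_m)
```

Read this way, the h row at 10220 shows 0% because the single valid read was called 5mC, so it is counted under Nother_mod instead of Nmod, while the m row for the same position counts that read under Nmod and reports 100%.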

ADD REPLY • link 5 weeks ago by GenoMax 150k

Please use the 101010 button to format text as code when you want to show monospaced data.

ADD REPLY • link 11 months ago by GenoMax 150k

Hello! I am trying to produce a bedMethyl file from nanopore data as well. I am starting with 5mCG_5hmCG.bam files, and I keep getting stuck somewhere in my pipeline to produce the bedMethyl files.

Might you be willing to share your process/pipeline to produce these bed methyl files?

thank you!

ADD REPLY • link 5 weeks ago by lavigne • 0

If you have a BAM file you can run a single modkit command. Example here: https://github.com/nanoporetech/modkit?tab=readme-ov-file#constructing-bedmethyl-tables

It is pretty fast and easy compared to other difficult-to-use bioinfo tools :)

The file format that it outputs is also described in the README there.
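
Once the table exists, the region question from the original post can be approached directly. Below is a minimal Python sketch, assuming the column layout from the modkit documentation (column 10 = Nvalid_cov, column 12 = Nmod). The helper `region_percent_modified` is hypothetical and not part of modkit; it pools both strands and weights each position by its coverage, which may or may not suit your analysis.

```python
from collections import defaultdict


def region_percent_modified(lines, chrom, start, end):
    """Return {mod_code: percent modified} for records inside [start, end).

    The percentage is sum(Nmod) / sum(Nvalid_cov) over the region, so
    positions with more valid reads contribute more. Strands are pooled.
    """
    n_mod = defaultdict(int)
    n_valid = defaultdict(int)
    for line in lines:
        f = line.split()
        if len(f) < 12 or f[0] != chrom:
            continue  # skip short or off-target lines
        if int(f[1]) < start or int(f[2]) > end:
            continue  # record is not fully inside the region
        code = f[3]                # "m" or "h"
        n_valid[code] += int(f[9])   # column 10, Nvalid_cov
        n_mod[code] += int(f[11])    # column 12, Nmod
    return {
        code: 100.0 * n_mod[code] / n_valid[code]
        for code in n_valid
        if n_valid[code] > 0
    }


# The eight rows shown in the question.
rows = [
    "NC_000001.11 10220 10221 h 1 + 10220 10221 255,0,0 1 0.00 0 0 1 0 0 0 0",
    "NC_000001.11 10220 10221 m 1 + 10220 10221 255,0,0 1 100.00 1 0 0 0 0 0 0",
    "NC_000001.11 10232 10233 h 1 + 10232 10233 255,0,0 1 0.00 0 0 1 0 0 0 0",
    "NC_000001.11 10232 10233 m 1 + 10232 10233 255,0,0 1 100.00 1 0 0 0 0 0 0",
    "NC_000001.11 10468 10469 h 1 + 10468 10469 255,0,0 1 0.00 0 1 0 0 0 0 0",
    "NC_000001.11 10468 10469 m 1 + 10468 10469 255,0,0 1 0.00 0 1 0 0 0 0 0",
    "NC_000001.11 10469 10470 h 1 - 10469 10470 255,0,0 1 0.00 0 0 1 0 0 0 0",
    "NC_000001.11 10469 10470 m 1 - 10469 10470 255,0,0 1 100.00 1 0 0 0 0 0 0",
]
summary = region_percent_modified(rows, "NC_000001.11", 10200, 10500)
print(summary)
```

For these eight rows and the region 10200-10500, three of the four valid calls are 5mC and none are 5hmC, so the summary is 75% for m and 0% for h. With a single read per position, as here, such numbers are only illustrative.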

ADD REPLY • link 5 weeks ago by cmdcolin ★ 4.2k

thank you! Presumably this single modkit command works if the BAM file is already aligned, correct? This cannot work for unaligned BAM files?

ADD REPLY • link 5 weeks ago by lavigne • 0

Yes, you'll likely want aligned BAM files. I haven't used unaligned BAM a lot, but if that is what you have, you can check e.g. https://lh3.github.io/2021/07/06/remapping-an-aligned-bam which shows how to potentially preserve some tags like MM and ML when creating the aligned BAM with the "samtools fastq -OT" method.

ADD REPLY • link 5 weeks ago by cmdcolin ★ 4.2k

Follow the directions from here: https://nanoporetech.github.io/modkit/intro_pileup.html

You will need an aligned BAM file which preserves the MM/ML tags. If you are starting with unaligned BAM files containing MM/ML calls, then you will need to run something like:

minimap2 -t use_N_cores -Y -y -ax map-ont your_minimap_index_path \
  <(samtools fastq -@ use_N_cores -T MM,ML your_unaligned_methyl_call.bam) \
  | samtools sort --write-index -o methyl_aligned.bam -