Problem with Bedtools
2
0
Entering edit mode
6.4 years ago

Hello everyone.

I'm having problems with bedtools coverage I have a .gff file with positions and a bam with the results from a RNA-seq experiment. I want to know how many reads has each of my positions however is not working.

I have used the following codes:

bedtools coverage -a.gff -b .bam
bedtools coverage -a.gff -b .bam -s

And it is not working :/

RNA-Seq • 4.2k views
ADD COMMENT
1
Entering edit mode

Not working is a tremendously detailed description. Please elaborate.

ADD REPLY
0
Entering edit mode

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.
code_formatting

ADD REPLY
0
Entering edit mode
bedtools coverage -a.gff -b .bam
bedtools coverage -a.gff -b .bam -s

I run the code and nothing happens, however when I do:

bedtools coverage -abam .bam -b .gff -s

screenshot: https://ibb.co/jcpTtd

I get results but not the ones I want. It generates the coverage of features in the .gff on the features in the .bam and I want the opposite

ADD REPLY
0
Entering edit mode

bam files do not have features, they have reads / sequences mapped.

I get results but not the ones I want. It generates the coverage of features in the .gff on the features in the .bam and I want the opposite

I am not sure what you want, could you explain? Maybe bedtools intersect can do what you want?

ADD REPLY
0
Entering edit mode

Ok I want to know how many reads from the .bam overlap with each of the positions I have on my .gff.

I tried bedtools intersect as well, and it is the same.

May it work if I convert the bam to bed?

Thanks in advance

ADD REPLY
0
Entering edit mode

Regarding the error message posted as an image (which should be avoided, it is better to copy / paste the text of the message), see this:

https://sites.google.com/a/case.edu/hpc-upgraded-cluster/cluster-faq/running-jobs#TOC-The-job-I-started-stops-terminates-fails-in-the-middle

ADD REPLY
1
Entering edit mode
6.4 years ago
h.mon 35k

And it is not working :/

"It is not working" is not really helpful, copy the error message, or, if no error message, explicitly say "there is no error message and bedtools produces an empty file", or something to this effect.

Anyway, I believe your command should be:

bedtools coverage -a file.bam -b file.gff

As per documentation:

As of version 2.24.0, the coverage tool has changed such that the coverage is computed for the A file, not the B file.

edit:

You are right, the command I gave above is the opposite of what you want. You really want

bedtools coverage -a file.gff -b file.bam

Regarding the error you got (slurmstepd: error: task/cgroup: unable to add task[pid=XXXXXXX] to memory cg '(null)), it is related to excessive memory use. You can ask for more memory or try different queue. If you sort both bam and gff, you can pass the argument -sorted to bedtools, which is both faster and uses less memory.

May it work if I convert the bam to bed?

Yes, should work both with bam or bed.

ADD COMMENT
0
Entering edit mode

But that line generates just the opposite of what I want

ADD REPLY
0
Entering edit mode

The problem with:

bedtools coverage -a file.gff -b file.bam

is that it is not running, it just keep "thinking" until the computer kills it

ADD REPLY
1
Entering edit mode
  • You are using the Slurm Workload Manager, which controls how much resources (mainly memory, CPUs, time) you can use.

  • The error message you posted on that image relates to Slurm killing you job because it used too much resources.

  • I pointed to the fact if you sort the gff and bam file, memory use should be lower and running times should be faster, and you might successfully run bedtools coverage without Slurm killing it.

ADD REPLY
0
Entering edit mode

Ahh ok, sorry either my english and bioinformatics skills are low. Ok I understand I will sort it and I will ask for more resources thank you!

ADD REPLY
0
Entering edit mode

I finally got it, was a memory problem as u said, thank you!

ADD REPLY
1
Entering edit mode
6.4 years ago
Jeffin Rockey ★ 1.3k

Use the -d option to get per base information using bedtools coverage

ADD COMMENT
0
Entering edit mode

Thank you, the output is closer but I think it is no exactly the one I need. Here it is an example of the output(in bold the new columns generated):

Um_chr06        frame_-2_peak_9719      noCDS   214798  215077  .       -       .       90      **197     11**
Um_chr06        frame_-2_peak_9719      noCDS   214798  215077  .       -       .       90      **198     11**
Um_chr06        frame_-2_peak_9719      noCDS   214798  215077  .       -       .       90      **199     11**
Um_chr06        frame_-2_peak_9719      noCDS   214798  215077  .       -       .       90      **200     11**

I need 1 result per peak not 10, I think the best option will be convert the bam to bed and do bedtools intersect

ADD REPLY
1
Entering edit mode

Reminder: Use the code formatting option to better your post.

ADD REPLY
1
Entering edit mode
  • Please give gff itself as -a in bedtools coverage. From the output shown seems like the command was executed the other way.
  • bedtools can work directly with bam for intersect in case it itself is required.
ADD REPLY
1
Entering edit mode
  • Please give gff itself as -a in bedtools coverage. From the output shown seems like the command was executed the other way.
  • bedtools can work directly with bam for intersect in case it itself is required.
ADD REPLY

Login before adding your answer.

Traffic: 1920 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6