Question

Problem Fetching Sequences From A Bed File With Galaxy

0

12.1 years ago

joshualevipayne ▴ 70

I've made a .bed file for the ChIP-exo data for transcription factor Phd1 published by Rhee & Pugh (2011) Cell.

I'm trying to do something extremely simple: fetch the corresponding sequences from the S. cerevisiae genome (June 2008) using the Galaxy server. The first 11 lines (out of 967) of the .bed file are:

chr1    8       102
chr1    20874   20968
chr1    190042  190136
chr1    190477  190571
chr1    191398  191492
chr1    230073  230167
chr2    165355  165449
chr2    165685  165779
chr2    376055  376149
chr2    376173  376267
chr2    378953  379047

When I feed this into Galaxy, it tells me: "967 warnings, 1st is: Unable to fetch the sequence from '8' to '94' for chrom 'chr1'. Skipped 967 invalid lines, 1st is #1, "chr1 8 102""

I have no idea where this error is coming from. Why would it try to fetch a sequence from locations 8 to 94, when the first line specifies the locations 8 to 102? Bizarre.

Any ideas?

Thanks, Josh

bed galaxy • 3.6k views

ADD COMMENT • link updated 11.0 years ago by orangehu8 • 0 • written 12.1 years ago by joshualevipayne ▴ 70

1

Maybe the chromosome name is different from 'chr1' in the FASTA file?

ADD REPLY • link 12.1 years ago by Pavel Senin ★ 1.9k

0

Good catch, though I must say that error message is as misleading as it gets. The OP should also make sure that the extracted intervals do indeed span the right range!

ADD REPLY • link 12.1 years ago by Istvan Albert 102k

0

You are right on that; I just wanted to see whether the OP had checked it ;). By the way, is there a wiki or FAQ tag here, so that some sort of FAQ can eventually be produced?

ADD REPLY • link 12.1 years ago by Pavel Senin ★ 1.9k

0

That is a good idea. I have been thinking about the best way to create a series of posts that should be required reading and would solve recurring problems. It could be a single thread of posts tagged as faq.

ADD REPLY • link 12.1 years ago by Istvan Albert 102k

0

Hi joshualevipayne:

I faced a similar problem. I'd like to ask you: are just three columns, like the ones you show, enough to get the result? Is there no need for a transcript ID, gene ID, or anything else?

ADD REPLY • link updated 5.4 years ago by Ram 45k • written 11.0 years ago by orangehu8 • 0

score 0 · Answer 1 · 2013-03-25

0

12.1 years ago

joshualevipayne ▴ 70

Thanks seninp, that's it. The chromosomes should be specified with Roman numerals. All set now.
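
For anyone hitting the same error, here is a minimal sketch of that renaming in Python. It assumes the BED file is whitespace-delimited, that the first column looks like `chr1`, `chr2`, ... and that the target genome build names its chromosomes `chrI`, `chrII`, ... Names that do not match `chr<number>` (for example `chrM`) are left unchanged. The function names and file paths are hypothetical, chosen for illustration only.

```python
import re

# Value/symbol pairs sufficient for small numbers (yeast has 16 nuclear chromosomes).
ROMAN = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n):
    """Convert a small positive integer to a Roman numeral (valid for 1-39)."""
    out = []
    for value, symbol in ROMAN:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def rename_chrom(line):
    """Rewrite 'chr<number>' in the first BED column as 'chr<Roman numeral>'.

    Fields are split on any whitespace and rejoined with tabs.
    Blank lines are returned unchanged.
    """
    fields = line.split()
    if not fields:
        return line
    match = re.fullmatch(r"chr(\d+)", fields[0])
    if match:
        fields[0] = "chr" + to_roman(int(match.group(1)))
    return "\t".join(fields)


def convert_bed(in_path, out_path):
    """Apply rename_chrom to every line of in_path and write the result to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(rename_chrom(line.rstrip("\n")) + "\n")
```

Calling `convert_bed("phd1.bed", "phd1_roman.bed")` (both file names are placeholders) would write a copy of the file with the renamed chromosomes, which can then be uploaded to Galaxy. It is worth checking the chromosome names in the genome build actually selected in Galaxy before converting, since naming conventions differ between builds.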

ADD COMMENT • link 12.1 years ago by joshualevipayne ▴ 70