I've made a .bed file for the ChIP-exo data for transcription factor Phd1 published by Rhee & Pugh (2011) Cell.
I'm trying to do something extremely simple: fetch the corresponding sequences from the S. cerevisiae genome (June 2008) using the Galaxy server. The first 11 lines (out of 967) of the .bed file are:
chr1 8 102
chr1 20874 20968
chr1 190042 190136
chr1 190477 190571
chr1 191398 191492
chr1 230073 230167
chr2 165355 165449
chr2 165685 165779
chr2 376055 376149
chr2 376173 376267
chr2 378953 379047
When I feed this into Galaxy, it tells me: "967 warnings, 1st is: Unable to fetch the sequence from '8' to '94' for chrom 'chr1'. Skipped 967 invalid lines, 1st is #1, 'chr1 8 102'"
I have no idea where this error is coming from. Why would it try to fetch a sequence from locations 8 to 94, when the first line specifies the locations 8 to 102? Bizarre.
Any ideas?
Thanks, Josh
Maybe the chromosome name in the FASTA file is something other than 'chr1'?
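One way to test this idea is to compare the chromosome names in the .bed file against the sequence names in the genome FASTA. The sketch below is only an illustration: the function names are made up, and the FASTA headers in the example (Roman numerals such as chrI) are an assumption about what the reference might look like, not something confirmed in this thread.

```python
def bed_chroms(bed_lines):
    """Collect the chromosome names from the first column of BED lines."""
    names = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def fasta_chroms(fasta_lines):
    """Collect sequence names: the first word after '>' on each header line."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def missing_chroms(bed_lines, fasta_lines):
    """Return BED chromosome names that have no matching FASTA sequence."""
    return sorted(bed_chroms(bed_lines) - fasta_chroms(fasta_lines))


# Two lines from the question, and hypothetical FASTA headers (an assumption).
bed = ["chr1 8 102", "chr2 165355 165449"]
fasta = [">chrI", "ACGT", ">chrII", "ACGT"]
print(missing_chroms(bed, fasta))  # ['chr1', 'chr2']
```

If the returned list is not empty, every interval on those chromosomes would fail to fetch, which would be consistent with all 967 lines being skipped.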
Good catch, though I must say that error message was as misleading as it gets. The OP should make sure that the extracted intervals do indeed span the right range!
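On the puzzling '94' in the message: for the first line, 102 - 8 = 94, so the number may be the interval length rather than the end coordinate. That is a guess about how the message is formatted, not something verified against the Galaxy source. A quick sketch (the function name is made up) that computes end minus start for the lines posted above:

```python
def interval_lengths(bed_lines):
    """Return end - start for each BED line that has at least three columns."""
    lengths = []
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        start, end = int(fields[1]), int(fields[2])
        lengths.append(end - start)
    return lengths


# The lines posted in the question.
bed = [
    "chr1 8 102",
    "chr1 20874 20968",
    "chr1 190042 190136",
    "chr1 190477 190571",
    "chr1 191398 191492",
    "chr1 230073 230167",
    "chr2 165355 165449",
    "chr2 165685 165779",
    "chr2 376055 376149",
    "chr2 376173 376267",
    "chr2 378953 379047",
]
print(set(interval_lengths(bed)))  # {94}
```

Every posted interval is 94 bases long, which matches the number in the warning. Once the fetch works, comparing each returned sequence length against these values is one way to confirm the right range was extracted.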
You are right about that, I just wanted to see if the OP had made sure of it ;). By the way, is there a wiki or faq tag here, so that some sort of FAQ can eventually be produced?
That is a good idea. I have been thinking about the best way to create a series of posts that should be required reading and would solve recurring problems. It could be a single thread of posts tagged as faq.
Hi joshualevipayne:
I faced a similar problem and would like to ask you: are just the three columns you show enough to get the result? Is there no need for a transcript ID, gene ID, or anything else?