4.3 years ago
JR
I have identified some regions of interest in my genome and I want to extract the gene annotations that fall within those regions.
First, I converted my initial GFF3 file into a BED file like this:
Scaffold_1 1451750 1458451 ID=ANN00021
Scaffold_1 3553514 3558618 ID=ANN00054
Scaffold_2 4024794 4032517 ID=ANN00058
And I have another BED file with the genome regions:
Scaffold_1 133745072 133845072
Scaffold_1 133854352 133954352
Scaffold_1 133806326 133906326
Scaffold_1 133912327 134012327
Scaffold_2 64167277 64267277
I have tried bedtools, but I don't think it can handle my question.
I would appreciate any suggestions. Thanks!
Can you also tell us exactly what you tried and how it failed? I'd recommend you take a look at bedops - the manual helps visualize your expectations very well: https://bedops.readthedocs.io/en/latest/content/overview.html#about-bedops
Thank you for suggesting bedops, it sounds good. I will try to solve it with that.
You just need a little Python: for each annotation, check whether range start <= annotation start <= range end, and do the same for the annotation end. Or just build a range and ask whether the position is in it. If you are going to do this sort of thing regularly, you need to learn Python.
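A minimal sketch of that check, assuming whitespace-separated BED lines like the ones in the question and assuming "falls into" means the annotation lies entirely inside a region. The function names `parse_bed` and `annotations_in_regions` are made up for illustration, and the two `EXAMPLE` annotations are invented test rows, not real data from the post.

```python
from collections import defaultdict


def parse_bed(lines):
    """Group BED intervals by scaffold: {scaffold: [(start, end, name), ...]}."""
    by_chrom = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else ""
        by_chrom[chrom].append((start, end, name))
    return by_chrom


def annotations_in_regions(annotations, regions):
    """Return (scaffold, start, end, name) for annotations fully inside any region."""
    hits = []
    for chrom, anns in annotations.items():
        for a_start, a_end, name in anns:
            for r_start, r_end, _ in regions.get(chrom, []):
                # Assumption: both ends of the annotation must be inside the region.
                if r_start <= a_start and a_end <= r_end:
                    hits.append((chrom, a_start, a_end, name))
                    break  # report each annotation only once
    return hits


annotation_lines = [
    "Scaffold_1 1451750 1458451 ID=ANN00021",       # from the question
    "Scaffold_1 133750000 133760000 ID=EXAMPLE1",   # invented row, inside a region
    "Scaffold_2 64170000 64300000 ID=EXAMPLE2",     # invented row, overhangs the region end
]
region_lines = [
    "Scaffold_1 133745072 133845072",
    "Scaffold_2 64167277 64267277",
]

hits = annotations_in_regions(parse_bed(annotation_lines), parse_bed(region_lines))
for hit in hits:
    print(*hit, sep="\t")
```

If partial overlaps should also count, the condition could be swapped for `a_start < r_end and r_start < a_end`. The nested loop is fine for small files; for whole-genome annotation sets a dedicated interval tool will likely be much faster.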