I have a BED file with 13k SNPs and want to intersect it with a GFF file containing all the annotated genes for this species. My aim is to find which of my SNPs fall within genes or other annotated features. Does anyone have suggestions on how to go about this? I'd really appreciate any pointers.
Here's my GFF file:
head Aradu.Araip_v02.gff | column -t
Aradu.A01 gene 17735860 17739171 ID=Aradu.B2QWP;Name=Aradu.B2QWP;Note=uncharacterized protein LOC100797259 isoform X4 [Glycine max]%3B IPR004332 (Transposase%2C MuDR%2C plant)
Aradu.A01 mRNA 17735860 17739171 ID=Aradu.B2QWP.1;Parent=Aradu.B2QWP;Name=Aradu.B2QWP.1
Aradu.A01 exon 17735860 17736516 ID=Aradu.B2QWP:exon:0;Parent=Aradu.B2QWP.1
Aradu.A01 exon 17737010 17737038 ID=Aradu.B2QWP:exon:1;Parent=Aradu.B2QWP.1
Aradu.A01 exon 17737236 17737802 ID=Aradu.B2QWP:exon:2;Parent=Aradu.B2QWP.1
Aradu.A01 exon 17738326 17738425 ID=Aradu.B2QWP:exon:3;Parent=Aradu.B2QWP.1
Aradu.A01 exon 17738558 17738633 ID=Aradu.B2QWP:exon:4;Parent=Aradu.B2QWP.1
Aradu.A01 exon 17738752 17738962 ID=Aradu.B2QWP:exon:5;Parent=Aradu.B2QWP.1
Aradu.A01 exon 17739071 17739171 ID=Aradu.B2QWP:exon:6;Parent=Aradu.B2QWP.1
Here's my BED file with SNPs:
head Allsnps.bed | column -t
chrom pos pos Marker conversion.type
Aradu.A01 230496 230496 AX-147207636 Polyhigh
Aradu.A01 230616 230616 AX-147207637 Other
Aradu.A01 231463 231463 AX-147207638 Polyhigh
Aradu.A01 253683 253683 AX-147207640 Other
Aradu.A01 390660 390660 AX-147207661 Nominor
Aradu.A01 405426 405426 AX-147207663 Polyhigh
Aradu.A01 413869 413869 AX-147207665 Polyhigh
Aradu.A01 461328 461328 AX-147207676 Nominor
Aradu.A01 785293 785293 AX-147207749 Nominor
Thanks,
Paul
Thank you @Petr! I used bedtools and it worked just fine. I agree with you that it is nice to learn new ways of doing something, and writing your own scripts for a job is definitely good practice for much-needed skills.
I will give you more direct feedback a little later about how well the script worked. Thanks again!
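For anyone who finds this thread and wants to try the scripting route, here is a minimal Python sketch of the kind of lookup involved. It is not Petr's script and not how bedtools works internally. It assumes the GFF is standard 9-column, tab-separated GFF3 (the `column -t` view above shows fewer columns), that SNP positions are 1-based like the GFF coordinates, and that the SNP file has a header line. The function names, the bucket size, the `.` placeholder columns and the `demo-snp` marker are all made up for illustration.

```python
from collections import defaultdict

BIN_SIZE = 100_000  # arbitrary bucket width, chosen only for illustration


def parse_gff_line(line):
    """Return (seqid, type, start, end, attributes), or None for comments/blanks.

    Assumes the standard 9 tab-separated GFF3 columns.
    """
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    return cols[0], cols[2], int(cols[3]), int(cols[4]), cols[8]


def parse_snp_line(line):
    """Return (chrom, pos, marker), or None for the header or short lines."""
    cols = line.split()
    if len(cols) < 4 or not cols[1].isdigit():
        return None
    return cols[0], int(cols[1]), cols[3]


def build_index(features):
    """Bucket features by (seqid, bin) so each SNP only checks nearby features."""
    index = defaultdict(list)
    for seqid, ftype, start, end, attrs in features:
        for b in range(start // BIN_SIZE, end // BIN_SIZE + 1):
            index[(seqid, b)].append((start, end, ftype, attrs))
    return index


def features_at(index, seqid, pos):
    """Return all features whose inclusive [start, end] range contains pos."""
    bucket = index.get((seqid, pos // BIN_SIZE), [])
    return [f for f in bucket if f[0] <= pos <= f[1]]


def annotate(index, snps):
    """Yield (marker, feature_type, attributes) for every SNP/feature overlap."""
    for chrom, pos, marker in snps:
        for _start, _end, ftype, attrs in features_at(index, chrom, pos):
            yield marker, ftype, attrs


# Two features from the GFF above, with "." placeholders for the columns
# that are not visible in the post (source, score, strand, phase).
gff_lines = [
    "Aradu.A01\t.\tgene\t17735860\t17739171\t.\t.\t.\tID=Aradu.B2QWP",
    "Aradu.A01\t.\texon\t17737010\t17737038\t.\t.\t.\tID=Aradu.B2QWP:exon:1",
]
snp_lines = [
    "chrom pos pos Marker conversion.type",
    "Aradu.A01 230496 230496 AX-147207636 Polyhigh",
    "Aradu.A01 17737020 17737020 demo-snp Polyhigh",  # hypothetical SNP
]

index = build_index(filter(None, map(parse_gff_line, gff_lines)))
hits = list(annotate(index, filter(None, map(parse_snp_line, snp_lines))))
for marker, ftype, attrs in hits:
    print(marker, ftype, attrs, sep="\t")
```

With real files you would read the lines from `Aradu.Araip_v02.gff` and `Allsnps.bed` instead of the inline lists. A SNP inside an exon is reported once for each feature that contains it (gene, mRNA, exon), so filter on the feature type if you only want gene-level hits.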