How to programmatically run BLAST against Ensembl and obtain GFF3 annotations for my own FASTA sequence
9.9 years ago
juanma_lace ▴ 20

Hi,

I'm using Python and I have a portion of the wheat genome in FASTA format (nucleotides).

I want to run BLAST against Ensembl and obtain gene annotations in GFF3 format programmatically, so that I can use them in a custom pipeline.

Any ideas?

Thank you in advance

EDITED:

My plan is:

  1. Run BLAST with my own sequences against Ensembl
  2. Download the GFF files from Ensembl for the genome I'm blasting against
  3. Use the BLAST results to translate the GFF coordinates onto my own FASTA data
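
Step 3 can be sketched roughly as follows. This is a minimal illustration, not a tested pipeline: it assumes the hits come from BLAST tabular output with the default 12 columns (`-outfmt 6`), that each alignment is ungapped so a simple linear offset is valid, and that only features lying entirely inside a hit are transferred. The names `Hit`, `parse_hit` and `project_feature` are hypothetical helpers introduced for this example.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """Subset of one BLAST tabular (-outfmt 6) record."""
    query: str
    subject: str
    qstart: int
    qend: int
    sstart: int
    send: int


def parse_hit(line: str) -> Hit:
    """Parse one line of BLAST tabular output with the default 12 columns."""
    f = line.rstrip("\n").split("\t")
    # Columns: qseqid sseqid pident length mismatch gapopen
    #          qstart qend sstart send evalue bitscore
    return Hit(f[0], f[1], int(f[6]), int(f[7]), int(f[8]), int(f[9]))


def project_feature(gff_line: str, hit: Hit) -> Optional[str]:
    """Move a GFF3 feature from reference (subject) to query coordinates.

    Returns None if the feature is on another sequence or is not fully
    covered by the hit. Assumes an ungapped alignment (linear offset).
    """
    cols = gff_line.rstrip("\n").split("\t")
    if len(cols) != 9 or cols[0] != hit.subject:
        return None
    start, end = int(cols[3]), int(cols[4])
    s_lo, s_hi = sorted((hit.sstart, hit.send))
    if start < s_lo or end > s_hi:
        return None
    if hit.sstart <= hit.send:
        # Query and subject run in the same direction.
        new_start = hit.qstart + (start - hit.sstart)
        new_end = hit.qstart + (end - hit.sstart)
        strand = cols[6]
    else:
        # Subject is reverse-complemented relative to the query:
        # coordinates run backwards and the strand flips.
        new_start = hit.qstart + (hit.sstart - end)
        new_end = hit.qstart + (hit.sstart - start)
        strand = {"+": "-", "-": "+"}.get(cols[6], cols[6])
    cols[0], cols[3], cols[4], cols[6] = (
        hit.query, str(new_start), str(new_end), strand)
    return "\t".join(cols)
```

For gapped alignments, a linear offset drifts after every indel, so a real pipeline would need to walk the alignment itself rather than rely on the start and end columns alone.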
gff3 python blast ensembl • 4.4k views

Welcome to BioStar! We'd be happy to help, but it looks like you need to take a bit more time to clarify what you're doing. GFF3 is a format that can encode a wide variety of genomic features. From your question it is not clear which features you are looking for or how BLAST/Ensembl would help you identify them.


Thanks, I've edited the question

9.9 years ago

Identifying genes in eukaryotes is a bit more complicated than simply aligning proteins with BLAST. If you have a particular set of reference proteins or ESTs, you can splice-align them to your sequence (with programs like GeneSeqer and GenomeThreader) to get a good first approximation of gene structure.

You can also use gene predictors like SNAP or AUGUSTUS to predict genes. These do not require any protein or transcript sequences, but they typically produce a substantial number of false positives.
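
As a rough illustration of driving a predictor from Python, here is a minimal sketch that calls AUGUSTUS through `subprocess` and captures its GFF3 output. It assumes the `augustus` executable is installed and on your `PATH`, and that a `wheat` parameter set is available in your installation (check with `augustus --species=help`). The function names and file names are hypothetical.

```python
import subprocess
from typing import List


def augustus_command(fasta: str, species: str = "wheat") -> List[str]:
    """Build the AUGUSTUS command line for ab initio prediction.

    The species name is an assumption; it must match a parameter set
    that exists in the local AUGUSTUS installation.
    """
    return ["augustus", f"--species={species}", "--gff3=on", fasta]


def run_augustus(fasta: str, out_gff: str, species: str = "wheat") -> None:
    """Run AUGUSTUS and write its GFF3 predictions (stdout) to out_gff."""
    with open(out_gff, "w") as handle:
        subprocess.run(augustus_command(fasta, species),
                       stdout=handle, check=True)
```

Calling `run_augustus("region.fasta", "region.gff3")` would write the predictions to `region.gff3`, already in the coordinates of your own sequence. Expect false positives, as noted above.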

Tools like Maker and EVM combine these two approaches (spliced alignment and ab initio gene prediction), and produce much more reliable annotations. However, they are a pain to set up and more complicated to run.

Sorry there is no easy answer to this question!


I see, thank you for your answer. Upvoted
