Does anybody have suggestions for integrating gene ontology databases or BLAST/RefSeq into bioinformatics pipelines? My goal is to automatically search my output (filtered by impact RefSeq transcript IDs and/or HUGO gene IDs) against a gene ontology resource such as DAVID or RefSeq, then automatically fetch the results to display in a clinical report. I was thinking of building a web-scraping script, since there doesn't seem to be much API support for this sort of workflow.
Why scrape? All of those databases are available for download.
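As a rough illustration of the download route, here is a minimal Python sketch that looks up GO terms for a list of gene symbols in a locally downloaded Gene Ontology annotation file (GAF format: tab-separated, comment lines start with `!`, gene symbol in column 3, GO ID in column 5, aspect in column 9). The function name `load_go_annotations`, the helper `_row`, and the inline sample rows are assumptions made up for the example; in a real pipeline you would open the downloaded file instead of the in-memory sample.

```python
import io
from collections import defaultdict


def _row(symbol, go_id, aspect):
    """Build one illustrative 17-column GAF line (placeholder values elsewhere)."""
    cols = ["UniProtKB", "X00000", symbol, "", go_id, "PMID:0", "IEA", "",
            aspect, "", "", "protein", "taxon:9606", "20200101", "demo", "", ""]
    return "\t".join(cols) + "\n"


# Stand-in for a downloaded GAF file; these rows are illustrative only.
SAMPLE_GAF = (
    "!gaf-version: 2.2\n"
    + _row("TP53", "GO:0003677", "F")
    + _row("TP53", "GO:0006915", "P")
    + _row("BRCA1", "GO:0006281", "P")
)


def load_go_annotations(handle, wanted_symbols):
    """Return {symbol: set of (GO ID, aspect)} for the requested gene symbols."""
    wanted = set(wanted_symbols)
    annotations = defaultdict(set)
    for line in handle:
        if line.startswith("!"):      # header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:           # skip malformed rows
            continue
        symbol, go_id, aspect = fields[2], fields[4], fields[8]
        if symbol in wanted:
            annotations[symbol].add((go_id, aspect))
    return dict(annotations)


if __name__ == "__main__":
    hits = load_go_annotations(io.StringIO(SAMPLE_GAF), ["TP53"])
    for symbol, terms in hits.items():
        for go_id, aspect in sorted(terms):
            print(symbol, go_id, aspect)
```

The resulting dictionary can be passed straight to whatever templating step builds the report, and the lookup runs offline, so the pipeline doesn't break when a website changes its layout.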