Guidance for communicating with an online search engine from a Python or R script?
6.7 years ago
boaty ▴ 220

Hi guys,

I'm looking for Python or R packages that can communicate with the RepeatMasker online service from a local script. Yes, I can run RepeatMasker locally, but I'd rather not trouble myself with database updates.

I am a lazy person, so instead of uploading a FASTA file and downloading the result manually, I am wondering whether there is a way to send my request to RepeatMasker with its parameters and get the results back automatically. I know Biopython can do this for BLAST because BLAST has an API, but it cannot do so for RepeatMasker.

Thanks

python RNA-Seq online search engine repeatMasker • 1.6k views

I am a lazy person

That took courage :-)


Sorry, I tried to use a metaphor from the programmer community.

6.0 years ago
boaty ▴ 220

Yes, I am answering the question I asked 8 months ago. I wanted to write a script that automates the online search and links it to the FISH probe design program of the FISH-quant tools, so that our biologists, who only have Windows, can run the probe design script by themselves.

The main Python tool I used is Selenium, an excellent browser automation tool with bindings for Python and Java. I also used Katalon Recorder, a Firefox plugin that records your actions while you navigate a site and exports them as code, so you can copy and paste it directly into your script to reproduce the same actions. Of course, the Firefox Inspector is needed to understand the web pages.

Here's my script to upload FASTA sequences from a .fasta file, search for repeat regions online with the cross_match option, and finally get the resulting .masked file.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
import sys
import time

driver = None   # defined up front so the finally block is safe if Firefox fails to start
try:
    fasta_path = "your/fasta/file"          # your local FASTA file
    with open(fasta_path, 'r') as handle:
        seqs = handle.read()

    driver = webdriver.Firefox()         # open Firefox
    driver.get("http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker")        # go to RepeatMasker

    driver.find_element(By.NAME, "sequence").send_keys(seqs)
    # choose the cross_match search engine (third input after the 'Search Engine' label)
    driver.find_element(By.XPATH, "(.//*[normalize-space(text()) and normalize-space(.)='Search Engine'])[1]/following::input[3]").click()
    driver.find_element(By.NAME, "submit").click()

    # poll for results, waiting a little longer on each attempt
    still = True
    i = 1
    while i < 9 and still:   # stop after 8 attempts or as soon as results are found
        print("waiting " + str(i * 30) + " seconds...")
        time.sleep(30 * i)     # wait for the page to load and the results to appear
        try:
            driver.find_element(By.PARTIAL_LINK_TEXT, ".masked").click()    # if there are masked results, open them
        except NoSuchElementException:
            # queued request: follow the link to the results page
            if driver.find_element(By.TAG_NAME, 'h2').text == 'Request Queued':
                try:
                    driver.find_element(By.PARTIAL_LINK_TEXT, '.html').click()
                except Exception as e:
                    print(e)
            # no results found
            if "No repetitive sequences" in driver.find_element(By.TAG_NAME, 'pre').text:
                sys.exit("no repetitive sequences were detected")
        else:
            still = False
        driver.refresh()
        print("page refreshed")
        i += 1
    content = driver.find_element(By.TAG_NAME, 'pre').text  # text of the masked sequences

    # write result to file
    with open("seqs.masked", 'w') as out:
        out.write(content)

except Exception as e:
    print(e)

finally:
    if driver is not None:
        driver.quit()   # close the browser and end the WebDriver session

Selenium + Katalon Recorder: a very strong combination!
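
The waiting logic in the script (sleep 30 s, then 60 s, and so on, checking for results each time) can be pulled out into a small helper so that it is easier to test without a browser. This is only a minimal sketch: `poll_until`, `check`, `max_attempts`, `base_delay` and `sleep` are hypothetical names that are not part of Selenium or RepeatMasker, and the 30-second step is simply taken from the script above.

```python
import time


def poll_until(check, max_attempts=8, base_delay=30, sleep=time.sleep):
    """Call check() after increasingly long waits until it returns a value.

    Before attempt n the helper waits base_delay * n seconds (30 s, 60 s, ...).
    It returns the first result that is not None, or None if every attempt
    came back empty.
    """
    for attempt in range(1, max_attempts + 1):
        sleep(base_delay * attempt)
        result = check()
        if result is not None:
            return result
    return None


# Demo with a fake check and a fake sleep, so nothing actually waits:
# the "results" only show up on the third attempt.
answers = iter([None, None, "masked sequence"])
waits = []
print(poll_until(lambda: next(answers), sleep=waits.append))  # masked sequence
print(waits)  # [30, 60, 90]
```

In the real script, `check` would be a function that looks for the `.masked` link with the driver and returns the page text, or `None` when the link is not there yet; `sleep` would be left at its default.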
