How can I combine UniProt's random and limit functionality?
10.4 years ago
arronslacey ▴ 320

Hi - thanks to some help on here I am getting used to querying UniProt. My question is about how to use both the "random" and "limit" options in the same query.

For example, I have:

http://www.uniprot.org/uniprot/?query=reviewed:yes+AND+organism:9606+AND+annotation:(type:transmem)&format=fasta&random=yes&limit=10

With this query I am trying to get some transmembrane proteins, randomize the order in which they appear, and then take the first 10. I would expect to see different(!) proteins each time I run the query if the random flag is in effect. However, I obtain the same 10 proteins each time, so it seems the random flag is being ignored. Maybe this isn't what it's used for and I have it wrong.

Can I use the random and limit flags together in this way?

EDIT

Following this thread and using Elisabeth's answer, I have taken the UniProt query and wrapped it in a little R script. The result is similar to Pierre's answer in that thread; however, my campus firewall doesn't allow me to connect via MySQL. Here's the script:

library(XML)
library(httr)
suppressPackageStartupMessages(library("methods"))

# With random=yes, each request returns the HTML page of one random matching entry
search.term <- "reviewed:yes+AND+organism:9606+AND+annotation:(type:transmem)&random=yes"
url.name <- paste0("http://www.uniprot.org/uniprot/?query=", search.term)

for (i in 1:10) {
  # Fetch the page as text
  url.content <- content(GET(url.name), as = "text")
  # Take the first link on the page whose href mentions 'fasta'
  links <- xpathSApply(htmlParse(url.content, asText = TRUE), "//a[contains(@href, 'fasta')]", xmlGetAttr, "href")
  fasta_link <- paste0("http://www.uniprot.org", links[1])
  # mode = "a" appends each sequence to the same output file
  download.file(fasta_link, "myseqs.fasta", quiet = FALSE, mode = "a")
}

This downloads 10 transmembrane sequences chosen at random. I haven't quite worked out how to do this without replacement yet, but will update when I have. The download.file "mode" argument is set to "a" (append) because I wanted to collect all sequences in one file.
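
One possible way to sample without replacement, as a rough sketch rather than a tested solution: get the full list of matching accessions once, then draw 10 distinct entries locally. The helper names `pick_random_accessions` and `fasta_url` are made up for illustration, the `ACC…` identifiers are placeholders rather than real accessions, and the per-entry URL pattern `http://www.uniprot.org/uniprot/<accession>.fasta` is an assumption that should be checked against the current UniProt site.

```r
# Sketch: draw n distinct accessions from a candidate list (no repeats).
# `accessions` is assumed to be a character vector obtained beforehand,
# e.g. by downloading the complete result list for the query once.
pick_random_accessions <- function(accessions, n = 10) {
  candidates <- unique(accessions)  # guard against duplicate IDs
  if (n > length(candidates)) {
    stop("requested more entries than the query returned")
  }
  # sample.int() draws without replacement by default
  candidates[sample.int(length(candidates), n)]
}

# Hypothetical helper: build a per-entry FASTA URL (URL pattern is assumed).
fasta_url <- function(accession) {
  paste0("http://www.uniprot.org/uniprot/", accession, ".fasta")
}

# Placeholder IDs stand in for a real accession list here.
all_ids <- paste0("ACC", 1:50)
chosen <- pick_random_accessions(all_ids, 10)
urls <- fasta_url(chosen)
```

Each element of `urls` could then be passed to download.file with mode = "a", as in the script above, so that no protein is fetched twice.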

Cheers.

uniprot • 2.3k views
10.4 years ago

The &random=yes flag was designed to pick a single random entry from a query, and it works with the HTML format only, for interactive use. We can look into providing it for other formats as well.

You might want to give this tool a try: http://www.rocrooks.co.uk/biology/uniprot-random.php


Thanks Elisabeth, that would be really useful. In the meantime, I have used your answer from "Is it possible to download a random set of proteins? (fasta files)" and made a little R script which grabs the FASTA file from a random page. It can loop through the UniProt query as many times as you like.
