Hi - thanks to some help on here I am getting used to querying UniProt. My question is about how to use the "random" and "limit" options together in the same query.
For example, I have:
With this query I am trying to get some transmembrane proteins, randomize the order in which they appear, and then take the first 10. If the random flag does what I think it does, I would expect to see different(!) proteins each time I run the query. However, I get the same 10 proteins every time, so it seems the random flag is being ignored. Maybe this isn't what it's meant for and I have it wrong.
Can I use the random and limit flags together in this way?
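I can't say for sure how the server combines the two flags, so one way to sidestep the question is to do the shuffle-and-limit step locally: fetch a larger list of accessions once, then randomize it in R. A minimal sketch, assuming the accessions are already in a character vector (the `accessions` values and the `pick_random` helper are made up for illustration, they are not real query results):

```r
# Hypothetical helper: shuffle a vector of IDs locally and keep the first n.
pick_random <- function(ids, n = 10) {
  ids <- unique(ids)                        # guard against repeated IDs
  n <- min(n, length(ids))                  # cannot return more than we have
  shuffled <- ids[sample.int(length(ids))]  # random permutation of the IDs
  head(shuffled, n)                         # "limit" step: first n after shuffling
}

# Placeholder accessions, NOT real UniProt results.
accessions <- sprintf("ACC%03d", 1:50)
picked <- pick_random(accessions, 10)
print(picked)
```

Because the shuffle happens on my side, each run should give a different set of 10 unless a seed is fixed with `set.seed()`.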
Following this thread and Elisabeth's answer, I have taken the UniProt query and wrapped it in a little R script. The result is similar to Pierre's answer in that thread, but my campus firewall doesn't allow me to connect via MySQL. Here's the script:
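library(httr); library(XML)  # httr: content(); XML: htmlParse(), xpathSApply(), xmlGetAttr()
for (i in 1:10) {
  # url.get is the response object for the UniProt query (the request line is not shown in this post)
  url.content = content(url.get, as = "text")
  links <- xpathSApply(htmlParse(url.content), "//a[contains(@href, 'fasta')]", xmlGetAttr, "href")
  # fasta_link is built from the extracted links (that line is not shown in this post)
  download.file(fasta_link, "myseqs.fasta", quiet = FALSE, mode = "a")  # "a" = append to one file
}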
This downloads 10 transmembrane sequences chosen at random. I haven't quite worked out how to do this without replacement yet, but will update when I have. The download.file "mode" argument is set to "a" (append) because I wanted to collect all the sequences in one file.
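On the "without replacement" point, one possible approach is to remember which links have already been seen and keep drawing until there are enough distinct ones. A rough sketch, where `fetch_one` is a stand-in for whatever returns one random FASTA link per call (faked here with `sample()` over made-up links so the snippet runs offline), and `draw_unique` and `max_tries` are names I made up:

```r
# Hypothetical helper: call fetch_one() repeatedly until n distinct values
# have been collected, or give up after max_tries calls.
draw_unique <- function(fetch_one, n = 10, max_tries = 1000) {
  seen <- character(0)
  tries <- 0
  while (length(seen) < n && tries < max_tries) {
    tries <- tries + 1
    candidate <- fetch_one()
    if (!(candidate %in% seen)) {
      seen <- c(seen, candidate)  # keep only links not drawn before
    }
  }
  seen
}

# Stand-in for the real request: returns one made-up link per call.
fake_pool <- sprintf("/uniprot/ACC%03d.fasta", 1:30)
fetch_one <- function() sample(fake_pool, 1)

unique_links <- draw_unique(fetch_one, n = 10)
print(unique_links)
```

Each entry in `unique_links` could then be turned into a full URL and passed to download.file with mode "a", as in the script above. The `max_tries` cap is only there so the loop cannot run forever if fewer than `n` distinct links exist.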
Thanks Elisabeth, that would be really useful. In the meantime, I have used your answer from "Is it possible to download a random set of proteins? (fasta files)" and made a little R script which grabs the FASTA file from a random page. You can loop through the UniProt query as many times as you like.