Is anyone familiar with downloading data from MG-RAST? I have more than 100 metagenome IDs that I need to download efficiently. I found this link at MG-RAST (http://api.metagenomics.anl.gov/1/api.html#download) but couldn't manage to download those 100 metagenomes using their IDs (e.g. 4441908.3). I don't want to download them one by one with individual IDs, as it would take ages.
That was very helpful. I have just gone through the MG-RAST manual but didn't find much information about the download stages or "file=xxx/stage=xxx". As you mentioned, file=425.1 is for predicted rRNA. Do you know which file number I should use for the raw, originally submitted metagenome FASTA sequences? I tried "file=100.2" but I'm not sure if it is right.
Hey, I'm not sure you can get access to the raw data through the API; however, I think file=100.2 contains the reads/contigs that passed quality filtering. There's probably also a file that contains the reads/contigs that didn't pass QC, so you could combine those if you really wanted them. You could always ask on the MG-RAST mailing list.
Thanks. I have now decided to go with the reads that passed the QC filtering and dereplication stages.
Hi, thanks for these details about how to download data from the MG-RAST API. Did you add your webkey to access data that is not public yet? Or were these public metagenomes? I tried adding my webkey, but I just get a summary of the file info (bp_count etc.), and I'm unable to download the FASTA file. Thank you!
Katrine
To download the raw data (the data uploaded by the MG-RAST user as input), use file=050.1. Alternatively, you can check for yourself how the download address is constructed by inspecting the "Download" button element and the URL it leads to.
Example: on the download page "http://www.mg-rast.org/mgmain.html?mgpage=download&metagenome=mgm4549958.3/MG_RAST_sub/BLANES_2010_cDNA_SURFACE_0.8.3__ILLUMINA.fna", go to "Processing step" -> "0. Upload" and inspect the download button element on the right. Here it is "http://api-ui.mg-rast.org/download/mgm4549958.3?file=050.1".
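The URL pattern in that example (base address, then the "mgm"-prefixed ID, then a file= parameter) can be generated for any ID. Below is a minimal Python sketch of that pattern only. The function names and the default file ID are my own choices for illustration, and the base address is simply copied from the example above, so it may need changing if the API host differs.

```python
from urllib.parse import quote, urlencode

# Base address copied from the example in this thread; treat it as an assumption.
API_BASE = "http://api-ui.mg-rast.org/download"


def normalize_id(metagenome_id: str) -> str:
    """Return the ID with the 'mgm' prefix used in the download URLs."""
    mg_id = metagenome_id.strip()
    return mg_id if mg_id.startswith("mgm") else "mgm" + mg_id


def download_url(metagenome_id: str, file_id: str = "050.1", base: str = API_BASE) -> str:
    """Build <base>/<mgmID>?file=<file_id> (hypothetical helper, not part of any MG-RAST library)."""
    query = urlencode({"file": file_id})
    return f"{base}/{quote(normalize_id(metagenome_id))}?{query}"


print(download_url("mgm4549958.3"))
print(download_url("4441908.3", file_id="100.2"))
```

The first call reproduces the address found behind the download button; the second shows the same pattern for a bare numeric ID and a different file number.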
see example: http://metagenomics.anl.gov/metagenomics.cgi?page=DownloadMetagenome&metagenome=4447943.3
http://api.metagenomics.anl.gov/1/download/mgm4447943.3
ref: http://adina-howe.readthedocs.io/en/latest/mgrast/index.html
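Requesting the download address without a file= parameter seems to return a summary of the available files rather than the sequences themselves, which may be what was seen above. Here is a rough Python sketch for turning such a summary into a stage-to-file-ID lookup. It assumes the response is JSON with a "data" list whose entries carry "file_id" and "stage_name" keys; that shape is an assumption on my part and is not confirmed anywhere in this thread, so check it against a real response first. The sample data is hand-written, not real API output.

```python
import json
import urllib.request


def fetch_listing(url: str) -> dict:
    """Fetch and decode the file summary for one metagenome (assumes a JSON response)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def files_by_stage(listing: dict) -> dict:
    """Map each stage name to the file IDs offered for it (assumed keys: data, stage_name, file_id)."""
    stages = {}
    for entry in listing.get("data", []):
        stage = entry.get("stage_name", "unknown")
        stages.setdefault(stage, []).append(entry.get("file_id"))
    return stages


# Hypothetical sample in the assumed shape; the second stage name is a placeholder.
sample = json.loads(
    '{"data": [{"file_id": "050.1", "stage_name": "upload"},'
    ' {"file_id": "100.2", "stage_name": "example_stage"}]}'
)
print(files_by_stage(sample))
```

With a real response, something like files_by_stage(fetch_listing("http://api.metagenomics.anl.gov/1/download/mgm4447943.3")) would show which file= values exist for that metagenome, provided the assumed keys match.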
Hello, what language is this, apart from curl? I have a Windows computer and am using curl through the Command Prompt, and would like to do something similar to what you described.
It's Bash. I presume you could get Bash working on Windows with Cygwin or something but I haven't used Windows since XP so I can't really say. Alternatively, it probably doesn't take much effort to make a simple while read line loop with something that works on Windows by default like Python (?) or Java (?). If you plan to do lots of bioinformatics in the future, I suggest you ditch Windows for Linux or OS X.
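For anyone who wants the Python route on Windows, here is a minimal sketch of such a read-a-line-and-download loop using only the standard library. It is an illustration under assumptions, not a tested MG-RAST client: the base address is the one quoted earlier in this thread, the ID file name and the output naming are made up, and it only covers public data (no webkey handling).

```python
import shutil
import urllib.request
from pathlib import Path

# Base address as quoted in this thread; adjust if the API host has changed.
API_BASE = "http://api.metagenomics.anl.gov/1/download"


def read_ids(lines):
    """Yield 'mgm'-prefixed IDs, skipping blank lines and lines starting with '#'."""
    for line in lines:
        mg_id = line.strip()
        if mg_id and not mg_id.startswith("#"):
            yield mg_id if mg_id.startswith("mgm") else "mgm" + mg_id


def fetch(mg_id, file_id, out_dir=".", base=API_BASE):
    """Stream one file to disk; the output name is an arbitrary choice with no extension assumed."""
    url = f"{base}/{mg_id}?file={file_id}"
    target = Path(out_dir) / f"{mg_id}_{file_id}"
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


def download_all(id_file, file_id="050.1", out_dir="."):
    """Download the chosen file for every ID listed (one per line) in id_file."""
    with open(id_file) as handle:
        for mg_id in read_ids(handle):
            try:
                print("saved", fetch(mg_id, file_id, out_dir))
            except OSError as err:  # urllib's URLError is a subclass of OSError
                print("failed", mg_id, err)
```

To use it, put one ID per line in a text file (say ids.txt, a hypothetical name) and call download_all("ids.txt", file_id="100.2"). Failures are reported and skipped so that one bad ID does not stop the remaining downloads.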
Update: I downloaded Git Bash for Windows, and I think I am having success using Bash and curl with the command you listed. I am not very experienced with Bash yet, so I haven't added any echo tests to see if the script is working the way I think it is. I do agree with you that Windows is a hassle, but there are often workarounds. I will continue down this line for the people who use Windows and cannot afford to, or do not want to, switch operating systems.