Can I always use Galaxy?
9.5 years ago
zizigolu ★ 4.3k

Hey guys,

I started learning genome profiling only a week ago, and a few days ago I found out that I can run some of my analyses in Galaxy, independent of my OS. Does this mean I can carry out everything my supervisor suggests in Galaxy, without typing code at the command line?

galaxy sequencing • 2.7k views

Thank you

9.5 years ago
Dan D 7.4k

I'll summarize the linked thread by saying "not necessarily." Here are some questions you'll want to ask yourself (or your supervisor). If the answer is "yes," then it works in Galaxy's favor:

  • Is the analysis workflow rigid and constant?
  • Are the workflow results few in number and small in size?
  • Can we avoid making complex or nuanced decisions about the next step at every point in the analysis workflow?
  • Is there a Galaxy instance out there (there are others besides the main public one at usegalaxy.org) which has all of the tools that we need?

If Galaxy still seems appropriate after asking these questions, then it would likely be useful to peruse the public workflows in the "Workflows" tab to see if anyone's replicated all or part of what you want to do.

Lastly, if you have specific questions about Galaxy, it has its own Biostar instance where a lot of really helpful and knowledgeable people hang out.


Thanks a lot Dan,

Yes to the first two questions, but I don't know about questions 3 and 4. Anyway, Galaxy could at least save me from the stress I was experiencing while computing genome coverage with bedtools!

9.5 years ago

Galaxy is certainly a useful platform, and it's helped thousands of scientists run bioinformatics workflows of various types. I applaud its development team for the work they've done. So in general, the answer to your question is 'yes', depending on the workflows you're using.

That said, if you're planning to do a genomics project of moderate length (>2-3 months), I would strongly recommend that you learn to build and run your workflows* from a Linux/Unix command-line terminal (ideally using HPC resources, if they are available). This approach is faster and more flexible, and the skills will benefit your career. I can suggest specific skills to start learning if need be, and I'm sure the community on this forum will have useful comments to add.

If you don't have access to a high-performance compute node, one solution would be to use iPlant (http://www.iplantcollaborative.org/). iPlant's Atmosphere cloud service (http://www.iplantcollaborative.org/ci/atmosphere) allows you to launch a virtual machine instance, then upload and analyze your data in the cloud.

The learning curve can be steep, but it will pay dividends in the future.

*Version control (git, svn) is something I highly recommend as well, because it will save you a considerable amount of time down the road. Accounts on GitHub and Bitbucket (and others) are free and you can save your code there.


Thank you,

You know, I don't have access to large computing resources or enough time to learn Linux right now, so I think I should try iPlant too.


You're welcome. To clarify: iPlant launches Linux virtual machines, but you can access them using a program like VNC Viewer if you prefer a remote desktop (versus just a command-line interface).


Cheers R,


Hello Mr Taylor,

I went to http://www.ensembl.org/biomart/martview/

Suppose I need to get the 2000 bp of non-coding sequence upstream of the initiating ATG for my genes of interest. In the Attributes section, under Sequences, should I select the Flank (Gene) option or 5' UTR?

In the header information, gene information and transcript information sections, which options should I select?

I got confused because the result differs depending on which of these options I select, and I don't know which sequence is the right one. I need to predict the promoter with NNPP and confirm it with YASS, so retrieving exactly the right sequence is important.

Thank you


Hi Sarah:

Given those choices I would select 'flank', but I'm not sure which R package(s) you'd be using so it's hard to provide more specific advice. Bear in mind that you'll need to extract the region upstream of the ATG in a strand-specific way (different directions depending on whether the gene is on the "+" or "-" strand). GenomicRanges (here's a link to a tutorial) can do this for you, as can bedtools.
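To make the strand-specific point concrete, here is a minimal Python sketch of the coordinate logic. It is not how BioMart, GenomicRanges or bedtools implement it; the function names (`upstream_window`, `extract_upstream`) are hypothetical, and it assumes 0-based, half-open (BED-style) coordinates for the coding region and a chromosome held in memory as a plain string.

```python
# Hypothetical helpers illustrating strand-aware upstream extraction.
# Assumes 0-based, half-open coordinates given on the forward (reference) strand.

_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_window(cds_start, cds_end, strand, flank=2000, chrom_len=None):
    """Return (start, end) of the region upstream of the start codon.

    On "+", the ATG begins at cds_start, so upstream lies to the left.
    On "-", the ATG sits at the right-hand end of the coding region on the
    reference, so upstream lies to the right of cds_end.
    """
    if strand == "+":
        return max(0, cds_start - flank), cds_start
    if strand == "-":
        end = cds_end + flank
        if chrom_len is not None:
            end = min(chrom_len, end)  # clip at the chromosome end
        return cds_end, end
    raise ValueError("strand must be '+' or '-'")


def extract_upstream(chrom_seq, cds_start, cds_end, strand, flank=2000):
    """Slice the upstream region and orient it 5'->3' relative to the gene."""
    start, end = upstream_window(cds_start, cds_end, strand, flank, len(chrom_seq))
    region = chrom_seq[start:end]
    return reverse_complement(region) if strand == "-" else region


if __name__ == "__main__":
    toy = "ACGTACGTAC"  # toy chromosome, for illustration only
    print(extract_upstream(toy, 4, 8, "+", flank=3))
    print(extract_upstream(toy, 4, 8, "-", flank=3))
```

The key design point is that a "-" strand gene needs both a different window (to the right of the coding region) and a reverse complement, so that the returned sequence reads toward the ATG in the gene's own orientation. Note that this uses the coding start rather than the gene start; a gene-level flank would also include the 5' UTR.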

I hope this helps.

Taylor

PS: A minor point: it's a good idea to cross-post questions like this on the main Biostars board so you can get a faster answer and so other people can find it more quickly in the future.


Thank you,

I already asked this question on Biostars separately. I got the basic idea, but I still have some doubts about the details.

Sincerely yours
