Forum:Anyone Use R For Their Bioinformatics Work?
2
0
11.1 years ago
arronslacey ▴ 320

Hi everyone - I was just wondering whether anyone has strayed away from Perl/Python to do their bioinformatics work. I've had a play around with some packages in Bioconductor, but they seem far too focused on specific applied problems rather than being generalized tools. I love R and was wondering what its community is like in bioinformatics.

r • 4.7k views
3

I hear they have the R on computers now

2

Looking at the questions tagged r on Biostars will give you a rough idea.

2

This might be a better forum post as it is more of a discussion.

1

I don't think this is a good question at all (EDIT: "good", not "legitimate"), because it is not specific enough, asks for general, subjective opinions, and is argumentative. It is also a lazy one, because applications and people ("anyone using them") can easily be googled. If you use the search or just click on the tag, the result is 30 (!) pages of questions on Biostars. Maybe you could rephrase your question to be more specific?

0

Hi Michael - I suppose what I was asking was: outside of Bioconductor, where the R packages and algorithms appear very applied to me (as I am new to bioinformatics), are there more generic frameworks like Biopython/BioPerl, where the focus is on producing algorithms that query databases and build simple and statistical models such as profile HMMs/N-N? I found Sudeep's answer helpful, showing that I could list questions on Biostars by tag, and I am working with the NCBI2R package right now, which I find useful. I am in no way trying to disrespect the work in Bioconductor, but rather looking for an R equivalent of Biopython/BioPerl.

1

Bioconductor is that equivalent (and more, in a sense). You will find the infrastructure to do what you like within Bioconductor, though it may take a bit of devoted digging to find packages that you find most useful.

0

I think we should first try to improve your question and make it more specific. I don't think it is possible to answer your original question, because it is not obvious what you are in fact asking. Biostars is good at answering specific questions, not at hosting general discussions.

1

Perhaps folks don't agree fully with your sentiment regarding Bioconductor. Bioconductor sees over 10,000 unique IP address downloads per month (not counting any of the mirrors):

http://bioconductor.org/packages/stats/bioc/BiocInstaller.html

0

I am not doubting at all how successful it is. I have used Biopython in particular extensively, but after being introduced to R, making some good headway with stats analysis, and moving away from SPSS, I wanted to scope out how people use R in bioinformatics. I have just written R scripts that get statistics such as counts and histograms from FASTA files, and was curious about how others use it. At first glance, Bioconductor looks like a great place to share how a particular problem was solved rather than a "suite" for developing generic tools. As a beginner in both R and bioinformatics, the latter was more the thing I was looking for. The fact that the response to this question has been so good, which is after all what forums like this encourage, tells me to keep at it with Bioconductor.
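
For illustration only, the kind of script mentioned here (counts and length statistics from FASTA records) might look roughly like the base R sketch below. The records are made up, the variable names are hypothetical, and it assumes every header line is followed by at least one sequence line; it is not the script referred to above.

```r
# Made-up FASTA records for illustration. For a real file, something like
# fasta_lines <- readLines("path/to/file.fasta") would be used instead
# (the path is hypothetical).
fasta_lines <- c(">seq1", "ACGTAC", "GT", ">seq2", "GGGCCC")

is_header <- startsWith(fasta_lines, ">")
record_id <- cumsum(is_header)                 # record index for every line
ids <- sub("^>", "", fasta_lines[is_header])   # header text without the ">"

# Join the sequence lines belonging to each record into one string.
# Assumption: no record is empty, so the groups line up with the headers.
seqs <- vapply(split(fasta_lines[!is_header], record_id[!is_header]),
               paste, character(1), collapse = "")
names(seqs) <- ids

seq_lengths <- nchar(seqs)                     # length of each sequence
all_bases <- strsplit(paste(seqs, collapse = ""), "")[[1]]
base_counts <- table(all_bases)                # overall count of each base

print(seq_lengths)
print(base_counts)
# hist(seq_lengths) would then draw a histogram of the sequence lengths.
```

For anything beyond small files, the Biostrings package mentioned elsewhere in this thread is likely the more robust choice for reading and summarising sequences.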

4

Check out the GenomicRanges package for range-based infrastructure, GEOquery for accessing public genomics data, the annotation packages, biomaRt, Gviz (omics visualization), AnnotationHub, and Biostrings. For stats, the choices are even more extensive. The fact that Bioconductor provides both infrastructure and finished products for end-to-end analysis is a strength of the project.

0

Thanks Sean, very useful.

8
11.1 years ago
rmflight ▴ 90

Seriously? Why do you think those packages were written? To solve particular problems, yes. However, in many cases you will find that the packages that solve a specific problem depend on other, much more general packages. For example, if you work with genomic ranges and sequences, you can make use of the GenomicRanges and Biostrings packages, which do a great job of making it easy to manipulate these types of objects. If you work with microarray data, there are annotation packages for specific manufacturers' chips, packages for working with general spotted arrays, and packages for general manipulation of the raw data if you want to do that. For sequencing, there are packages that provide interfaces to BAM and SAM files, as well as all the packages that do different types of statistics on sequencing data. Bioconductor has even made inroads into mass spec data.

If people are not using R / Bioconductor for raw data processing because of data volume or memory requirements, they are definitely using it to analyze counts, values, etc., because of its statistical capabilities.

0

Oh yeah, I forgot to mention that you can also query a lot of public databases, including GEO, the UCSC Genome Browser, and BioMart, directly from R.

0

Not to mention dozens of "CRAN" (non-bioconductor) packages like seqinR, ADE4, vegan, abd, sequences...

6
11.1 years ago

There's an extremely high level of R/Bioconductor usage in bioinformatics, with some exceptions (e.g., I assume people working mostly on protein docking aren't doing that in R). Pretty much anytime you need to compute a whole lot of statistics (not uncommon in bioinformatics) you end up using R.
