6.6 years ago
Qingyang Xiao
Generally speaking, which programming language is better, R or Python? Personally, I use a Unix system on supercomputers plus R on my laptop, and I haven't run into much that felt uncomfortable. I use them for RNA-Seq: from fastq files to differentially expressed gene data, and from DE to pathway/co-expression network analysis.
So overall, what's the advantage of Python in these kinds of analyses?
There is no "better" programming language; it depends on your needs and your skills. Use the one you are comfortable with.
R is generally used for statistical and mathematical computation, and it also has some great packages for plotting your data.
Python is mostly used to handle text files such as fastq files; there are fewer packages than in R for plotting your data in Python (discussed below).
My way to go in RNA-seq is to use bash to align and count, then load the count file into R and do the statistics and plotting there.
I don't really need Python for this kind of analysis.
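To make the "text file handling" point concrete, here is a minimal sketch of reading fastq records in plain Python. It assumes the simple four-lines-per-record layout and Phred+33 quality encoding; the function names `read_fastq` and `mean_phred` are made up for this illustration, and a real pipeline would more likely rely on an established parser.

```python
import io
from statistics import mean


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from a text stream.

    Assumes exactly four lines per record (no wrapped sequences).
    """
    while True:
        header = handle.readline()
        if header == "":
            return  # end of file
        header = header.strip()
        if not header:
            continue  # tolerate blank lines between records
        seq = handle.readline().strip()
        plus = handle.readline().strip()
        qual = handle.readline().strip()
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], seq, qual


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (offset 33 is an assumption)."""
    return mean(ord(char) - offset for char in quality)


# Tiny in-memory example; with a real file, pass open("reads.fastq") instead.
example = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!!!\n"
records = list(read_fastq(io.StringIO(example)))
scores = {name: mean_phred(qual) for name, _, qual in records}
```

Here `scores` maps each read name to its mean quality, which is the sort of quick filtering statistic people often compute before alignment.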
Well, about that: matplotlib, seaborn, plotly, bokeh, altair... plenty of choices :)
I have to say, I think in future I'm going to switch to doing everything in Python. I used R to plot all my data, but with ggplot ported to Python and all the other options (especially plotly), I don't really see as compelling a reason to use it anymore (since I don't do much in the way of stats or modelling). Having to write 2 separate sets of code is silly really... not to mention I find R's dataframe structure much more confusing than pandas frames or simple arrays and matrices.
Agreed, but I was thinking about DESeq2 in an RNA-seq analysis, for example. Do you know a way to use it through Python? Maybe you can plot whatever you want with Python just as well as with R, but for statistical purposes R is still leading.
The community of differential expression analysis clearly chose R, and there is no way around that. R definitely is the only sensible choice here.
But for statistics in general, Python can do everything as well as R ;-)
Yeah, I agree. Personally, I'd like to see a DESeq implementation or similar in pure Python, but it's not the end of the world to have to dip into R for the one or two necessary analyses.
Though it probably wouldn't be that hard to write a python wrapper around the Rscript :P
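For what it's worth, such a wrapper could be as small as the sketch below. It assumes you already have an R script that runs DESeq2 and takes a count file, a sample sheet and an output path as positional arguments; the script name `run_deseq2.R`, the file names and the argument order are all hypothetical, and only `Rscript` itself and its `--vanilla` option are real.

```python
import shutil
import subprocess
from pathlib import Path


def build_rscript_command(script, counts, metadata, output):
    """Build the argument list for a (hypothetical) DESeq2 R script.

    The positional argument order is an assumption for this sketch;
    it has to match whatever the R script actually reads.
    """
    return ["Rscript", "--vanilla", str(script),
            str(counts), str(metadata), str(output)]


def run_deseq2(script, counts, metadata, output):
    """Run the R script and return the path of the results file."""
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript was not found on PATH")
    command = build_rscript_command(script, counts, metadata, output)
    # check=True raises CalledProcessError if R exits with an error
    subprocess.run(command, check=True)
    return Path(output)


# Only the command is built here, so nothing is executed on import.
command = build_rscript_command(
    "run_deseq2.R", "counts.tsv", "samples.tsv", "deseq2_results.csv"
)
```

Calling `run_deseq2(...)` with the same arguments would execute the script, after which the results file can be read back into Python for plotting.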
Casual reminder that we should be writing and thinking about graduating.
Shhhhh....
(My thesis doc is open on my other screen, so it's all under control haha)
The one thing that I've noticed in recent years is that many languages now 'talk' to each other via libraries or wrapper functions, which is of course a good thing. HTSlib was a great thing, for example, and R has interfaces with Python, Java, and probably others. The one language that may still like to keep to itself, though, is Perl (correct me if I'm wrong).
See also Ursa Labs, which is developing Apache Arrow, a shared data structure for (currently) C, C++, Go, Java, JavaScript, Python, Ruby and Rust.
Of course there are packages for that in Python, but I think the pool of packages is larger and better supported in R. I'm a huge fan of Python, but I switch to R to visualize my data :)
Sure, whatever makes you feel comfortable :-p
Haha, that quote. Is "a few weeks of tinkering" supposed to sound really sarcastic? HAHA
Maybe I will give it a try for my upcoming visualization :)
Belgian guy?
Me? No! Why? I did not get the joke, if there is one :)
No, the Jake VanderPlas dude. With a name like yours, you are French.
You replied to my comment, that's why I got confused. You're correct, I'm French; as for this guy, his Twitter says "Seattle".
French is also spoken in Belgium.
This question is pointless. It entirely depends on your application and preferences (both personal and those from the community).
So, Python vs R in general?
Oh you mean for RNA-seq analysis?
If you don't know for sure what you are asking, how should we answer? :-)
I would say Python, personally. I find it to be more widely used and more versatile.
Hi Qingyang Xiao,
Don't forget to follow up on your questions/threads.
If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.
Please do this for your previous posts as well.
Bioinformatics has no language. It is one of these polyglot beasts like web development. You've got to pick up multiple languages to proceed. Nonetheless, if I wanted to get to diffexes and coexes quickly, I'd use R.