Forum: Where should a biologist start when learning computational biology & systems biology
3.4 years ago
Saber ▴ 50

Hello,

I study molecular biology as an undergraduate. My graduate school goal is to study systems biology or computational biology. My knowledge of biology is reasonable, but that of computer science and mathematics is limited.

In terms of computer science and mathematics, where should I start, and what topics should I cover?

I would appreciate any course, book, or platform recommendations.

thanks

data-science systems-biology computational-biology • 3.3k views

Learning Python and R will be helpful; which one matters more will depend on your future research. I also learned MATLAB, since the package I need for my research is only available in MATLAB.

3.4 years ago

You should start by learning a programming language, for example Python or R.


And in case you don't know where to start, check out Rosalind, a platform for learning bioinformatics and programming through problem solving.
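
To give a feel for the kind of exercise you would meet early on, here is a minimal Python sketch of two beginner-level sequence tasks: counting nucleotides and building a reverse complement. The function names and the toy sequence are made up for illustration and are not taken from any specific Rosalind problem.

```python
from collections import Counter

# Complement of each DNA base, used to build the reverse complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def count_nucleotides(dna: str) -> dict:
    """Return how many times each of A, C, G and T occurs in the sequence."""
    counts = Counter(dna.upper())
    return {base: counts[base] for base in "ACGT"}


def reverse_complement(dna: str) -> str:
    """Complement every base and read the strand in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


# Hypothetical toy sequence, invented for this example.
toy_sequence = "ATGCGTTA"
print(count_nucleotides(toy_sequence))
print(reverse_complement(toy_sequence))
```

Small problems like these are a reasonable way to practise strings, loops, and dictionaries before moving on to real data.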

3.4 years ago
Badran ▴ 70

Recommendations

As a fellow biologist who started computational work at the graduate level, here are the most important things I think you will need to learn:

  1. Programming language: R or Python. Python is a general-purpose programming language that is used in many different domains. That being said, I would recommend you learn R first, because bioinformatics/genomic data analysis is very often done in R:

    • R has the Bioconductor suite of packages, which are open-source and designed for the analysis of all forms of -omics data. Python has some equivalents, but they are far from covering the whole suite.

    • In my opinion, R has better data science packages; the tidyverse and R's data frames are game changers compared with Python's NumPy, pandas, and seaborn.

    • You will learn Python anyway, sooner or later. A lot of good packages are written in Python, and you will have to use them at some point. Furthermore, if you intend to do any machine learning or deep learning, you will most likely do it in Python (Python is dominant in this regard, and R has yet to catch up).

  2. Linux Command Line/Bash Scripting: in RNA-Seq, for example, R is used for the analysis of the final count matrices generated from your raw data. However, generating those count matrices from the raw data requires dedicated tools, all of which work on the command line and are usually executed on a high-performance computing (HPC) cluster using bash scripts. That is why it is a good idea to gain confidence with the command line: using it to navigate your file system, list the contents of directories, create and delete files and folders, and move files and folders around.

  3. Statistics/Classical Machine Learning: It will definitely be useful to have the basics of statistics and classical machine learning algorithms down; these will help you understand papers better and are generally useful for your work.

  4. Personal Preference: There are 3 main operating systems: Windows, Linux, and macOS. GET MACOS.

    • Windows: has no native support for bash scripting, runs into compatibility issues, and is quite bad in my opinion. I was a Windows user for 6-7 years before I started switching, and I would say that it is the worst system I have used.
    • Linux: Virtually all of the packages are compatible with Linux because they were developed for Linux. Linux is open-source, so you are not paying any money, and it will teach you quite a bit about troubleshooting and interacting with your operating system. However, Linux has no support for Microsoft or Adobe, so you can't get them and that's a deal breaker. I used Linux for 6-8 months before giving up on it, and although I loved the system, I couldn't sink any more time into trying to get work done on it; it's a good system for programming and developing code, but anything else is dead on that system.
    • macOS: is Unix-based, so on the command line it behaves much like Linux; it looks nice, it runs really well, it is compatible with most packages and software, and it has excellent support for Microsoft and Adobe. I switched to it 2 months ago, and I am never going back. If you can't afford to buy a MacBook or a Mac mini with good specs, you can build a PC and install macOS on it (hackintosh).
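
Coming back to the command-line basics in point 2 (navigating, listing, creating, moving, and deleting files and folders), here is a rough sketch of those same operations written in Python, with the usual shell command noted next to each step. The folder and file names are hypothetical, and everything happens inside a temporary directory so it is safe to run on any of the three operating systems.

```python
import shutil
import tempfile
from pathlib import Path


def practice_file_operations() -> list:
    """Run basic file-system operations in a throwaway directory and return the final listing."""
    # Work inside a temporary directory so nothing outside it is touched.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"        # hypothetical folder name
        raw = project / "raw_data"
        results = project / "results"

        raw.mkdir(parents=True)                # shell: mkdir -p project/raw_data
        results.mkdir()                        # shell: mkdir project/results

        sample = raw / "sample1.txt"           # hypothetical file name
        sample.write_text("placeholder\n")     # shell: echo placeholder > sample1.txt
        scratch = raw / "scratch.txt"
        scratch.touch()                        # shell: touch scratch.txt

        # shell: mv raw_data/sample1.txt results/
        shutil.move(str(sample), str(results / sample.name))
        scratch.unlink()                       # shell: rm raw_data/scratch.txt
        raw.rmdir()                            # shell: rmdir raw_data (now empty)

        # shell: ls -R project
        return sorted(p.relative_to(project).as_posix() for p in project.rglob("*"))


print(practice_file_operations())
```

On a cluster you would normally type the shell commands directly or put them in a bash script; the Python version is only meant to show what each command does.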

Recommended Resources:

  • Udemy R course: Perfect to start you with R and its data structures.

  • R for Data Science: The golden manual of data science with R. Can't recommend it highly enough.

  • Introduction to Statistical Learning: This is a good textbook to teach the basics of machine learning theory with coding examples in R.

  • As you go past the basics and learn more and more, there will be no textbooks; you will have to read package tutorials and guides and read documentation, but you will learn how to do this effectively along the way as you progress through your computational journey.
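
As a taste of the statistics and classical machine learning basics mentioned in point 3, here is a minimal sketch of a least-squares straight-line fit in plain Python. The helper name fit_line and the dose/response numbers are invented for illustration; real analyses would normally use an established statistics package rather than hand-written code.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical toy data, invented for this example.
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(dose, response)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Being able to write something like this by hand once makes it much easier to understand what the fitting functions in R or Python are doing for you.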


"Windows: has no native support for bash scripting, runs into compatibility issues, and is quite bad in my opinion. I was a Windows user for 6-7 years before I started switching, and I would say that it is the worst system I have used."

This is just wrong, in many ways. WSL makes things simple for bioinformatics. I've yet to run into a Windows-specific compatibility issue. "Bad" here is pretty subjective, and I'm glad you found a system that works well for you, but it's perfectly possible to do bioinformatics/comp bio work on a Windows machine with few OS-specific issues.

Pick whichever OS you prefer - it's unlikely to be a real hindrance to your work, especially since most folks relegate computationally heavy jobs to cloud or institutional computing resources (clusters, etc).


I believe that the author was unaware of WSL, hence the statement "no native support for bash". Most likely they were not passing judgment on WSL in general.

In general Windows WSL is almost on par with Linux - the usability problems come from not being able to execute programs with graphics based output.

A better correction would be to state that a Windows-based system can be operated as a fully functional command-line Linux machine.


"In general Windows WSL is almost on par with Linux - the usability problems come from not being able to execute programs with graphics based output."

fwiw, this can also be circumvented in most cases by using a terminal with X11 forwarding (like MobaXterm, which works out of the box with WSL).


"GET MACOS."

Yeah, no.

I am really disheartened to see recommendations for a closed-source, walled-garden, ultra-capitalistic ecosystem on Biostars of all places. Modern Linux flavors are very newbie-friendly (especially Ubuntu), and there is absolutely no reason to force someone onto Apple's products. Especially not because of bioinformatics of all things.

"However, Linux has no support for Microsoft or Adobe, so you can't get them and that's a deal breaker."

What are these monolithic "Microsoft" and "Adobe" entities that you need? Office? Adobe Reader?!
