Forum: Pat Schloss of mothur uses this
10.1 years ago

Patrick Schloss is an Associate Professor in the Department of Microbiology and Immunology at the University of Michigan Medical School. He is the author of tools such as DOTUR, SONS and mothur, each an increasingly sophisticated program for analyzing microbial data. At present, mothur offers a complete package covering just about all the data analysis needs of the microbial ecology community.

As the names of the tools indicate, Dr. Schloss is a family man. We once enrolled a bioinformatics staff member in the mothur training program run by Dr. Schloss; amusingly, the check had to be made out to Five Kids Farm. Since then the farm has been renamed, first to Six Kids Farm and then to Seven Kids Farm, with the motto: "Growing adults since 2000". As it turns out, when not programming and running a scientific lab, Dr. Schloss runs a bona fide farm where he lives with his family, which now includes seven children. Thus, when not running make from the command line, Dr. Schloss helps make real food, rides a tractor, operates farm equipment, and raises sheep and cows.

Before I interviewed Dr. Schloss, mothur was a mystery to me. It is a radically different take on bioinformatics: a single executable with no other dependencies that runs from the command line on every platform (Linux, Mac and Windows), yet still exposes a vast range of functionality. Now I get it - mothur is the tool for people who can't afford to waste time. It dares to forgo the accumulated crud of "modern" software engineering and dependency management by being completely self-reliant and independently functional. Make no mistake: that is a gigantic undertaking, since it requires reimplementing all algorithms in simple, portable C++ code. But with that it becomes a tool like no other.

Dr. Schloss regularly offers his mothur workshop; the next will be held December 17-19 in Detroit.


Pat Schloss of mothur

How did you get started in bioinformatics?

My BS and PhD are both in biological engineering where my only course in programming was Pascal [stop laughing]. But that one language really helped me learn others. I think it's frequently seen that if you learn one programming/foreign language, it becomes easier to learn additional languages. My progression was Pascal -> Perl -> C++ -> R. Despite learning Pascal as an undergrad, I'm really a self-taught programmer.

When I was a postdoc I had about 500 16S rRNA gene sequences from a single 0.5-g sample of soil. We were working on a poster and my advisor said that it would be nice to have one of those "curvy thingies" (a rarefaction curve). I said, "sure…" Well, that simple problem required us to figure out how to cluster like sequences together and then how to do rarefaction. In hindsight, these are pretty simple problems. I had a problem in my mind and that was the critical piece. With that problem in mind, I could pick up "Learning Perl" and go through it thinking of the problem and where various parts of the language might fit in. I would literally do a chapter of the book and then do all of the exercises. I was pretty committed and got through the book in a few weeks. Aside from having the problem in mind, I also benefitted from actually coding as I was learning. This served to reinforce the concepts and help me memorize what was going on. To learn C++, someone took a different Perl script I was working on and rewrote it in C++. It was light years faster, and I used my Perl and their C++ to start to learn C++.

When I talk to biologists about learning to code I encourage them to follow that process:

  1. have a problem in mind;
  2. get a good book and go through it a chapter at a time;
  3. use the language as part of your daily life;
  4. find other people to help you learn.

Oh yeah, that "curvy thing"? Well, that turned into DOTUR, which has now been cited by more than 1,567 publications.
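For readers who have not met the "curvy thing" before: a rarefaction curve plots the expected number of distinct groups (OTUs) observed against the number of sequences sampled. The Python sketch below is only an illustration and is not taken from DOTUR or mothur. It assumes the sequences have already been clustered into OTUs, and it uses the hypergeometric expectation for sampling without replacement rather than repeated random subsampling. The function names and the example counts are hypothetical.

```python
from math import comb


def expected_richness(otu_counts, n):
    """Expected number of OTUs seen in a random subsample of n sequences.

    otu_counts holds the number of sequences assigned to each OTU.
    Sampling is without replacement, so an OTU with c sequences is
    missed with probability C(N - c, n) / C(N, n), where N is the
    total number of sequences.
    """
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total sequence count")
    all_samples = comb(total, n)
    return sum(1 - comb(total - c, n) / all_samples for c in otu_counts)


def rarefaction_curve(otu_counts, step=1):
    """Return (depth, expected richness) pairs from 0 up to the full sample."""
    total = sum(otu_counts)
    depths = list(range(0, total + 1, step))
    if depths[-1] != total:
        depths.append(total)  # always include the full sampling depth
    return [(n, expected_richness(otu_counts, n)) for n in depths]


if __name__ == "__main__":
    # Hypothetical OTU table: four OTUs with 5, 3, 1 and 1 sequences.
    for depth, richness in rarefaction_curve([5, 3, 1, 1], step=2):
        print(depth, round(richness, 3))
```

A curve that flattens as depth increases suggests most groups in the sample have been seen; one that is still climbing suggests more sequencing would turn up more groups.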

What hardware do you use?

I use a MacBook Pro laptop with an extra monitor. My lab also has a computer cluster that we use for heavy lifting.

What is your text editor?

I mainly use TextWrangler and have been slowly moving to Atom as I experiment with using markdown/R markdown for my writing. If I were a cool kid I suppose I'd learn emacs or vi. For our mothur work, we use Xcode.

What software do you use for your work?

Running a lab, there's a lot of boring stuff that I seem to use more than I'd like: Word, iCal, Mail, Adium, Safari... When I get to actually do science I use mothur, R, git, and the text editors.

What do you use to create plots and charts?

R. Four years ago I enforced a moratorium on the use of Excel to generate plots in the lab. I'm starting to enforce a moratorium on Prism. You can tell that someone generated plots in Excel/Prism by how awful they look, and generally you can tell they were made in R by how good they look.

What do you consider the best language to do bioinformatics with?

That's unfair! It depends. If the code is going out to the masses or needs to be fast, we do it in C++. If it's something I just need done, I'm doing it more and more in R and less and less in R.

I think waaay too much is made of programmer time and not enough is made of user time/frustration. Users do not want to hack your Perl or Python scripts or download a gazillion dependencies. It's just a pain - I don't even want to do it! If you multiply the time savings for the user by the number of papers citing mothur (>2000), you will more than make up for the developer time. I tell people that it doesn't really matter what language you use as long as you learn a language.

What bioinformatics tools/software do not get enough recognition?

I'm always thinking from the biologist's perspective and not from the computer scientist's perspective. With that in mind, I think biologists either don't know or don't appreciate the power of tools like git or literate programming tools like IPython notebooks or the knitr and slidify R packages.

Bioinformaticists could do a great amount of good by helping those developers and making these tools more accessible to biologists.


See all posts in this series: https://www.biostars.org/tag/uses-this/

To be notified of a new post in the series follow the first post: Jim Robinson of the Integrative Genomics Viewer (IGV) uses this


If it's something I just need done, I'm doing it more and more in R and less and less in R.


Ugh, the second R should have been Perl.


Amusingly, the reason I never caught that is that I always interpreted the sentence as something more profound: that we need to write a lot less R code and rely on what R does internally. For example, instead of writing lengthy loops that access and mutate elements in a vector, why not broadcast a function with apply, etc.?

Plus there is the concept of https://en.wikipedia.org/wiki/Ephemeralization that also popped into my mind.

Turns out it was just a typo. ;-) I still like the "more and more in R and less and less in R".


Thanks, because I didn't really get that.


Hi, I'd like to use the mothur page, but it does not work. I don't know what is happening. Could anybody tell me what page I can visit? Thanks a lot.

Deb!


It's down for me too - perhaps they're just rebooting their servers?

http://downforeveryoneorjustme.com/http://www.mothur.org/

EDIT: Also the "X" of "Y" uses this series was super cool. I really miss these.


It's been on and off for a few days. I was able to access the wiki once on Friday, but it has mostly been off otherwise.


Recent tweet about this from Pat Schloss


Even if his website wasn't down, what a crappy thing for someone to say :/


Why crappy? I read this as a self-deprecating reference to his own inability to balance work and life (possibly due to the wee one he's holding in the photo).


I interpreted it as that he fixed the issues and sacrificed his free time to do so.


He could have said "my work/life balance has been bad, ergo less time to fix things like this", but he went with "your work/life balance is good, ergo leave me alone". It's just an insensitive way of phrasing it. Perhaps I'm overly sensitive since my work-life balance this week has been really bad. I think I've slept 4 days in the last 2 weeks, due to an awful cold/cough; however, I have to teach a 1-week Python course at the university (sick), and finish up and hand in my thesis to my PI at my institute. So it's like from the moment I wake up to the moment I fall asleep it's work work work, and then this guy tells me I've been work-life balancing so my opinion on mothur doesn't count. If he'd phrased it the first way I'd have been extra sympathetic, but phrasing it this way was just kind of, I don't know, dismissive? Revisiting it now I'm realising just how irrelevant it is though :P


Please stop posting in this thread about the mothur site being down.

That is unrelated to the main thread.
