Newbie Rant here: Am I the only one who thinks that the learning curve for the NCBI databases and tools is unnecessarily steep?
I want to obtain and use the information, not take hours to locate it.
There...I feel better now!
Am I the only one who thinks that the learning curve for the NCBI databases and tools is unnecessarily steep?
Maybe? One of the skills a bioinformatician needs is on-demand learning. Focus on knowing what tools can do, not learning how to do everything before you start using them. NCBI is one of the easiest databases out there and its results are as clear as can be.
I want to obtain and use the information, not take hours to locate it.
What information? Bioinformatics databases are not service sites like Amazon or eBay, they are data sharing sites. We are not entitled to finding results relevant to us quickly. As time goes, people understand how these tools work and how to get what we need out of them the best way possible. If you have an idea on how to do something better, do it and we will support you. But remember, new tools don't always help:
Have you looked at the learning resources that NCBI has? There are plenty of YouTube videos if you prefer to learn that way.
Well, everybody wants to hit Enter and get the desired result :) Unfortunately (or fortunately?) it doesn't work like that; you have to put in some (sometimes substantial) effort.
Seems to me that there is a golden opportunity for the private sector to develop a software tool that does exactly this. I state what question I want to answer, give the data I have, and hit Enter. The program does the rest: it chooses the best search parameters based on the query and produces the most relevant output.
this has been the business model pushed by dozens of failed bioinformatics startups since 1996
They have all approached it the wrong way. What would really be a success is this: someone gives the software the answer they want and hits Enter. Then the software finds the question and the data to satisfy the answer.
We have that, it's called simulation software :-)
Computer, can you give me the ultimate answer to life the universe and everything?
This actually makes a really good point. Multiple tools exist to get tasks done, and we take the call on which of them to use based on context and experience. When such a complex problem is given to computers, especially when we do not have well established and agreed upon training sets, machines will take too long to produce an answer that makes sense to the algorithm but not to the research world in general, AKA 42.
Fun fact: the character with ASCII code 42 is *, or the "everything" wildcard.
You want the private sector to develop an AI bioinformatician? Because that's what you're describing - an entity that makes the right call on which methods to use, given the data and an idea of the kind of results you're looking for.
Yes, I think algorithms could do this. Sure, lots of work. But what a product! .gov is not going to have the incentive to do this.
You weren't lying, you are new :-)
Current AI technology cannot be trusted to do our job. They can't even be trusted to accurately sequence a genome, and that's just one machine doing one task.
Well said Ram! Full agreement
This is a complex issue, I think. Of course it sounds sweet, but this software would cost a lot (it solves important problems, saves time, and is very complex), while the majority of scientific organizations in the world (I don't mean top Western institutions) are poor :) And I guess that's why, as mentioned by @Jeremy, we see quite a limited number of really successful, big bioinformatics companies today (and most of them are quite recent because of the NGS revolution). We can keep discussing.
I've seen the 'products' of these companies and the analyses are invariably conducted incorrectly, with sloppy figures being produced, in addition. On top of this, they charge you both an arm and a leg.
Most of the bigger SaaS players (e.g. SBG, DNANexus) have taken a different approach than selling "we can make your analysis easy" to every PI and have made most of their money through partnerships either with government or pharma.
I partially agree. Learning the nuances of the differences between databases can take time, but for 'newbies' it's typically enough to get comfy with (t)BLAST(n/p) and probably PubMed.
It doesn't get much simpler than a search box and a database though (which is exactly how NCBI is built), so I'm afraid there is no option other than to suck it up and get stuck in!
Maybe that's the problem. The input part seems simple, but interpreting the output is onerous. I want to spend time on science, not the technology. But, unfortunately, the reality is that the scientists and the technicians both need to understand the other. A nice partnership if you can get it.
Is it?
Personally, I think the output of a BLAST search, for instance (I'd wager probably the single most used feature of NCBI's infrastructure, maybe after PMC), is incredibly intuitive. You even get a picture! And I'm not saying that as a now semi-seasoned bioinformatician/molecular biologist. I remember learning what BLAST was for the first time in my bachelor's degree as if it was yesterday, and it just 'clicked'. Granted, this is just a personal anecdote, and not everyone thinks the same way, but I really don't think you could ask for much more intuitive output.
My main criticism of the NCBI interface even now, is that the 'click through-y-ness' is opaque sometimes for sure. Trying to download all the genomes for a particular BioProject, or taxon or whatever, can require some pretty intimate understanding of how NCBI structures their data.
To be fair, the things you are listing are not simple queries/tasks. They inherently require clarity about what you are trying to achieve. Designing user interfaces (that are intuitive) is an art. I am sure NCBI keeps evolving those over time. Because NCBI is so large a repository, the regression testing they need to do to ensure that things don't break must be a task in itself. Since BLAST is their most popular tool, they do keep that in pretty good shape/current/fresh.
Oh I quite agree, but even something that one might expect to be a simple process (like getting a full genbank download properly!) can trip you up.
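For what it's worth, here is a minimal sketch of how the "all assemblies for a BioProject" lookup mentioned above might be started through NCBI's E-utilities `esearch` endpoint instead of clicking through the web interface. It only builds the query URL and does not contact NCBI. The accession `PRJNA000000` is a made-up placeholder, the helper name `build_esearch_url` is hypothetical, and the `[BioProject]` field tag is an assumption that should be checked against the search fields of the database you query.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=100):
    """Build an esearch URL (hypothetical helper; makes no network call)."""
    params = {
        "db": db,            # Entrez database to search, e.g. "assembly"
        "term": term,        # Entrez query string
        "retmax": retmax,    # maximum number of IDs to return
        "retmode": "json",   # ask for JSON instead of the default XML
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# PRJNA000000 is a placeholder accession; the [BioProject] field tag is an
# assumption about how the target database indexes BioProject links.
url = build_esearch_url("assembly", "PRJNA000000[BioProject]")
print(url)
```

Fetching that URL should return a list of record IDs, which would then need a follow-up `esummary` or `efetch` request to get at the actual data. That two-step dance is arguably a fair illustration of why this task feels less simple than it sounds.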
Same issue in wetlab work. The techniques are complex and a field in themselves, but not basic science.
That was my primary reason for choosing in silico: to get away from the burden of wetlab work. Unfortunately, I find it's as bad or worse. Interested in others' views on this.
So both wet lab and in silico work are too complex for you. Well, I have some bad news for you then.
Nonsense, this guy is executive material. He'll be my boss someday.
"Just build the pipeline already, man, and do your bioinformatic analysis."
"What my team has developed under my expert leadership is a comprehensive solution for data-driven predictive personalized medicine and cost savings. Big Data. IoT."
I'm going to be sick.
Don't worry we have an app for that, too.
Can you elaborate on the wet lab point? I'm a wet lab biologist (primarily in fact). If anything I think the wet lab has too much of the opposite problem. The techniques are so old and ingrained in a lot of cases, everyone just buys kits and black boxes everything - it's pretty much impossible to understand the basic science of every tiny thing.
But that's how it has to be. Science is too big now for everyone to be an expert in very much at all.
Yes, I was also primarily wet lab based before branching into bioinformatics. Science in the wet lab has become 'kit-based', and many of these kits are expensive and do not even work. Companies even sell them to you after the 'sell-by' date. So much money is wasted in research as a result of this, unnecessarily so. If companies actually tested their products properly, instead of just releasing their own curated white papers, then it might improve.
The interface and use of most NCBI resources is reasonably easy, and with a bit of reading and searching you can even become a "power user" rather quickly.
If you are talking about SRA and the SRA Toolkit, on the other hand, I wholeheartedly agree.
You forgot to add NCBI unix command line utils (to the list with SRA) :-)