Question

What Are The Obsolete Research Topics In Bioinformatics?

28

13.3 years ago

Sirus ▴ 820

Hi all,

It may seem strange to ask about obsolete research topics in bioinformatics, but I think it is helpful for a beginner or a new researcher to know which topics are obsolete. By obsolete I mean research topics that no longer attract much interest.

• 14k views

ADD COMMENT • link updated 13.3 years ago by Pals ★ 1.3k • written 13.3 years ago by Sirus ▴ 820

7

Whatever this year's list is, reappraise it in 5 years, when this year's obsolete list is big news.

ADD REPLY • link 13.3 years ago by Russh ★ 1.2k

2

Interest in topics that are not interesting is indeed strange ;)

ADD REPLY • link 13.3 years ago by Neilfws 49k

2

I think this should probably be community wiki; there is no right answer.

ADD REPLY • link 13.3 years ago by Mary 11k

score 14 · Answer 1 · 2011-09-03

14

13.3 years ago

Lars Juhl Jensen 11k

The first thing that comes to mind is protein secondary structure prediction. It seems pretty clear to me that there have been no big performance improvements in the last 5 years (if not 10). The field reached a plateau years ago, so developing a new secondary structure prediction method would seem an obsolete research topic.

ADD COMMENT • link 13.3 years ago by Lars Juhl Jensen 11k

5

On the other hand, there have been plenty of new tools to predict protein 'un'structure. DisProt, IUPred, RONN, etc. all predict (or catalogue) structural properties of proteins and are picking up quite a few citations. Admittedly it is the lack of secondary structure they are predicting, but isn't it kinda still the same?

ADD REPLY • link 13.3 years ago by Niallhaslam 2.3k

5

Actually, judging from the CASP (http://predictioncenter.org/) results of the last few years, there has definitely been an upward trend... I think the field is still somewhat alive. Not to forget the docking and interaction-prediction areas...

ADD REPLY • link 13.3 years ago by Benjamin Schuster-Boeckler ▴ 80

score 13 · Answer 2 · 2011-09-06

13

13.3 years ago

lh3 33k

Short-read (<50bp) alignment algorithms. We will not need to align short reads as sequence reads keep getting longer. It is interesting to see how this subfield became nearly saturated and then faded away in only two years. A few other NGS-related methods will end up like this. Algorithm development is data driven, while NGS is moving too fast.
A few years ago I did a brief review of clustering algorithms, which are mostly applied to microarray data and homology identification. My impression at the time was that we do not need a new microarray data clustering algorithm (homology clustering may still have some room). Nonetheless, I do not really work with microarray data, so I could be wrong on this point.

ADD COMMENT • link 13.3 years ago by lh3 33k

4

Reads < 50bp might still be used for microRNA libraries, for example, so these read lengths will remain in use. Nonetheless, I don't know whether there is a huge need for new algorithms for this particular kind of data.

ADD REPLY • link 13.3 years ago by Philippe ★ 1.9k

3

Short-read alignment can be useful for other applications as well, such as in primer design. But the current tools are sufficient.

ADD REPLY • link 13.3 years ago by lh3 33k

score 11 · Answer 3 · 2011-09-04

11

13.3 years ago

Lars Juhl Jensen 11k

A second unrelated obsolete research topic just occurred to me: development of improved microarray normalization methods. Mind you, I fully appreciate the importance of good normalization methods - they can make a huge difference in terms of what one can get out of a microarray study.

However, I consider it obsolete to develop new microarray normalization methods for two reasons:

1. The methods seem to have reached the limit of what is possible. For 10 years, methods have been able to correct for just about any systematic bias in microarray data, be that the amount of sample loaded, dye effects, print tip effects, or other spatial effects.
2. Microarrays are to a large extent being replaced by RNA-seq methods, which only adds to the obsoleteness of developing new methods to normalize the data.

ADD COMMENT • link 13.3 years ago by Lars Juhl Jensen 11k

5

I agree that the normalization problem is moot. I strongly disagree with "Microarrays are to a large extent being replaced by RNA-seq methods". This idea is perpetuated by software people and hardcore technophiles. While microarray software development is clearly being replaced by RNA-seq development, this hardly applies at the application level (where results count, not coolness). I would estimate the ratio of microarray to RNA-seq experiments to still be 1000:1. This might change, though. (Disclosure: I earn money analysing microarray data.)

ADD REPLY • link 13.3 years ago by Lyco ★ 2.3k

1

Agreed, microarrays are not being completely replaced by RNA-seq. Still, their importance has decreased due to the competition from RNA-seq :-)

ADD REPLY • link 13.3 years ago by Lars Juhl Jensen 11k

0

For those interested in this topic, here is a paper that might be useful: www.stat.berkeley.edu/tech-reports/800.pdf

ADD REPLY • link 13.2 years ago by Ying W ★ 4.3k

Ram · Answer 4 · 2011-09-04

6

13.3 years ago

Mary 11k

It might be an interesting exercise to check out the original Pedro's List (some of you will know what that is) and consider which tools have persisted, which evolved, and which have vanished. Of course, some of the vanished ones will have been funding-related and not topic-specific deaths.

ADD COMMENT • link 13.3 years ago by Mary 11k

6

Wow, a blast from the past :) Link: http://www.biophys.uni-duesseldorf.de/BioNet/Pedro/research_tools.html

ADD REPLY • link updated 5.2 years ago by Ram 44k • written 13.3 years ago by Neilfws 49k

0

Yes, I have been in this field this long....

ADD REPLY • link 13.3 years ago by Mary 11k

0

Old days remembered, and there are some Gopher site links on Pedro's page too. How about looking into some old bioinformatics books, like Methods in Enzymology Vol. 183 (1982) and 266 (1997)? The first book on bioinformatics I read, in the fall of 2000 when I joined a university course, was "Sequence Analysis in Molecular Biology: Treasure Trove or Trivial Pursuit" by Gunnar von Heijne (1987).

ADD REPLY • link 13.3 years ago by Woa ★ 2.9k

0

There is also the first NAR Database issue in the mid-90s, and some of the subsequent ones. Another blast from the past: NCBI in the mid-90s I found on the Wayback Machine once: http://blog.openhelix.eu/?p=2577 1997. Oy.

ADD REPLY • link 13.3 years ago by Mary 11k

score 5 · Answer 5 · 2011-09-08

What about methods for multiple testing correction in genome-wide association studies (GWAS)?

On one hand, a p-value of 5e-8 for "genome-wide significance" is the de facto standard for GWAS, based on some work showing that testing all common variants results in about 1 million effectively independent tests. Furthermore, if I'm following up findings in replication studies or in the lab, I'm going to take my top x% of genes/variants, based on whatever I can afford, not whatever the corrected p-value happens to be.

But on the other hand I have attended the IGES meeting for the last 5 years, and there are always several talks and many posters on new methodology for correcting p-values for multiple testing in GWAS.
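
The 5e-8 figure quoted above is what a Bonferroni-style correction gives for about 1 million effectively independent tests. A minimal sketch of that arithmetic follows; the family-wise alpha of 0.05 and the function name `bonferroni_threshold` are assumptions for illustration, not something stated in the answer.

```python
def bonferroni_threshold(alpha: float, n_tests: float) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Assumption: family-wise alpha of 0.05 (the conventional choice) and the
# ~1 million effectively independent tests mentioned in the answer.
threshold = bonferroni_threshold(0.05, 1_000_000)
print(f"{threshold:.0e}")
```

Under these assumptions the threshold comes out at the familiar genome-wide significance level, which is why a fixed cutoff is usually used instead of recomputing a correction per study.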

Ram · Answer 6 · 2011-09-08

4

13.2 years ago

Pierre Lindenbaum 164k

I don't think people are working on "protein music" anymore :-)

Ross D. King and Colin G. Angus
PM—Protein music
Comput Appl Biosci (1996) 12(3): 251-252 doi:10.1093/bioinformatics/12.3.251

http://bioinformatics.oxfordjournals.org/content/12/3/251.full.pdf+htm

"We present the program PM for the analysis of protein sequence information using audification..."

ADD COMMENT • link updated 5.2 years ago by Ram 44k • written 13.2 years ago by Pierre Lindenbaum 164k

1

There is a recent one from India here :) http://sites.google.com/site/achushome/other-interests#gene-music. You can download the track of Protein music in Kalyani there.

ADD REPLY • link updated 5.2 years ago by Ram 44k • written 13.2 years ago by Khader Shameer 18k

1

Well maybe obsolete for proteins, but not for genome alignments:

"ComposAlign was developed to sonify large scale genomic data. The resulting musical composition is based on Common Music and allows the mapping of genes to motives and species to instruments. It enables the researcher to listen to the musical representation of the genomewide alignment and contrasts a bioinformatician's sight-oriented work at the computer." (2009)

http://www2.bioinf.uni-leipzig.de/ComposAlign/examples/

ADD REPLY • link updated 5.2 years ago by Ram 44k • written 13.2 years ago by pmenzel ▴ 310

0

:-) .

ADD REPLY • link 13.2 years ago by Pierre Lindenbaum 164k

0

Sounds funny :) I wonder what the hemoglobin protein music will sound like.

ADD REPLY • link 13.2 years ago by Sirus ▴ 820

score 1 · Answer 7 · 2011-09-06

1

13.3 years ago

Casey Bergman 18k

Optimal pairwise global alignment, see Needleman and Wunsch (1970) for solution.
Optimal pairwise local alignment, see Smith and Waterman (1981) for solution.
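
For readers who have not seen them, both solutions are dynamic-programming recurrences over a score table. The sketch below computes only the optimal score (no traceback), with a linear gap penalty; the scoring values and the function name `alignment_score` are illustrative assumptions and are not taken from either paper.

```python
def alignment_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                    gap: int = -1, local: bool = False) -> int:
    """Optimal pairwise alignment score with a linear gap penalty.

    local=False: global alignment (Needleman-Wunsch style recurrence).
    local=True:  local alignment (Smith-Waterman style recurrence).
    """
    # prev holds the previous row of the dynamic-programming table.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                score = max(score, 0)  # a local alignment may restart anywhere
                best = max(best, score)
            curr.append(score)
        prev = curr
    # Global: score of aligning both full sequences. Local: best cell anywhere.
    return best if local else prev[-1]


print(alignment_score("ACGT", "AGT"))                         # global
print(alignment_score("TTTACGTTT", "GGACGGG", local=True))    # local
```

The table has len(a) × len(b) cells, so the running time is quadratic in sequence length, which is why heuristics and hardware acceleration come up as soon as the inputs get large.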

ADD COMMENT • link 13.3 years ago by Casey Bergman 18k

1

Maybe because finding optimal solutions for pairwise alignment is an obsolete topic... Since massive data sets arrived, "optimal" is now often replaced by "the lesser evil"!

ADD REPLY • link 13.2 years ago by Manu Prestat 4.1k

1

I remember hearing a couple of years ago about using specialized hardware (FPGAs/GPUs?) to accelerate these alignments so that they might become feasible. So some work might still be done on the implementation side of things, and it might possibly work for small genomes.

ADD REPLY • link 13.2 years ago by Ying W ★ 4.3k

score 1 · Answer 8 · 2011-09-16

I suppose that in-silico gene prediction (starting from a sequence) for prokaryotes will go the way of the dodo pretty quickly because:

- there are already a couple of tools which do the job "good enough"
- the ever-growing databases at EMBL and NCBI provide a good BLAST base, so assignment is done more and more on similarity
- RNA profiling is comparatively cheap nowadays and gives much better accuracy on the transcript boundaries, from which, with a bit of curation, one should be able to get pretty good gene boundaries

score 1 · Answer 9 · 2011-10-09

Protein secondary structure prediction using a 3-letter alphabet is pretty dead, but there may be some life left in developing more detailed views of local protein structure.
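
To make the "3-letter alphabet" concrete: predictors of this kind typically label each residue as helix (H), strand (E), or coil (C), usually by collapsing a finer assignment. The sketch below assumes DSSP-style 8-state codes and one common reduction; other reductions exist, and the mapping and the function name `reduce_to_three_states` are illustrative assumptions rather than anything specified in this answer.

```python
# Assumption: DSSP-style 8-state codes and one common reduction to 3 states.
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",  # helix types collapse to helix
    "E": "E", "B": "E",            # extended strand and isolated bridge collapse to strand
}


def reduce_to_three_states(states: str) -> str:
    """Collapse an 8-state string to H/E/C; any unmapped code becomes coil (C)."""
    return "".join(EIGHT_TO_THREE.get(s, "C") for s in states)


print(reduce_to_three_states("HGIEBTS-"))
```

The information thrown away in this reduction (turns, bends, helix subtypes) is part of what a more detailed view of local structure would try to keep.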

RNA secondary structure prediction still needs a lot of work, particularly for the more unusual RNA structures that involve things other than pairing on the Watson-Crick edge.

In-silico prediction of prokaryote genes is still far from a solved problem. People have mostly punted on the short proteins, and the RNA genes are still woefully underannotated. The archaeal gene predictions are particularly bad.

score 0 · Answer 10 · 2011-09-06

0

13.3 years ago

Woa ★ 2.9k

How about RNA secondary structure prediction? Is the topic still 'hot'?

ADD COMMENT • link 13.3 years ago by Woa ★ 2.9k

0

I think it's going to become a "hot" topic again thanks to the discovery of long non-coding RNAs.

ADD REPLY • link 13.3 years ago by Gww ★ 2.7k

0

That is a good example of the ups and downs in popularity. I agree with GWW: whenever someone discovers non-coding transcripts, this becomes interesting. I had a similar problem discussed here (of course, that doesn't automatically make it a hot topic ;) )

ADD REPLY • link 13.2 years ago by Michael 55k

0

It's not dead yet. Predicting RNA secondary structure of mRNAs has also received recent interest (including from my lab).

ADD REPLY • link 13.2 years ago by Qdjm 1.9k

0

And what about predicting pseudoknots? Or RNA-RNA hybridization? These are still alive.

ADD REPLY • link 12.8 years ago by Asaf 10k

score 0 · Answer 11 · 2012-02-04

De novo gene prediction and genome annotation based on conservation, GC content, HMMs, etc.

These kinds of methods were hot when sequencing ESTs and full-length transcripts was prohibitively expensive. Now you can just sequence the transcriptome and let the data tell you which regions of the genome are being transcribed and what the exon/intron structures of those transcripts are (both canonical and minor isoforms).

As RNA-seq and related assembly methods improve, annotating a new genome will become increasingly routine and automated. A lot of genome and cDNA 'finishing' tasks have already been eliminated by these new types of data and related analysis pipelines.

score 0 · Answer 12 · 2012-02-06

0

12.8 years ago

Pals ★ 1.3k

The dogma of disordered (regions of) proteins. It seems to be a very interesting and important topic in structural bioinformatics. Although there are a number of programs, I was unable to find a reliable one.

The importance of this area has been highlighted in Breaking the Protein Rules.

ADD COMMENT • link 12.8 years ago by Pals ★ 1.3k