Question

Herald:The Biostar Herald for Monday, November 04, 2024

5

5 months ago

Biostar 3.4k

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.

This edition of the Herald was brought to you by contributions from Istvan Albert and was edited by Istvan Albert.

x.com (x.com)

The importance of peer-reviewed bioinformatics methods: a short rant about a recent paper by some leading scientists in my field. Most scientists would agree that well-engineered computational methods are critically important in genomics... 1/8

submitted by: Istvan Albert

Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples | Genome Biology | Full Text (genomebiology.biomedcentral.com)

A recent study found severely inflated type I error rates for DESeq2 and edgeR, two dominant tools used for differential expression analysis of RNA-seq data. Here, we show that by properly addressing the outliers in the RNA-Seq data using winsorization, the type I error rate of DESeq2 and edgeR can be substantially reduced, and the power is comparable to Wilcoxon rank-sum test for large datasets. Therefore, as an alternative to Wilcoxon rank-sum test, they may still be applied for differential expression analysis of large RNA-Seq datasets.
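For readers unfamiliar with the term, winsorization caps extreme values at a chosen percentile instead of discarding them. The snippet below is only a minimal sketch of that general idea applied to one gene's counts; the function name `winsorize`, the `upper_quantile` parameter, the nearest-rank quantile rule, and the example counts are all illustrative assumptions and are not taken from the paper, which may define the percentile and the procedure differently.

```python
import math


def winsorize(values, upper_quantile=0.95):
    """Cap values above an empirical quantile at that quantile.

    Hypothetical helper for illustration only; not the paper's code.
    """
    if not values:
        return []
    ordered = sorted(values)
    # Nearest-rank quantile: the smallest observed value such that at
    # least `upper_quantile` of the data lies at or below it.
    rank = max(1, math.ceil(upper_quantile * len(ordered)))
    cap = ordered[rank - 1]
    # Values above the cap are replaced by the cap; nothing is removed.
    return [min(v, cap) for v in values]


# Made-up counts for a single gene across ten samples, with one outlier.
counts = [12, 15, 9, 14, 11, 13, 10, 16, 12, 950]
print(winsorize(counts, upper_quantile=0.9))
# prints [12, 15, 9, 14, 11, 13, 10, 16, 12, 16]
```

The outlier is pulled down to the largest non-extreme value while the sample size stays the same, which is why this is usually described as outlier replacement rather than outlier removal.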

Editor's note:

Three papers from Genome Biology:

The first claims that DESeq2 and edgeR suffer from inflated FDRs and that we should all be using Wilcoxon rank-sum tests instead
The second states that the inflated FDRs reported in the first paper are an artifact of incorrect data generation and that the Wilcoxon test is actually worse
The third states that we can fix the inflated FDRs reported in the first paper, bringing them in line with the Wilcoxon test, by applying winsorization - an outlier replacement strategy

... there you have it - even less clarity than before

submitted by: Istvan Albert

Exaggerated false positives by popular differential expression methods when analyzing human population samples | Genome Biology | Full Text (genomebiology.biomedcentral.com)

When identifying differentially expressed genes between two conditions using human population RNA-seq samples, we found a phenomenon by permutation analysis: two popular bioinformatics methods, DESeq2 and edgeR, have unexpectedly high false discovery rates. Expanding the analysis to limma-voom, NOISeq, dearseq, and Wilcoxon rank-sum test, we found that FDR control is often failed except for the Wilcoxon rank-sum test. Particularly, the actual FDRs of DESeq2 and edgeR sometimes exceed 20% when the target FDR is 5%. Based on these results, for population-level RNA-seq studies with large sample sizes, we recommend the Wilcoxon rank-sum test.
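The permutation idea in this abstract can be sketched roughly as follows: shuffle the condition labels so that no gene is truly differentially expressed, rerun the test, and see how often it still reports a hit. This is a simplified illustration, not the paper's pipeline. It counts raw p-values below a threshold rather than FDR-adjusted discoveries, it uses a normal approximation to the Wilcoxon rank-sum test without a tie correction to the variance, and the names `rank_sum_pvalue` and `permuted_hit_rate` as well as the toy data are assumptions made for this example.

```python
import math
import random


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation.

    Ties receive average ranks; the variance is not tie-corrected.
    """
    pooled = [(v, 0) for v in x] + [(v, 1) for v in y]
    pooled.sort(key=lambda p: p[0])
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    mean = n1 * (n + 1) / 2
    sd = math.sqrt(n1 * n2 * (n + 1) / 12)
    z = (rank_sum_x - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def permuted_hit_rate(expr, labels, alpha=0.05, n_perm=20, seed=0):
    """Fraction of gene-level tests with p < alpha after shuffling labels."""
    rng = random.Random(seed)
    hits = total = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break any real link between label and expression
        for gene in expr:
            a = [v for v, lab in zip(gene, shuffled) if lab == 0]
            b = [v for v, lab in zip(gene, shuffled) if lab == 1]
            hits += rank_sum_pvalue(a, b) < alpha
            total += 1
    return hits / total


# Toy data: 50 genes, 40 samples, two equally sized groups.
data_rng = random.Random(1)
expr = [[data_rng.gauss(10, 2) for _ in range(40)] for _ in range(50)]
labels = [0] * 20 + [1] * 20
print(permuted_hit_rate(expr, labels))
```

With a well-behaved test, the printed rate should sit near `alpha`; a rate far above it on permuted labels is the kind of symptom the abstract describes.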

submitted by: Istvan Albert

Neglecting the impact of normalization in semi-synthetic RNA-seq data simulations generates artificial false positives | Genome Biology | Full Text (genomebiology.biomedcentral.com)

A recent study reported exaggerated false positives by popular differential expression methods when analyzing large population samples. We reproduce the differential expression analysis simulation results and identify a caveat in the data generation process. Data not truly generated under the null hypothesis led to incorrect comparisons of benchmark methods. We provide corrected simulation results that demonstrate the good performance of dearseq and argue against the superiority of the Wilcoxon rank-sum test as suggested in the previous study.

submitted by: Istvan Albert

Using Bactopia with AllTheBacteria Assemblies - Bactopia (bactopia.github.io)

AllTheBacteria (ATB) is a collection of nearly 2,000,000 bacterial assemblies. In this post you'll learn how to use Bactopia to seamlessly analyze these assemblies with the available Bactopia Tools.

submitted by: Istvan Albert

herald • 982 views

ADD COMMENT • link updated 5 months ago by Dunois ★ 2.9k • written 5 months ago by Biostar 3.4k

4

Just FYI, anyone who doesn't have a Twitter/X login can no longer view x.com threads. It would be nice if everyone left that toxic platform, but in the meantime, it would also be nice if we could see the threads in some other backup form.

ADD REPLY • link 5 months ago by cmdcolin ★ 4.2k

1

it would be nice if everyone left that toxic platform but in the meantime,

The platform formerly known as Twitter is hardly appropriate for being linked to from a forum like this any longer given the readily apparent supremacist and regressive beliefs of the platform's owner and their supporters.

ADD REPLY • link 5 months ago by Dunois ★ 2.9k

0

What's wrong with using Twitter for bioinformatics? If you don't like someone's political views, just don't read their posts.

ADD REPLY • link 5 months ago by shelkmike ★ 1.5k

1

tl;dr Twitter is full of Nazis and other supremacists whose abhorrent views are being shoved down everyone's throats whether these other people want this or not.

If you don't like someone's political views, just don't read their posts.

This is no longer really possible on the platform formerly known as Twitter, given that supremacist and regressive broadcasts are forced into the feeds of all users, whether they want it or not. Please refer to relevant reporting here for example: https://fortune.com/2024/10/30/study-shows-elon-musk-tweets-pro-trump-appear-x-users-feeds-within-2-sessions/ .

I will also note that the "political opinions" you suggest one tolerate, as it pertains to the supremacist and regressive beliefs peddled by Twitter's owner and their supporters, are not (and never were) benign. What they collectively peddle is also anti-scientific, and seeks to encourage discriminating against individuals on the basis of (essentially harmless) traits such as gender and phenotypic makeup.

That is not the kind of discourse, nor set of outcomes, that people seeking to engage in the pursuit of science should seek to even inadvertently support. These are not "opinions" that should simply be ignored; they must instead be condemned and protested. Not participating in platforms and spaces hijacked to spread these supremacist and regressive beliefs is one form of condemnation and protest.

We should all be trying to come together, despite our differences (chosen or assigned at random) and not support those that seek to split us apart on the basis of such differences.

ADD REPLY • link 5 months ago by Dunois ★ 2.9k

0

The second states that the inflated FDRs reported in the first paper is an artifact of incorrect data generation and that the Wilcoxon test is actually worse

Is this the second paper: https://doi.org/10.1186/s13059-024-03231-9 ?

Also, is Winsorization here basically trading false positives for false negatives?

ADD REPLY • link 5 months ago by Dunois ★ 2.9k