Herald: The Biostar Herald for Wednesday, February 19, 2025
5 weeks ago
Biostar 3.4k

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.

This edition of the Herald was brought to you by contributions from Istvan Albert, and was edited by Istvan Albert.


@leviwaldron1.bsky.social on Bluesky (bsky.app)

Excited to share our MicrobiomeBenchmarkData preprint, addressing compositional data analysis for microbiome DA. This Bioconductor package features 3 datasets with biological ground truths. Spoiler: simple methods like LEfSe, Wilcoxon, RNA-seq methods performed well

submitted by: Istvan Albert


Commonly used compositional data analysis implementations are not advantageous in microbial differential abundance analyses benchmarked against biological ground truth | bioRxiv (www.biorxiv.org)

Previous benchmarking of differential abundance (DA) analysis methods in microbiome studies have employed synthetic data, simulations, and "real data" examples, but to the best of our knowledge, none have yet employed experimental data with known "ground truth" differential abundance. A key debate in the field centers on whether compositional methods are necessary for DA analysis, which is challenging to answer due to the lack of ground truth data. To address this gap, we created the Bioconductor data package MicrobiomeBenchmarkData, featuring three microbiome datasets with established biological ground truths: 1) diverse oral microbiomes from supragingival and subgingival plaques, expected to favor aerobic and anaerobic bacteria, respectively, 2) low-diversity microbiomes from healthy vaginas and bacterial vaginosis, conditions that have been well-characterized through cell culture and microscopy, and 3) a spike-in dataset with constant, known absolute abundances of three bacteria. We benchmarked 17 DA approaches and demonstrated that compositional DA methods are not beneficial but rather lack sensitivity, show increased variability in constant-abundance spike-ins, and, most surprisingly, more frequently produce paradoxical results with DA in the wrong direction for the low-diversity microbiome.
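To make the compositionality debate in the abstract concrete, here is a minimal Python sketch. It does not reproduce the preprint's benchmark or any of its 17 approaches. The counts are hypothetical, and the centered log-ratio (CLR) transform is used only as one well-known compositional transform; the function names and the pseudocount value are assumptions chosen for illustration.

```python
import math

def relative_abundance(counts):
    """Scale counts so that they sum to 1 (total-sum scaling)."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(counts, pseudocount=0.5):
    """Centered log-ratio: log of each value minus the mean of the logs.

    The pseudocount (an assumed value) avoids log(0) for absent taxa.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical absolute counts: [spike-in, taxon A, taxon B].
# The spike-in is held constant; only taxon A grows in sample 2.
sample_1 = [100, 400, 500]
sample_2 = [100, 1400, 500]

rel_1 = relative_abundance(sample_1)
rel_2 = relative_abundance(sample_2)
clr_1 = clr(sample_1)
clr_2 = clr(sample_2)

print("spike-in relative abundance:", rel_1[0], "->", rel_2[0])
print("spike-in CLR value:", round(clr_1[0], 3), "->", round(clr_2[0], 3))
```

In this toy case the spike-in count never changes, yet both its relative abundance and its CLR value drop in the second sample because another taxon grew. A spike-in dataset with known constant abundances, like the one described above, lets a benchmark check how a method handles that situation.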

submitted by: Istvan Albert


GitHub - rhpvorderman/sequali: Fast sequencing data quality metrics (github.com)

This tool should have been called fasterQC

In my evaluation of a 30GB gzipped FASTQ file with 370 million reads:

  • fastqc took 38 minutes
  • sequali took 6 minutes

The plots in sequali make a lot more sense as well.
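For a rough sense of scale, the timings above can be turned into a speedup and a reads-per-second figure. This is a small Python sketch of that arithmetic using only the numbers quoted in this note; the helper name is illustrative, and results on other hardware or files will differ.

```python
# Numbers quoted in the evaluation above (single run, one file).
READS = 370_000_000
FASTQC_MINUTES = 38
SEQUALI_MINUTES = 6

def reads_per_second(reads, minutes):
    """Average throughput over the whole run."""
    return reads / (minutes * 60)

speedup = FASTQC_MINUTES / SEQUALI_MINUTES
fastqc_rate = reads_per_second(READS, FASTQC_MINUTES)
sequali_rate = reads_per_second(READS, SEQUALI_MINUTES)

print(f"speedup: {speedup:.1f}x")
print(f"fastqc:  {fastqc_rate:,.0f} reads/s")
print(f"sequali: {sequali_rate:,.0f} reads/s")
```

That works out to roughly a 6.3x speedup on this one file, or about a million reads per second for sequali.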

submitted by: Istvan Albert


Sequali publication (academic.oup.com)

Sequali was developed to provide sequencing quality control for both short- and long-read sequencing technologies. It features adapter search, overrepresented sequence analysis, and duplication analysis and supports FASTQ and uBAM inputs. It is significantly faster than comparable sequencing quality control programs for both short- and long-read sequencing technologies.

submitted by: Istvan Albert


@robp.bsky.social on Bluesky (bsky.app)

Big update to our tool alevin-fry-atac, for preprocessing of scATAC-seq data. The main improvements center around optimizations to mapping speed, reaching almost 3x the speed of chromap at 32 threads. Some opts are general & can be backported to sshash!

Link to the paper: Alevin-fry-atac enables rapid and memory frugal mapping of single-cell ATAC-seq data using virtual colors for accurate genomic pseudoalignment

submitted by: Istvan Albert


Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription
