The Biostar Herald publishes user-submitted links relevant to bioinformatics. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Istvan Albert and manaswwm, and was edited by Istvan Albert.
Protein Map | Innophore (humanproteinmap.com)
Innophore's Human Protein Map helps you compare a multidimensional 3D representation of your drug-target site with over 400,000 binding sites in the human body.
submitted by: Istvan Albert
SeqFu docs · SeqFu - FASTX Sequence Utilities (telatin.github.io)
A general-purpose program to manipulate and parse information from FASTA/FASTQ files, supporting gzipped input files. Includes functions to interleave and de-interleave FASTQ files, to rename sequences, and to count and print statistics on sequence lengths. SeqFu is available for Linux and macOS.
submitted by: Istvan Albert
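For readers unfamiliar with the interleaving mentioned in the SeqFu entry above, here is a minimal Python sketch of the idea. It is not SeqFu's code or its command-line interface; the function names (fastq_records, interleave, deinterleave) are hypothetical, and it assumes plain four-line FASTQ records with no wrapped sequence lines.

```python
from itertools import islice, zip_longest
from typing import Iterable, Iterator, List, Tuple


def fastq_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of four lines (header, sequence, '+', quality).

    Assumption: every record occupies exactly four lines.
    """
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave(r1_lines: Iterable[str], r2_lines: Iterable[str]) -> Iterator[str]:
    """Alternate records from two paired inputs: R1 record, R2 record, and so on."""
    pairs = zip_longest(fastq_records(r1_lines), fastq_records(r2_lines))
    for rec1, rec2 in pairs:
        if rec1 is None or rec2 is None:
            raise ValueError("R1 and R2 contain different numbers of records")
        yield from rec1
        yield from rec2


def deinterleave(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split an interleaved stream back into separate R1 and R2 line lists."""
    r1: List[str] = []
    r2: List[str] = []
    for index, record in enumerate(fastq_records(lines)):
        # Even-numbered records belong to R1, odd-numbered records to R2.
        (r1 if index % 2 == 0 else r2).extend(record)
    return r1, r2
```

Because the functions accept any iterable of lines, gzipped input could be handled by passing file objects opened with gzip.open(path, "rt") from the Python standard library.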
@robp.bsky.social on Bluesky (bsky.app)
Primer: Counting is not easy: Assessing and quantifying uncertainty in abundance inferences from high-throughput sequencing data
Rob Patro University of Maryland - College Park -- Department of Computer Science and Center for Bioinformatics and Computational Biology
From gene-level read counts in RNA-seq analysis through species-level read counts in metagenomic analysis, count data are often treated as direct observations to be statistically modeled for downstream analyses (like differential testing). Yet, due to fundamental read-to-target ambiguity in the underlying data, direct counts can often not be observed. To help overcome this difficulty, methods have been developed which posit generative models in which the abundances of interest are key parameters, directly related to latent variables encoding read-to-target assignments. Much effort has been expended to make these models accurate and efficient for inference. Nonetheless, they often return point estimates (usually maximum likelihood or maximum a posteriori) where the degree of uncertainty can vary widely between different parameters, and the posterior distributions of these parameters can be correlated in complex ways. In this background talk, I will discuss the challenges posed by read-to-target ambiguity, generative models for abundance estimation developed to address these challenges, methods for statistical inference in these models, and methods for estimating and propagating quantification uncertainty in these models.
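To make the read-to-target ambiguity described above concrete, here is a toy expectation-maximization sketch in Python. It is an illustration under strong simplifying assumptions, not the method of any particular tool or of the talk: the function name em_abundance and the input encoding are hypothetical, target lengths are ignored, and every target compatible with a read is treated as equally likely a priori.

```python
from typing import List, Sequence


def em_abundance(compat: Sequence[Sequence[int]], n_targets: int,
                 n_iter: int = 100) -> List[float]:
    """Estimate relative abundances from ambiguous read-to-target assignments.

    compat[i] lists the target indices that read i is compatible with.
    Assumptions: each read has at least one compatible target, and
    target lengths are ignored.
    """
    theta = [1.0 / n_targets] * n_targets  # start from uniform abundances
    for _ in range(n_iter):
        expected = [0.0] * n_targets
        # E-step: split each read among its compatible targets,
        # in proportion to the current abundance estimates.
        for targets in compat:
            total = sum(theta[t] for t in targets)
            for t in targets:
                expected[t] += theta[t] / total
        # M-step: the new abundances are the normalized expected counts.
        theta = [count / len(compat) for count in expected]
    return theta


# Three reads unique to target 0, one unique to target 1, two ambiguous.
reads = [[0], [0], [0], [1], [0, 1], [0, 1]]
estimates = em_abundance(reads, n_targets=2)
```

The result is a single point estimate. It says nothing about how confident that estimate is, which is the gap that uncertainty quantification methods aim to fill.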
Meeting: Uncertainty-aware analysis of RNA-Seq data using a tree-based framework
Noor Pratap Singh University of Maryland -- Department of Computer Science and Center for Bioinformatics and Computational Biology
The length of a short read is typically much smaller than that of a spliced transcript, making it difficult to determine the true locus of origin in eukaryotic transcriptomes, especially since transcripts can share overlapping sequences. This ambiguity introduces uncertainty in the abundance estimation of certain transcripts, which in turn affects downstream analyses such as differential expression testing. To address these challenges, we introduce a data-driven tree-based framework that incorporates uncertainty into RNA-seq data analysis.
In the first part of the talk, I will discuss existing approaches for handling uncertainty and their limitations in RNA-seq data analysis before introducing TreeTerminus. TreeTerminus constructs a hierarchical, tree-like structure from a given set of RNA-seq samples, where leaf nodes represent individual transcripts and internal nodes correspond to aggregated transcript groups. As one ascends the tree, uncertainty decreases, providing a flexible framework for analyzing data at different levels of resolution, depending on the analysis of interest.
submitted by: Istvan Albert
Insufficient evidence for natural selection associated with the Black Death | Nature (www.nature.com)
A "Matters arising" response to a major study that claimed to identify loci associated with resistance to the plague in the London population.
submitted by: manaswwm
x.com (x.com)
Why is it so hard to rewrite a genome?
Excited to share this article and interviews by Michael Eisenstein (@Nature), which summarize the current state, challenges, and recent developments in whole-genome synthesis and bottom-up genome design.
https://www.nature.com/articles/d41586-025-00462-z
Synthetic biologists have the know-how and ambition to retool whole genomes. But the hidden complexity of biological systems continues to surprise them.
submitted by: Istvan Albert