The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Istvan Albert and was edited by Istvan Albert.
grepq: A Rust application that quickly filters FASTQ files by matching sequences to a set of regular expressions | bioRxiv (www.biorxiv.org)
grepq is a Rust application that quickly filters FASTQ files by matching sequences to a set of regular expressions. grepq is designed with a focus on performance and scalability, is easy to install and easy to use, enabling users to quickly filter large FASTQ files, and to update the order in which patterns are matched against sequences through an in-built tune command. grepq is open-source and available on GitHub and Crates.io.
submitted by: Istvan Albert
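For readers who want a feel for what "filtering FASTQ records by a set of regular expressions" means, here is a minimal Python sketch of the general idea. It is not grepq's implementation and does not reproduce its command-line interface or its tune command; the function names `read_fastq` and `filter_fastq` are made up for illustration, and the sketch assumes plain four-line FASTQ records.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# A FASTQ record as (header, sequence, separator, quality).
Record = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from an iterable of lines (assumes 4 lines per record)."""
    it = (line.rstrip("\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        try:
            seq, sep, qual = next(it), next(it), next(it)
        except StopIteration:
            raise ValueError("truncated FASTQ record") from None
        yield header, seq, sep, qual


def filter_fastq(records: Iterable[Record], patterns: List[str]) -> Iterator[Record]:
    """Keep records whose sequence matches at least one regular expression."""
    compiled = [re.compile(p) for p in patterns]
    for record in records:
        sequence = record[1]
        # any() stops at the first matching pattern, so the order of the
        # patterns affects how much work is done per record.
        if any(p.search(sequence) for p in compiled):
            yield record
```

Because `any()` short-circuits, putting the most frequently matching patterns first reduces the work per record, which is presumably the kind of saving that reordering patterns is meant to achieve.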
genescf (GeneSCF) · GitHub (github.com)
In this study, we focused on designing a command-line tool called GeneSCF (Gene Set Clustering based on Functional annotations) that can predict the functionally relevant biological information for a set of genes in a real-time updated manner. It is designed to handle information from more than 4000 organisms from freely available prominent functional databases like KEGG, Reactome and Gene Ontology. We successfully employed our tool on two published datasets to predict the biologically relevant functional information. The core features of this tool were tested on Linux machines without the need to install additional dependencies. Publication: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1250-z
submitted by: Istvan Albert
Steven Salzberg on Twitter (x.com)
Is this real, or is it excessive AI hype? New paper claims that an AI foundation model achieves experimental-level accuracy in predicting gene expression even in previously unseen cell types. I'll read the paper, but don't believe it for a second.
submitted by: Istvan Albert
GitHub - calico/borzoi: RNA-seq prediction with deep convolutional neural networks. (github.com)
Borzoi was trained on a large set of RNA-seq experiments from ENCODE and GTEx, as well as re-processed versions of the original Enformer training data (including ChIP-seq and DNase data from ENCODE, ATAC-seq data from CATlas, and CAGE data from FANTOM5).
submitted by: Istvan Albert
Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation | Nature Genetics (www.nature.com)
Here, we introduce Borzoi, a model that learns to predict cell-type-specific and tissue-specific RNA-seq coverage from DNA sequence. Using statistics derived from Borzoi’s predicted coverage, we isolate and accurately score DNA variant effects across multiple layers of regulation, including transcription, splicing and polyadenylation.
submitted by: Istvan Albert
Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription