The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Istvan Albert and was edited by Istvan Albert.
No, tSNE/UMAP don't 'preserve neighborhoods'.
Fig 2d shows that tSNE and UMAP get the large majority of k nearest neighbors wrong on all datasets. And this is 'recall' not even of true cell positions but just the logp1 mappings of the raw data! https://t.co/Hz4VTyDpjk
— NimwegenLab (@NimwegenLab) April 2, 2024
submitted by: Istvan Albert
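The k-nearest-neighbor recall mentioned in the tweet above can be illustrated with a small sketch. This is not the code behind the linked figure; the function names `knn_indices` and `knn_recall` are hypothetical, the data is random toy data rather than scRNA-seq counts, and the "embedding" is simply the first two coordinates standing in for a tSNE or UMAP layout. Recall here means the average fraction of each point's k nearest neighbors in the original space that are still among its k nearest neighbors in the embedding.

```python
import math
import random


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Sort all other points by distance; ties are broken by index.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def knn_recall(high_dim, low_dim, k):
    """Mean fraction of each point's high-dimensional k nearest neighbors kept in the embedding."""
    hi = knn_indices(high_dim, k)
    lo = knn_indices(low_dim, k)
    return sum(len(a & b) / k for a, b in zip(hi, lo)) / len(hi)


# Toy data (an assumption for illustration): 100 points in 20 dimensions.
random.seed(0)
data = [[random.gauss(0, 1) for _ in range(20)] for _ in range(100)]

identity = [row[:] for row in data]    # a perfect "embedding"
first_two = [row[:2] for row in data]  # a crude 2D stand-in for tSNE/UMAP

print(knn_recall(data, identity, k=5))   # identical coordinates keep every neighbor
print(knn_recall(data, first_two, k=5))  # dropping 18 of 20 dimensions loses neighbors
```

With a real dataset, `high_dim` would be the normalized expression matrix and `low_dim` the 2D coordinates produced by the embedding method; the brute-force neighbor search above is quadratic and only suitable for small inputs.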
GitHub - lh3/minipileup: Simple pileup-based variant caller (github.com)
Minipileup is a simple pileup-based variant caller. It takes a reference FASTA and one or multiple alignment BAM as input, and outputs a multi-sample VCF along with allele counts.
Submitter's note: countless times, you just want to know what the alleles are at a position; the task is surprisingly challenging to do correctly. I wrote N+1 versions of various ad-hoc Python-based parsers, and eventually I found that all had subtle bugs and errors in them.
submitted by: Istvan Albert
Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis | Nature Methods (www.nature.com)
Here we demonstrate that the large language model GPT-4 can accurately annotate cell types using marker gene information in single-cell RNA sequencing analysis. When evaluated across hundreds of tissue and cell types, GPT-4 generates cell type annotations exhibiting strong concordance with manual annotations. This capability can considerably reduce the effort and expertise required for cell type annotation. Additionally, we have developed an R software package GPTCelltype for GPT-4’s automated cell type annotation.
submitted by: Istvan Albert
Batch correction methods used in single cell RNA-sequencing analyses are often poorly calibrated | bioRxiv (www.biorxiv.org)
We compared seven widely used methods for batch correction of scRNA-seq datasets. We present a novel approach to measure the degree to which the methods alter the data in the process of batch correction, both at the fine scale comparing distances between cells as well as measuring effects observed across clusters of cells. We demonstrate that many of the published methods are poorly calibrated in the sense that the process of correction creates measurable artifacts in the data.
submitted by: Istvan Albert
In this preprint https://t.co/IYUCt8DScv with @sindri_e we compared seven widely used methods for batch correction of single cell RNA-seq data. We found that all but one of the methods introduce batch effects when there are none. 1/N
— Pall Melsted (@pmelsted) March 23, 2024
submitted by: Istvan Albert