The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by a contribution from Istvan Albert and was edited by Istvan Albert.
The difference in @10xGenomics' Cell Ranger's default between version 6 and 7 is discussed in this thread, but it's such a big deal that it's worth its own thread.
tl;dr: in v7 Cell Ranger changed how it produces the gene count matrix leading to a huge difference in results. 1/ https://t.co/OarIGuxhCC
— Lior Pachter (@lpachter) April 7, 2024
submitted by: Istvan Albert
The art of seeing the elephant in the room: 2D embeddings of single-cell data do make sense | bioRxiv (www.biorxiv.org)
More appropriate metrics quantifying neighborhood and class preservation reveal the elephant in the room: while t-SNE and UMAP embeddings of single-cell data do not represent high-dimensional distances, they can nevertheless provide biologically relevant information.
submitted by: Istvan Albert
SeqKit2: A Swiss army knife for sequence and alignment processing (onlinelibrary.wiley.com)
SeqKit2 represents substantial enhancement through the inclusion of 19 additional subcommands, expanding its overall repertoire to a total of 38 in eight categories. The new subcommands add functionality such as amplicon processing and robust, error-tolerant parsing of sequence records. In addition, three subcommands designed for real-time analysis are added for periodic monitoring of properties of FASTQ and Binary Alignment/Map alignment records and real-time streaming from multiple sequence files.
submitted by: Istvan Albert
Snakemake workflows for long-read bacterial genome assembly and evaluation (gigabytejournal.com)
In order to automatically run multiple genome assembly and evaluation programs at once, I developed two workflows for the workflow management system Snakemake, which provide end users with an easy-to-run solution for testing various genome assemblies from their sequencing data. Both workflows use the conda packaging system, so there is no need for manual installation of each program.
submitted by: Istvan Albert
The impact of package selection and versioning on single-cell RNA-seq analysis | bioRxiv (www.biorxiv.org)
We investigate in detail the algorithms and methods underlying Seurat and Scanpy and find that there are, in fact, considerable differences in the outputs of Seurat and Scanpy. The extent of differences between the programs is approximately equivalent to the variability that would be introduced by sequencing less than 5% of the reads for scRNA-seq experiments, or by analyzing less than 20% of the cell population.
submitted by: Istvan Albert
The choice of whether to use Seurat or Scanpy for single-cell RNA-seq analysis typically comes down to a preference of R vs. Python. But do they produce the same results? In https://t.co/rVOiR847CY w/ @Josephmrich et al. we take a close look. The results are 👀 1/🧵 pic.twitter.com/PicdqRSCRq
— Lior Pachter (@lpachter) April 5, 2024
submitted by: Istvan Albert
tSNE/UMAP preserve neighbors in a "soft" sense:
Yes, tSNE recalls only ~30% of neighbors exactly.
But ~60-80% of the 10 nearest tSNE neighbors are among the 100 nearest true neighbors! So even if some tSNE neighbors are not next-door neighbors, most are "from the same block". https://t.co/JzvvRUpZgq pic.twitter.com/7hxbN9Hw5Y
— Jan Lause 🟦 @janlause.bsky.social🦉 (@JanLause) April 3, 2024
submitted by: Istvan Albert
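The "soft" neighbor recall described in the thread above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names, the brute-force neighbor search, and the toy data (a random matrix with a crude two-column projection standing in for a real embedding) are all assumptions made for the example.

```python
import numpy as np


def knn_indices(X, k):
    """Return the indices of the k nearest neighbors of every row of X.

    Brute-force Euclidean search; each point is excluded from its own list.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # drop self-matches
    return np.argsort(d2, axis=1)[:, :k]


def soft_neighbor_recall(high, low, k_low=10, k_high=100):
    """Fraction of the k_low nearest neighbors in the embedding (low) that
    are among the k_high nearest neighbors in the original space (high).

    With k_low == k_high this reduces to the usual "exact" kNN recall.
    """
    low_nn = knn_indices(low, k_low)
    high_nn = knn_indices(high, k_high)
    per_point = [np.isin(low_nn[i], high_nn[i]).mean() for i in range(len(high))]
    return float(np.mean(per_point))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    high = rng.normal(size=(300, 20))  # toy "high-dimensional" data (assumption)
    low = high[:, :2]                  # stand-in for a 2D embedding (assumption)

    print("exact recall (10 of 10): ", soft_neighbor_recall(high, low, 10, 10))
    print("soft recall (10 of 100): ", soft_neighbor_recall(high, low, 10, 100))
```

Because the 10 nearest true neighbors are always a subset of the 100 nearest, the soft recall can never fall below the exact one, which is why the two numbers quoted in the thread are compatible. For real datasets a tree-based or approximate neighbor search would replace the brute-force distance matrix used here.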
Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription