The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Istvan Albert, and was edited by Istvan Albert.
GitHub - enormandeau/snplift: Convert genetic variant coordinates across genome assemblies (github.com)
SNPLift takes a tab-delimited file, for example a VCF, with locus positions from a given genome and lifts over these positions so they match a new reference genome. The goal is to rapidly leverage the availability of new genomes without having to re-align all the sample reads and then call and filter the loci again.
submitted by: Istvan Albert
GitHub - sanger-pathogens/snp-sites: Finds SNP sites from a multi-FASTA alignment file (github.com)
Rapidly extracts SNPs from a multi-FASTA alignment.
submitted by: Istvan Albert
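To make the idea concrete, here is a minimal Python sketch of what "extracting SNPs from a multi-FASTA alignment" means: report the alignment columns where the sequences disagree. This is not how snp-sites itself is implemented; the function names `parse_fasta` and `find_snp_sites` are hypothetical, and ignoring gaps and `N` when deciding whether a column varies is an assumption made for illustration.

```python
def parse_fasta(text):
    """Parse multi-FASTA text into a dict of name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def find_snp_sites(alignment):
    """Return 0-based column indices where the aligned sequences differ.

    Assumption for this sketch: gaps ('-') and 'N' are ignored when
    deciding whether a column is variable.
    """
    seqs = [s.upper() for s in alignment.values()]
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    sites = []
    for i in range(length):
        bases = {s[i] for s in seqs} - {"-", "N"}
        if len(bases) > 1:
            sites.append(i)
    return sites


if __name__ == "__main__":
    example = ">a\nACGT\n>b\nACCT\n>c\nA-GT\n"
    print(find_snp_sites(parse_fasta(example)))
```

For whole-genome alignments of many samples the real tool is the better choice; the sketch only shows the underlying column-by-column comparison.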
From defaults to databases: parameter and database choice dramatically impact the performance of metagenomic taxonomic classification tools | Microbiology Society (www.microbiologyresearch.org)
We found large discrepancies in both the proportion of reads that were classified as well as the number of species that were identified when we used both Kraken2 and MetaPhlAn 3 to classify reads within metagenomes from human-associated or environmental datasets. We then investigated which of these tools would give classifications closest to the real composition of metagenomic samples using a range of simulated and mock samples and examined the combined impact of tool–parameter–database choice on the taxonomic classifications given. This revealed that there may not be a one-size-fits-all ‘best’ choice.
submitted by: Istvan Albert
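One of the quantities the paper compares, the proportion of reads that were classified, is easy to check on your own runs. The sketch below assumes Kraken2's standard per-read output, where the first tab-separated column is `C` (classified) or `U` (unclassified); the function name `classified_fraction` is hypothetical and the example lines are made up for illustration.

```python
def classified_fraction(lines):
    """Return the fraction of reads marked 'C' in Kraken2-style per-read output.

    Each non-empty line is expected to start with 'C' or 'U' followed by a tab.
    """
    classified = 0
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        status = line.split("\t", 1)[0]
        if status not in ("C", "U"):
            raise ValueError(f"unexpected status field: {status!r}")
        total += 1
        if status == "C":
            classified += 1
    return classified / total if total else 0.0


if __name__ == "__main__":
    # Made-up example records, not real results.
    example = [
        "C\tread1\t562\t150\t562:116",
        "U\tread2\t0\t150\t0:116",
        "C\tread3\t9606\t150\t9606:116",
    ]
    print(f"{classified_fraction(example):.2%} of reads classified")
```

Running this on the output of the same sample classified against different databases or with different confidence settings gives a quick feel for how much those choices matter.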
miniBUSCO: a faster and more accurate reimplementation of BUSCO | bioRxiv (www.biorxiv.org)
Here, we present miniBUSCO, an efficient tool for assessing the completeness of genome assemblies. miniBUSCO utilizes the protein-to-genome aligner miniprot and the datasets of conserved orthologous genes from BUSCO. Our evaluation of the real human assembly indicates that miniBUSCO achieves a 14-fold speedup over BUSCO.
submitted by: Istvan Albert
[2306.03399] Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph (arxiv.org)
Despite recent advances in the length and the accuracy of long-read data, building haplotype-resolved genome assemblies from telomere to telomere still requires considerable computational resources. In this study, we present an efficient de novo assembly algorithm that combines multiple sequencing technologies to scale up population-wide telomere-to-telomere assemblies. By utilizing twenty-two human and two plant genomes, we demonstrate that our algorithm is around an order of magnitude cheaper than existing methods, while producing better diploid and haploid assemblies. Notably, our algorithm is the only feasible solution to the haplotype-resolved assembly of polyploid genomes.
submitted by: Istvan Albert
Double kraken2 announcement: (a) new load of indexes at https://t.co/7RZ7nbJHye, and (b) new version v2.1.3 with bug fixes and more efficient masker https://t.co/tyqpwSwvrp. Importantly, the release also includes ... (1/2)
— Ben Langmead (@BenLangmead) June 7, 2023
submitted by: Istvan Albert
Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription