The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Istvan Albert and was edited by Istvan Albert.
New details on the @PacBio SBB short read sequencing platform presented by Jonas Korlach at @agbt, including a video demo of the instrument. This is a big moment for @PacBio! pic.twitter.com/Q4i48NV9nY
— Daniel Portik (@DPortik) June 8, 2022
Slide materials can be found here:
https://investor.pacificbiosciences.com/static-files/db88e307-b81f-49d7-bdd4-6649ee9a1cf0
submitted by: Istvan Albert
Using the T2T reference genome (CHM13)vs GRCv38 improves alignment and detection of structural variants for @nanopore and @PacBio data from @nygenome data #AGBT22 pic.twitter.com/8EwKslZ5OW
— Chris Mason (@mason_lab) June 9, 2022
submitted by: Istvan Albert
My odyssey of obtaining scRNAseq metadata | DNA confesses Data speak (divingintogeneticsandgenomics.rbind.io)
[...] That’s how much it takes to get the sample level metadata from a public dataset. It is hard to fully automate and one has to go to GEO and supplementary tables, and wrangle the data to a desired format. If you ask me, I will tell you this is the real-life bioinformatics.
submitted by: Istvan Albert
How to read PCA plots — What do you mean "heterogeneity"? (www.nxn.se)
I have noticed some general patterns across datasets and studies. These I have seen either in papers or presentations or by analyzing our own or public data. Sketches of these patterns are shown on the right. I thought it would be useful to list out potential causes for these patterns. I'll do this here by simulating data to generate them.
submitted by: Istvan Albert
Gene Name Error Monthly Dashboard (ziemann-lab.net)
Gene name errors result when data are imported improperly into MS Excel and other spreadsheet programs (Zeeberg et al, 2004). Certain gene names like MARCH3, SEPT2 and DEC1 are converted into date format. These errors are surprisingly common in supplementary data files in the field of genomics (Ziemann et al, 2016). This could be considered a small error because it only affects a small number of genes; however, it is symptomatic of poor data processing methods. The purpose of this script is to identify gene name errors present in supplementary files of PubMed Central articles in the previous month.
submitted by: Istvan Albert
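To make the gene name problem concrete, here is a minimal Python sketch of how one might flag date-like values in a gene column. It is not the dashboard's actual script. The function name `find_date_like`, the sample column, and the choice to check only the "day-Mon" and "Mon-day" forms are assumptions made for illustration; real spreadsheets can contain other date formats.

```python
import re

# Month abbreviations that show up when a spreadsheet program
# mistakes a gene symbol (e.g. SEPT2, MARCH3, DEC1) for a date.
_MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Assumption: only two layouts are checked, "2-Sep" and "Sep-02".
_DATE_LIKE = re.compile(
    rf"^(?:\d{{1,2}}-(?:{_MONTHS})|(?:{_MONTHS})-\d{{1,2}})$",
    re.IGNORECASE,
)


def find_date_like(values):
    """Return the entries that look like dates rather than gene symbols."""
    return [v for v in values if _DATE_LIKE.match(v.strip())]


# Hypothetical gene column: three symbols survived, three were converted.
column = ["TP53", "2-Sep", "MARCH3", "3-Mar", "1-Dec", "BRCA1"]
flagged = find_date_like(column)
print(flagged)
```

Intact symbols such as MARCH3 are not flagged because they contain no hyphen; only the already-converted values match. A check like this can only detect the damage, not reliably undo it, so importing gene columns as text remains the safer fix.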
"data available upon request" pic.twitter.com/Mh8ctMbdUQ
— Jeremy Leipzig (@jermdemo) April 25, 2019
submitted by: Istvan Albert
ScienceDirect (www.sciencedirect.com)
Even when authors indicate in their manuscript that they will share data upon request, the compliance rate is the same as for authors who do not provide DAS, suggesting that DAS may not be sufficient to ensure data sharing.
submitted by: Istvan Albert
Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription