The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Mensur Dlakic, Istvan Albert, and Rob, and was edited by Istvan Albert.
Comprehensive analysis of microbial content in whole-genome sequencing samples from The Cancer Genome Atlas project | bioRxiv (www.biorxiv.org)
Abstract:
In recent years, a growing number of publications have reported the presence of microbial species in human tumors and of mixtures of microbes that appear to be highly specific to different cancer types. Our recent re-analysis of data from three cancer types revealed that technical errors have caused erroneous reports of numerous microbial species reportedly found in sequencing data from The Cancer Genome Atlas (TCGA) project. Here we have expanded our analysis to cover all 5,734 whole-genome sequencing (WGS) data sets currently available from The Cancer Genome Atlas (TCGA) project, covering 25 distinct types of cancer. We analyzed the microbial content using updated computational methods and databases, and compared our results to those from two major recent studies that focused on bacteria, viruses, and fungi in cancer. Our results expand upon and reinforce our recent findings, which showed that the presence of microbes is far smaller than had been previously reported, and that most species identified in TCGA data are either not present at all, or are known contaminants rather than microbes residing within tumors. As part of this expanded analysis, and to help others avoid being misled by flawed data, we have released a dataset that contains detailed read counts for bacteria, viruses, archaea, and fungi detected in all 5,734 TCGA samples, which can serve as a public reference for future investigations.
submitted by: Rob
x.com (twitter.com)
submitted by: Istvan Albert
x.com (twitter.com)
submitted by: Istvan Albert
Developmental isoform diversity in the human neocortex informs neuropsychiatric risk mechanisms (www.science.org)
Abstract: We identified 214,516 distinct isoforms, of which 72.6% were novel (not previously annotated in Gencode v33), and >7000 novel exons, expanding the proteome by 92,422 putative proteoforms.
Submitter's note: I have attempted to obtain the sequence for these 214K unique isoforms, but I was unable to locate a FASTA or GFF file ...
submitted by: Istvan Albert
iSeq: An integrated tool to fetch public sequencing data | bioRxiv (www.biorxiv.org)
The authors claim that iSeq is currently the only tool that supports simultaneous retrieval from multiple databases (GSA, SRA, ENA, DDBJ, and GEO).
submitted by: Mensur Dlakic