11 months ago
Rob 6.9k

It's kind of a "negative" result, but I'll add this one to the mix, as I think it highlights the importance of being careful when doing large-scale computational analyses with complex genomics data:


Following up on the above: many of the authors of this paper have now published a new preprint,

Comprehensive analysis of microbial content in whole-genome sequencing samples from The Cancer Genome Atlas project

which looks broadly across WGS data in the TCGA. From their abstract, they find:

Our recent re-analysis of data from three cancer types revealed that technical errors have caused erroneous reports of numerous microbial species reportedly found in sequencing data from The Cancer Genome Atlas (TCGA) project. Here we have expanded our analysis to cover all 5,734 whole-genome sequencing (WGS) data sets currently available from The Cancer Genome Atlas (TCGA) project, covering 25 distinct types of cancer. We analyzed the microbial content using updated computational methods and databases, and compared our results to those from two major recent studies that focused on bacteria, viruses, and fungi in cancer. Our results expand upon and reinforce our recent findings, which showed that the presence of microbes is far smaller than had been previously reported, and that most species identified in TCGA data are either not present at all, or are known contaminants rather than microbes residing within tumors.


Just saw on X that the original paper, the one that paper was responding to, was retracted.


Indeed! It looks like the retraction statement is not live yet, so it's unclear whether the authors agree with the retraction or whether this is Nature's unilateral decision. Regardless, one wonders about the implications for the follow-up study.

11 months ago
Dave Carlson ★ 2.0k

Here are a couple of possibilities that immediately came to my mind:

11 months ago
dsull ★ 6.9k

In addition to what others have already posted:

The long-read assembly papers have been pretty influential (or rather, will be very influential in the years to come):

  • "The complete sequence of a human Y chromosome"
  • "Telomere-to-telomere assembly of diploid chromosomes with Verkko"
  • "The complete sequence of a human genome"
  • "Resolution of structural variation in diverse mouse genomes reveals chromatin remodeling due to transposable elements"
  • Many others...

For non-consortium papers that I'd say are influential (in part, basing them on social media response):

There's "The specious art of single-cell genomics" (PLOS Computational Biology) by my colleague, which has ignited discussion, debate, and reflection about t-SNE and UMAP plots.

There's "Major data analysis errors invalidate cancer microbiome findings" (mBio), which re-analyzed a major finding and showed how important it is to normalize correctly and to make absolutely sure your reads are aligning to what you think they are aligning to. Processing genomics data (at both the read level and the quantification level) is very difficult to get "right", so be extra wary of your own papers, and of papers by others, when large biological conclusions are drawn from a single genomics data analysis.

  • Edit: Someone else beat me to this as I was typing :)