I frequently download datasets (mostly ChIP) from published papers for use in my own work. I just as frequently find datasets that are of exceptionally poor quality. For example: 2 million ChIP-seq reads across the mouse genome, very low read quality and coverage, poor peak calling, etc. These data have been used in "high impact" publications to justify conclusions with wide-ranging implications.
- Has anyone else ever encountered these problems?
- Do peer reviewers ever check next-gen sequencing results?
- What steps can I take to try to correct the literature?
Any advice or comments would be appreciated.
It would be interesting to collect some of these publications (the higher the "impact factor" the better) and write a paper on this phenomenon.
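To put the "2 million reads across the mouse genome" example in perspective, a rough back-of-the-envelope depth estimate may help. This is only a sketch: the 50 bp read length and the ~2.7 Gb mouse genome size are assumptions for illustration, not values taken from any of the datasets discussed here.

```python
def mean_coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average sequencing depth = total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return n_reads * read_length_bp / genome_size_bp


# Assumed values (not from the original post):
MOUSE_GENOME_BP = 2_700_000_000  # approximate mouse genome size
READ_LENGTH_BP = 50              # hypothetical single-end read length

cov = mean_coverage(2_000_000, READ_LENGTH_BP, MOUSE_GENOME_BP)
print(f"Mean depth: {cov:.3f}x")  # prints "Mean depth: 0.037x"
```

ChIP-seq reads are expected to pile up at binding sites rather than spread evenly, so mean depth is only a crude indicator, but a figure this low leaves very little signal for a peak caller to work with.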
A related post on how a lot of the data deposited for the Ebola project does not contain data for the virus: How to find the mapping percentages for data deposited in the Zaire ebolavirus bioproject from the 2014 outbreak
Also to take into account: there's a lot of Mycoplasma contamination going on
I guess it depends on several things. Maybe the sequencing data is not the super best, but if it is only one of dozens of experiments that all agree, and it makes for a good discussion and a relevant discovery, it can still be published well.
Like this paper for example: http://www.nature.com/nature/journal/v523/n7559/full/nature14452.html
They pooled the samples and ran the DE analysis on single replicates with low sequencing coverage. Maybe they just wanted to save a lot of money, and this data was enough for their purpose.
I guess it is a case-by-case issue.
Sure, but what if it's one single experiment with no other findings to support that particular conclusion? And the sequencing data is not only not the super best, but well and truly terrible?
It'll depend on the reviewers and the story then. If you can make an interesting story then the quality and reliability of your data doesn't matter much (have a look at papers in Nature and Science, many are great, many are complete crap but have a good (and likely wrong) story).
I guess it is unlikely to get into a good journal then.
These are in very good journals. If the reviewers aren't checking, no one knows.