23 months ago
d
As a data scientist, I would like to know the best approach to reproducing a research paper's findings (for example, a cancer research paper), so that I can build a machine learning prediction model on the data behind those findings. Assume I will obtain data different from the paper's and try to reproduce the findings on it.
Thank you
I am not sure what you are asking. If you can obtain the data and then follow the analysis published in the associated paper, you should get a reasonably "close" result, assuming there were no major errors in the analysis. Unless you have the exact code for the workflow, this may be the best you can do in terms of "reproducing" the analysis, since slight differences can always creep in.
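One way to make "close" concrete is to compare your reproduced numbers with the published ones under a tolerance you choose up front. The sketch below is only an illustration: the function name `compare_to_published`, the metric names, the 5% tolerance, and all the numbers are hypothetical placeholders, not values from any paper.

```python
import math


def compare_to_published(reproduced, published, rel_tol=0.05):
    """Return {metric: (published_value, reproduced_value, within_tolerance)}.

    A metric that is missing from `reproduced` is reported as not close.
    """
    report = {}
    for name, pub_value in published.items():
        rep_value = reproduced.get(name)
        ok = rep_value is not None and math.isclose(
            rep_value, pub_value, rel_tol=rel_tol
        )
        report[name] = (pub_value, rep_value, ok)
    return report


# Placeholder numbers for illustration only; not taken from any paper.
published = {"auc": 0.90, "n_significant_genes": 120}
reproduced = {"auc": 0.88, "n_significant_genes": 150}

report = compare_to_published(reproduced, published)
for name, (pub, rep, ok) in report.items():
    print(f"{name}: published={pub} reproduced={rep} close={ok}")
```

What counts as an acceptable tolerance depends on the analysis, so treat the default here as a starting point rather than a rule.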
That said, in the absence of a published containerized workflow (which is almost never available), you have to read the methods section, start either from the raw data at GEO (if it is NGS) or from any processed data the authors provide, and then stick as closely as you can to what they describe in the methods.
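When no workflow is published, it can help to keep your own record of what you actually ran, so that your reproduction is itself traceable. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `build_run_record`, the record fields, and the step descriptions are assumptions made for illustration, and the demo file stands in for whatever data you downloaded.

```python
import hashlib
import json
import os
import platform
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large data files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_run_record(data_path, steps, seed):
    """Collect the input checksum, Python version, seed, and the steps followed."""
    return {
        "data_file": os.path.basename(data_path),
        "data_sha256": sha256_of_file(data_path),
        "python_version": platform.python_version(),
        "seed": seed,
        "steps": list(steps),
    }


# Demo with a tiny stand-in file; replace with the data you actually obtained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("sample,value\ns1,1.0\n")
    demo_path = tmp.name

record = build_run_record(
    demo_path,
    steps=[
        "placeholder: preprocessing as described in the methods section",
        "placeholder: statistical test as described in the methods section",
    ],
    seed=0,
)
print(json.dumps(record, indent=2))
os.remove(demo_path)
```

Saving this record next to your results makes it easier to see later where your run may have diverged from the paper's description.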