I am seeking data sets for prostate/cervical cancer. I'm investigating how closely synthetic data (generated with genAI) can approximate field data.
In the case of prostate cancer I'm looking for:
Age | PSA | Gleason | Stage | Treatment | Side Effects | Outcome
---|---|---|---|---|---|---
65 | 10.5 | 7 | II | Radiation | Fatigue | Stable
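To make the target schema concrete, here is a minimal Python sketch of one record with the columns above. The class name `ProstateRecord`, the helper `parse_row`, and the choice of types are my own assumptions for illustration, not taken from any existing data set. The PSA unit is also an assumption (ng/mL), since it isn't specified above.

```python
from dataclasses import dataclass


@dataclass
class ProstateRecord:
    """One row of the desired table (hypothetical schema, for illustration only)."""
    age: int            # years
    psa: float          # assumed to be in ng/mL; the unit is not stated in the table
    gleason: int        # Gleason score
    stage: str          # e.g. "II"
    treatment: str      # e.g. "Radiation"
    side_effects: str   # e.g. "Fatigue"
    outcome: str        # e.g. "Stable"


def parse_row(line: str) -> ProstateRecord:
    """Parse one pipe-delimited row such as the example row above."""
    # Split on pipes, trim whitespace, and drop empty cells left by trailing pipes.
    cells = [c.strip() for c in line.split("|")]
    cells = [c for c in cells if c]
    if len(cells) != 7:
        raise ValueError(f"expected 7 fields, got {len(cells)}")
    age, psa, gleason, stage, treatment, side_effects, outcome = cells
    return ProstateRecord(
        int(age), float(psa), int(gleason), stage, treatment, side_effects, outcome
    )


example = parse_row("65 | 10.5 | 7 | II | Radiation | Fatigue | Stable")
print(example)
```

A real data set would likely use different column names and encodings, so a check like this would only be a starting point for deciding whether a candidate data set covers the fields I need.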
I have searched several repositories for the data, as well as numerous papers. The National Cancer Institute (first hyperlink) requires a formal request; these can take a long time, and it's not clear whether the resulting data would be appropriate. The paper's data set isn't available, but its code is (I am not sure whether Elsevier requires data availability; Nature and PLoS journals do). In theory I can write to the authors, and then to the editor if that fails. Again, this takes a long time, and whether it works depends on the journal's rules.

If anyone has experience with this type of data set, it would be really helpful: even if I have to submit a formal request, I could then be confident that the data will be what I am looking for.
Cross-posted here: https://bioinformatics.stackexchange.com/questions/23013/data-sets-for-prostate-cervical-cancer