I'm looking for either a database (like SRA) or a study that makes its data available with labels attached. Ideally, this would be metagenomic data (either sequences or abundance tables) from a study with a strong link between a feature, such as a species, and the condition being studied.
I'm asking because I haven't been able to find any studies with enough data for my application (implementing machine learning algorithms). Ideally, that means at least 100 samples for the condition being studied, and maybe the same number of controls.
Any help is appreciated. Thanks, guys.
Thanks! I'll look through that post... Of course, I'm probably being a little too nitpicky with my search... we'll never really find ideal data in the real world, will we?