6.1 years ago
lordnurr
I'm a computer science student working on a project to build a model that estimates the probability of a person developing Type 2 Diabetes Mellitus, using a Convolutional Neural Network. For this I need a data set of DNA sequences from people with Type 2 Diabetes Mellitus to use as training data. I don't have a good grip on this subject yet and have had trouble finding such a data set. Where can I get one? Thanks.
That's too vague! Please rephrase what you want.
How To Ask Good Questions On Technical And Scientific Forums
What is the context of this project? I don't see that the aim is realistic given the resources: a positive training set of size 1 (from which to estimate probabilities), no negative training set, and 3 Gbp of sequence in the human genome, most of it unrelated to any disease.
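To put rough numbers on that mismatch, here is a back-of-the-envelope sketch in Python. The one-hot encoding with four channels (A, C, G, T) is my assumption about how the sequence might be fed to a CNN, and the function name is hypothetical; only the ~3 Gbp genome size and the sample counts come from the points above.

```python
# Rough scale check: inputs per genome versus number of training samples.
# ASSUMPTION: the sequence is one-hot encoded with 4 channels (A, C, G, T).
GENOME_BP = 3_000_000_000  # ~3 Gbp human genome, as mentioned above
CHANNELS = 4               # assumed one-hot channels

def one_hot_input_size(n_bases: int, channels: int = CHANNELS) -> int:
    """Number of input values needed to one-hot encode n_bases bases."""
    return n_bases * channels

n_inputs = one_hot_input_size(GENOME_BP)
n_positive = 1  # a single genome from a person with the disease
n_negative = 0  # no controls at all

print(f"input values per genome: {n_inputs:,}")
print(f"training samples: {n_positive} positive, {n_negative} negative")
```

With billions of input values per sample and a single labelled example, there is nothing for a model to learn a probability from.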
It seems to be a common trend to brute-force apply machine learning to any type of problem without domain knowledge. I'd talk to your supervisor first to make a realistic project plan, including what training and test data are available, and then potentially choose an easier topic. That also includes understanding the state of the art and the risk factors, a large proportion of which are non-genetic. Search for GWAS (genome-wide association study) and type 2 diabetes.
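To give a feel for what a GWAS does with case and control data, here is a minimal sketch of a single-variant allelic association test (a chi-square test on a 2x2 table of allele counts, 1 degree of freedom, no continuity correction). The function names and the counts in the usage line are made up for illustration and are not real data; real studies use dedicated tools and correct for many confounders.

```python
import math

def allelic_chi_square(case_alt: int, case_ref: int,
                       ctrl_alt: int, ctrl_ref: int) -> float:
    """Pearson chi-square statistic for a 2x2 table of allele counts."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one allele")
    return n * (a * d - b * c) ** 2 / denominator

def chi_square_p_value_1df(statistic: float) -> float:
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

# HYPOTHETICAL allele counts at one variant (illustration only, not real data).
stat = allelic_chi_square(case_alt=30, case_ref=70, ctrl_alt=20, ctrl_ref=80)
print(f"chi-square = {stat:.3f}, p = {chi_square_p_value_1df(stat):.3f}")
```

Note that even this simplest test needs both cases and controls, and a GWAS repeats it across a very large number of variants, which is why sample sizes in published studies are large.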