Question

Analysis over single chromosome data

1

8.7 years ago

mgvaldesgraterol ▴ 10

I'm a data scientist with no expertise in biology. I would like to know whether it makes sense, from a biological point of view, to analyze SNP data from a single chromosome rather than from all 22 chromosomes in order to predict the risk of a certain disease.

Do I have to use data from all chromosomes? If so, why?

Thank you very much, and sorry if this is a very basic question, but I would really like to understand it.

SNP analysis • 1.6k views

ADD COMMENT • link updated 8.7 years ago by Matteo Schiavinato ★ 3.7k • written 8.7 years ago by mgvaldesgraterol ▴ 10

score 0 · Answer 1 · 2017-01-25

0

8.7 years ago

Matteo Schiavinato ★ 3.7k

SNP data describe single nucleotide polymorphisms, so the individual data points don't depend on each other, and one chromosome is as fine to work with as the full set of chromosomes. This is especially true if you are looking for disease-involved SNPs, which are most likely located in a restricted set of genes (or in just one gene) that is probably on the chromosome you received the data for.

0

Well, I'm applying machine learning algorithms to the data I was given in order to predict a complex disease (the first dataset is for lung cancer and the second for type 2 diabetes). The biologists who gave me the data delivered data from all 22 chromosomes. I've read in the past few days that both lung cancer and type 2 diabetes are complex diseases, affected by mutations in several genes. Since there are several genes involved, they can be spread across any of the chromosomes, right? In that case, shouldn't I analyse the entire set of 22 chromosomes?
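
To make the choice concrete, here is a minimal sketch of what "one chromosome versus all chromosomes" means for the feature matrix handed to a learning algorithm. Everything in it is an illustrative assumption rather than the actual dataset: the SNP names, the chromosome assignments, the genotype coding (0, 1 or 2 copies of the minor allele), and the helper `select_features` are all hypothetical.

```python
# Hypothetical toy data: SNP identifier -> chromosome number (1..22).
snp_chromosome = {
    "snp_a": 1,
    "snp_b": 1,
    "snp_c": 7,
    "snp_d": 15,
}

# Hypothetical genotypes: one dict per sample, SNP -> minor-allele count (0, 1, 2).
samples = [
    {"snp_a": 0, "snp_b": 1, "snp_c": 2, "snp_d": 0},
    {"snp_a": 2, "snp_b": 0, "snp_c": 1, "snp_d": 1},
]


def select_features(samples, snp_chromosome, chromosomes=None):
    """Return (column_names, matrix) restricted to the given chromosomes.

    chromosomes=None keeps every SNP; otherwise only SNPs whose
    chromosome is in the given set are kept.
    """
    columns = sorted(
        snp
        for snp, chrom in snp_chromosome.items()
        if chromosomes is None or chrom in chromosomes
    )
    # One row per sample, one column per retained SNP.
    matrix = [[sample[snp] for snp in columns] for sample in samples]
    return columns, matrix


# Feature matrix using every chromosome in the toy data.
all_cols, all_X = select_features(samples, snp_chromosome)

# Feature matrix restricted to chromosome 1 only.
chr1_cols, chr1_X = select_features(samples, snp_chromosome, chromosomes={1})
```

In this toy setup, a model trained on `chr1_X` never sees `snp_c` or `snp_d`, so any signal carried by variants on chromosomes 7 or 15 is unavailable to it. Whether that matters depends on where the disease-associated variants actually lie.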

ADD REPLY • link 8.7 years ago by mgvaldesgraterol ▴ 10

0

OK, I wasn't expecting machine learning as the method for SNP data. I thought you were working with sequencing reads aligned to a reference genome and calling variants. Can you be more precise about the type of data you are using, then?

ADD REPLY • link 8.7 years ago by Matteo Schiavinato ★ 3.7k