4.7 years ago
mel22
Hi, I have SNP data from a DNA array. Using a Gene Ontology tool in R, I selected the SNPs located in DNA repair genes. How does this tool define genes? Does it add an extra XX kb to the gene sequence by default?
Thanks for your help
There isn't enough information to answer your question. What data do you have? Which tools did you use? What's your workflow? At a high level, this could be something like mapping a SNP to a genomic region, finding the gene IDs associated with the SNP region then recovering the GO terms associated with the corresponding gene IDs. The details of how this is done in practice depend on the data and tools you use.
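To make the high-level workflow above concrete, here is a minimal sketch in plain Python. It is not the poster's actual pipeline: the gene IDs, coordinates, and the `GENES` / `GENE_TO_GO` tables are made-up toy data, and the function names are hypothetical. It only illustrates the chain SNP position → overlapping gene IDs → GO terms attached to those IDs.

```python
# Toy gene annotation (hypothetical IDs and coordinates):
# gene ID -> (chromosome, start, end), 1-based inclusive.
GENES = {
    "GENE_A": ("1", 1000, 5000),
    "GENE_B": ("1", 8000, 12000),
}

# Toy GO annotation: gene ID -> set of GO term IDs.
GENE_TO_GO = {
    "GENE_A": {"GO:0006281"},  # DNA repair
    "GENE_B": {"GO:0008150"},  # biological_process (root term)
}


def genes_at(chrom, pos):
    """Return the IDs of genes whose annotated interval contains the position."""
    return [
        gene_id
        for gene_id, (gene_chrom, start, end) in GENES.items()
        if gene_chrom == chrom and start <= pos <= end
    ]


def go_terms_for_snp(chrom, pos):
    """Collect the GO terms of every gene overlapping the SNP position."""
    terms = set()
    for gene_id in genes_at(chrom, pos):
        terms |= GENE_TO_GO.get(gene_id, set())
    return terms
```

Note that the GO table is keyed only by gene ID. The coordinates live entirely in the gene annotation, so whatever defines the interval (and any padding around it) comes from that annotation source or from the user, not from GO.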
Thanks Jean-Karim. I used the BioMart package in R to select a list of genes involved in DNA repair based on GO information. Then, using this list of gene names and coordinates, I extracted the SNPs located in these genes. The SNPs come from an Illumina chip. My question was whether the gene coordinates according to GO include extra kb added on to the gene sequence.
Thanks
GO doesn't have any sequence information. GO terms are associated with gene IDs, and whatever resource the ID comes from is the source of the gene definition. For example, a gene ID from Ensembl means that the gene is defined by Ensembl, so you can consider the GO term associated with this Ensembl ID to be associated with the whole gene sequence as defined by Ensembl. A potential issue is that the source of the gene definition and the source of the SNPs may not use the same reference genome and/or genome annotation, i.e., are your SNPs' genomic coordinates on the same reference genome as the one used to define your genes?
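As an illustration of the two points above, the following Python sketch selects SNPs inside a gene's annotated interval and refuses to compare coordinates from different assemblies. Everything in it is an assumption for the sake of the example: the `Gene` and `Snp` classes, the `snps_in_gene` function, and the `flank_bp` parameter are hypothetical and do not correspond to any BioMart or Ensembl API. The point is that a flank is only added if you add it yourself; here it defaults to 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    chrom: str
    start: int      # 1-based, inclusive, as given by the annotation source
    end: int
    assembly: str   # e.g. "GRCh37"


@dataclass(frozen=True)
class Snp:
    snp_id: str
    chrom: str
    pos: int
    assembly: str


def snps_in_gene(gene, snps, flank_bp=0):
    """Return SNPs within the gene interval, widened by flank_bp on each side.

    With the default flank_bp=0, only positions between the annotated
    start and end are kept; nothing extra is included implicitly.
    """
    lower = max(1, gene.start - flank_bp)
    upper = gene.end + flank_bp
    hits = []
    for snp in snps:
        # Coordinates are only comparable on the same reference assembly.
        if snp.assembly != gene.assembly:
            raise ValueError(
                f"{snp.snp_id} is on {snp.assembly}, "
                f"but {gene.gene_id} is on {gene.assembly}"
            )
        if snp.chrom == gene.chrom and lower <= snp.pos <= upper:
            hits.append(snp)
    return hits


# Made-up example data, all on the same assembly.
gene = Gene("GENE_A", "1", 10_000, 20_000, "GRCh37")
snps = [
    Snp("snp1", "1", 9_500, "GRCh37"),   # 500 bp upstream of the gene start
    Snp("snp2", "1", 15_000, "GRCh37"),  # inside the gene
    Snp("snp3", "2", 15_000, "GRCh37"),  # different chromosome
]

inside = snps_in_gene(gene, snps)                    # only snp2
with_flank = snps_in_gene(gene, snps, flank_bp=1000)  # snp1 and snp2
```

Making the flank an explicit argument keeps the gene definition (from the annotation source) separate from any analysis choice about upstream or downstream padding.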
Ah OK, I see, thank you very much. Yes, both (genes and variants) are based on GRCh37.