Biomarker identification by data mining of public genomic tumour databases in R or Python
7.4 years ago
rahel14350 ▴ 40

Dear all, I am new to data mining and would like to do it in R. I have read the documents on http://www.rdatamining.com/. They are quite informative, but I am not sure whether I can use the same packages for mining genes/proteins/mutations from biomedical databases (TCGA, COSMIC, ...). Does anyone have experience with this in R, and which packages can be used? Many thanks in advance, Rahel

Data mining R TCGA biomarker python • 2.4k views

Could you refine your question? Data mining is a large field.

Dear Nicolas, the main goal is to identify new biomarkers by data mining of public genomic tumour databases. At this stage I want to know how to get started with importing the databases into R (only genes, or only mutations, ...) and then run different data mining algorithms on a set of data. I hope I am clearer this time. Thanks

7.4 years ago
Kevin Chu • 0

If you want to analyze the TCGA data, I would recommend TCGAbiolinks, which is a Bioconductor package for R. The newest version is at https://github.com/BioinformaticsFMRP/TCGAbiolinks. You can refer to the tutorials to learn how to extract the different levels of information for normal tissues and tumors, including mutations, gene expression, gene methylation and so on.

As for the COSMIC database, you can download the data from http://cancer.sanger.ac.uk/cosmic/download; the majority of the data is mutation-related. The files are simply tab-delimited, so you can import them into R and carry out further analysis.
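
Since the question also mentions Python, here is a minimal sketch of the "import a tab-delimited file and start analysing" step, using only the standard library. The column names (`gene`, `sample_id`, `mutation`) and the rows are made up for illustration and are not the real COSMIC header, so check the header of the file you actually download and adjust the arguments. The helper names `read_mutations` and `mutated_samples_per_gene` are hypothetical as well.

```python
import csv
import io
from collections import Counter

# Tiny stand-in for a downloaded tab-delimited mutation export.
# ASSUMPTION: column names and rows are invented for illustration only.
SAMPLE_TSV = (
    "gene\tsample_id\tmutation\n"
    "TP53\tS1\tp.R175H\n"
    "TP53\tS2\tp.R248Q\n"
    "TP53\tS3\tp.R273H\n"
    "KRAS\tS1\tp.G12D\n"
    "KRAS\tS3\tp.G12D\n"
    "TP53\tS2\tp.R248Q\n"  # duplicate row, should not be counted twice
)


def read_mutations(handle):
    """Read a tab-delimited table into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


def mutated_samples_per_gene(rows, gene_col="gene", sample_col="sample_id"):
    """Count the distinct samples that carry at least one mutation in each gene."""
    seen = {(row[gene_col], row[sample_col]) for row in rows}
    return Counter(gene for gene, _sample in seen)


rows = read_mutations(io.StringIO(SAMPLE_TSV))
counts = mutated_samples_per_gene(rows)

# Genes ranked by how many samples they are mutated in.
for gene, n_samples in counts.most_common():
    print(gene, n_samples)
```

For a real download, replace the `io.StringIO(...)` handle with something like `open(path, newline="", encoding="utf-8")`, where `path` points at your local file. A per-gene sample count like this is only a first descriptive pass, not a biomarker test in itself.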

Dear Kevin, Many thanks for your reply. I am going to check them ...
