Question

gene expression data

7.4 years ago

lilly ▴ 10

Hi, I am trying to classify cancer versus control samples using gene expression data. I am finding it difficult because the datasets are very large. Also, how do I remove repeated gene symbols so that I can select specific attributes (gene selection)?

genome • 1.1k views

updated 7.4 years ago by akshayb04 ▴ 30 • written 7.4 years ago by lilly ▴ 10

This question (as written) lacks sufficient detail to generate useful answers (for guidance, see: How To Ask Good Questions On Technical And Scientific Forums). You need to clarify where these datasets come from and what kind of analysis you are doing on them.

7.4 years ago by GenoMax 147k

score 0 · Answer 1 · 2017-07-24

Hello,

Could you tell us which software you use to open the raw/processed expression data? In general, I read the files directly on the UNIX command line.

Regarding your second question: UniProt provides direct gene symbol mapping. After you submit your input, the web tool also returns a non-redundant list. If you wish to do it manually, you can do it in Excel or any basic scripting language, but make sure you treat whitespace as a character: a symbol with a trailing space will not match the same symbol without one.
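If you go the manual route, a minimal Python sketch might look like the following. It is only an illustration under some assumptions of mine: the table is tab-delimited with the gene symbol in the first column and numeric expression values in the rest, and duplicates are collapsed by keeping the row with the highest mean expression (one common convention, not the only one). The function names `read_expression_table` and `collapse_duplicate_symbols` are made up for this example.

```python
import csv
import io
from statistics import mean


def read_expression_table(handle, delimiter="\t"):
    """Yield (symbol, values) one row at a time, so a large file is never fully loaded."""
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader)  # skip the header row (assumed layout: symbol, sample1, sample2, ...)
    for record in reader:
        yield record[0], [float(x) for x in record[1:]]


def collapse_duplicate_symbols(rows):
    """Keep one row per gene symbol: the one with the highest mean expression."""
    best = {}  # symbol -> (mean expression, values); dicts keep first-seen order
    for symbol, values in rows:
        key = symbol.strip()  # treat "TP53 " and "TP53" as the same gene
        if not key:
            continue  # skip rows that have no gene symbol at all
        score = mean(values)
        if key not in best or score > best[key][0]:
            best[key] = (score, values)
    return {symbol: values for symbol, (_, values) in best.items()}


# Tiny made-up table: "TP53" appears twice, once with a trailing space.
example = (
    "symbol\ts1\ts2\n"
    "TP53\t1.0\t2.0\n"
    "TP53 \t3.0\t4.0\n"
    "BRCA1\t5.0\t6.0\n"
    " \t9.0\t9.0\n"
)
collapsed = collapse_duplicate_symbols(read_expression_table(io.StringIO(example)))
print(collapsed)  # {'TP53': [3.0, 4.0], 'BRCA1': [5.0, 6.0]}
```

For a real file, replace `io.StringIO(example)` with `open("your_file.tsv", newline="")`. Because rows are read one at a time, memory use grows with the number of unique symbols rather than the size of the file.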

I hope this advice helps!