How can I get an Affymetrix probe annotation that includes only specific genes?
3.9 years ago

Dear all,

I have followed everything you explained here, and I appreciate such a clear explanation. However, I have a problem with Affymetrix probes. The GSE17536 dataset was generated on Affymetrix Human Genome U133 Plus 2.0 arrays, and I realized that some protein-coding genes are lost because they have no Affymetrix probe ID. These genes are of interest to us for the downstream analysis, and I don't know how to extract them without Affymetrix IDs. An example is shown below:

            affy_hg_u133_plus_2       ensembl_gene_id     gene_biotype        external_gene_name
       83750                          ENSG00000254415     protein_coding           SIGLEC14

I would appreciate any advice on a strategy for retrieving the genes I am interested in.

Many thanks in advance.

Affymetrix probesets biomart GSE17536 • 784 views
3.9 years ago

Let's say you have your list of genes of interest in a file called genes.txt, one identifier per line, as either Ensembl IDs or HGNC symbols.

Let's say you have your probes in a text file called probes.txt.

You could do the following to filter your list of probes:

$ cat <(head -1 probes.txt) <(grep -wFf genes.txt probes.txt) > probes.filtered.txt

The cat command concatenates the probe header with the list of probes containing a matching gene name.

Here's an explanation of what each of the three grep options does: https://explainshell.com/explain?cmd=grep+-wFf
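
To see the filter in action, here is a minimal sketch on made-up data. The file names demo_genes.txt and demo_probes.txt, the probe IDs p1 to p3, and the gene names GENE1, GENE2 and GENE10 are all hypothetical placeholders, not real annotation. It uses a brace group instead of process substitution, which should give the same result as the command above while also running in a plain POSIX shell.

```shell
#!/bin/sh
set -eu

# Hypothetical gene list: one identifier per line.
printf 'GENE1\nGENE2\n' > demo_genes.txt

# Hypothetical tab-separated probe annotation with a header row.
printf 'probe_id\tgene_name\n'  > demo_probes.txt
printf 'p1\tGENE1\n'           >> demo_probes.txt
printf 'p2\tGENE10\n'          >> demo_probes.txt
printf 'p3\tGENE2\n'           >> demo_probes.txt

# Keep the header, then every line containing a listed gene:
#   -F  treat each pattern as a fixed string, not a regular expression
#   -w  match whole words only, so GENE1 does not match GENE10
#   -f  read the patterns from demo_genes.txt
{
  head -n 1 demo_probes.txt
  grep -wFf demo_genes.txt demo_probes.txt
} > demo_probes.filtered.txt

cat demo_probes.filtered.txt
```

The filtered file should contain the header plus the p1 and p3 rows. The p2 row is dropped because -w requires a whole-word match. Note that if the header line itself contained one of the listed identifiers, it would be printed twice.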
