Question

Affymetrix Human Genome U133 Plus 2.0 Array sequence analysis

5.9 years ago

sujasubramanian ▴ 70

Hello, I am working with the Affymetrix Human Genome U133 Plus 2.0 gene expression dataset. I want to analyse each gene at the sequence level. Is there a way to collect the sequences corresponding to each probe ID?

sequence • 3.5k views

ADD COMMENT • link 5.9 years ago by sujasubramanian ▴ 70

Can anybody help me find a good journal for publishing my research work? I am running out of time. A bioinformatics journal (SCI indexed) is preferable.

ADD REPLY • link 5.9 years ago by sujasubramanian ▴ 70

How is this related to this thread, which is about the Affymetrix Human Genome U133 Plus 2.0 Array probe sequences?

ADD REPLY • link 5.9 years ago by Kevin Blighe 88k

Kevin Blighe · Answer 1 · 2018-12-17

5.9 years ago

Kevin Blighe 88k

You can get the sequence of each probe in tabular or FASTA format from the Affymetrix support site: Human Genome U133 Plus 2.0 Array - Support Materials.

I give a previous example for another Affymetrix array, here: Sequence corresponding to each probe id of affymetrix gene chip

You can also get the coding / mRNA sequence of the gene targeted by the probe in R via biomaRt. I give an example of this, here: A: getting gene sequence from probeset id
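As a rough illustration of the biomaRt route, here is a minimal sketch. The function name `fetch_probeset_cdna` is made up for this example, and the filter name `affy_hg_u133_plus_2` is assumed to still be the Ensembl identifier for this array; check `listFilters()` on your mart if the query fails. The call needs network access to Ensembl, so the function is only defined here, not run.

```r
# Sketch: fetch cDNA sequences of transcripts linked to HG-U133 Plus 2.0 probe set IDs.
# fetch_probeset_cdna is a hypothetical helper name, not part of biomaRt.
fetch_probeset_cdna <- function(probeset_ids) {
  if (!requireNamespace("biomaRt", quietly = TRUE)) {
    stop("biomaRt is required: BiocManager::install('biomaRt')")
  }
  mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  biomaRt::getSequence(
    id      = probeset_ids,
    type    = "affy_hg_u133_plus_2",  # assumed Ensembl filter name for this array
    seqType = "cdna",                 # transcript (mRNA) sequence; "coding" gives CDS only
    mart    = mart
  )
}

# Usage (requires an internet connection):
# seqs <- fetch_probeset_cdna(c("223777_at"))
```

Note that this returns the sequence of the targeted transcripts, not the 25-mer probe sequences themselves; those come from the Affymetrix probe files mentioned above.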

I believe there are other ways to do this, too.

Kevin

ADD COMMENT • link 5.7 years ago by Kevin Blighe 88k

Sorry, I posted this here by mistake.

ADD REPLY • link 5.9 years ago by sujasubramanian ▴ 70

Hello, I have used the support materials from the Affymetrix support site. Instead of using RefSeq/Ensembl, I am using the alignment column values for my study, so I get multiple probe IDs for each gene, and I treat each one separately using this alignment information. However, I have come across a problem with some genes. For example, the gene DDX11L2 resides on chr2, but according to the alignment column it is on both chr1 and chr2. How should I interpret this? Please help:

223777_at   **DDX11L2** chr1:13680-14407 (+) // 91.71 // p36.33 /// chr16:63361-64088 (+) // 91.84 // p13.3 /// chr9:13793-14520 (+) // 91.58 // p24.3 /// chr15:102516762-102517489 (-) // 91.71 // q26.3 /// chr2:114356610-114357337 (-) // 92.73 // q13

ADD REPLY • link updated 5.8 years ago by Kevin Blighe 88k • written 5.8 years ago by sujasubramanian ▴ 70

These 'DDX11L' genes are pseudogenes, so, the probes will map to multiple regions of the genome. Normally, you will have absolutely no interest in data from these genes.
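To see how many loci a probe set hits, the alignment field can be split into one row per hit. The sketch below uses only base R and the 223777_at string quoted above. The helper name `parse_alignments` is made up, and the second field of each hit is assumed to be the percent identity of the alignment.

```r
# Split one value of the Affymetrix alignment column into one row per genomic hit.
# Hits are separated by " /// "; fields within a hit by " // ".
parse_alignments <- function(alignment) {
  hits   <- strsplit(alignment, " /// ", fixed = TRUE)[[1]]
  fields <- strsplit(hits, " // ", fixed = TRUE)
  locus  <- vapply(fields, function(f) f[1], character(1))
  data.frame(
    chrom        = sub(":.*$", "", locus),
    start        = as.numeric(sub("^[^:]+:([0-9]+)-.*$", "\\1", locus)),
    end          = as.numeric(sub("^[^:]+:[0-9]+-([0-9]+).*$", "\\1", locus)),
    strand       = sub("^.*\\(([+-])\\)[[:space:]]*$", "\\1", locus),
    # Assumption: the second field is the percent identity of the alignment.
    pct_identity = as.numeric(vapply(fields, function(f) f[2], character(1))),
    band         = vapply(fields, function(f) f[3], character(1)),
    stringsAsFactors = FALSE
  )
}

aln <- "chr1:13680-14407 (+) // 91.71 // p36.33 /// chr16:63361-64088 (+) // 91.84 // p13.3 /// chr9:13793-14520 (+) // 91.58 // p24.3 /// chr15:102516762-102517489 (-) // 91.71 // q26.3 /// chr2:114356610-114357337 (-) // 92.73 // q13"

hits <- parse_alignments(aln)
n_loci <- nrow(hits)            # number of genomic hits for this probe set
is_multimapping <- n_loci > 1   # TRUE when the probe set aligns to more than one locus
print(hits)
```

Applied to the whole annotation table, a flag like `is_multimapping` gives a simple way to list probe sets whose signal cannot be assigned to a single locus.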

ADD REPLY • link 5.8 years ago by Kevin Blighe 88k

Where can I get the information about pseudogenes? Is it OK if I exclude them from my analysis?

ADD REPLY • link 5.8 years ago by sujasubramanian ▴ 70

I just knew from experience that the 'DDX' genes have known pseudogenes. I would expect that information on gene biotype (i.e., lincRNA, protein coding, pseudogene, etc.) is stored within the other Affymetrix annotation files, no? If not, you could generate a list of genes and their biotypes using the data from GENCODE: https://www.gencodegenes.org/human/ (you will need the Comprehensive gene annotation).
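A minimal sketch of the GENCODE route, in base R: read the GTF, keep the `gene` rows, and pull `gene_name` and `gene_type` out of the attribute column. The three gene records below are made-up placeholders in GTF layout, not lines from a GENCODE release, and `get_attr` is a hypothetical helper. Ensembl's own GTF files call the attribute `gene_biotype` rather than `gene_type`.

```r
# Pull one attribute (e.g. gene_type) out of the GTF attribute column (column 9).
get_attr <- function(attributes, key) {
  pattern <- paste0('.*', key, ' "([^"]+)".*')
  ifelse(grepl(pattern, attributes), sub(pattern, "\\1", attributes), NA_character_)
}

# Placeholder records in GTF layout; all values are invented for illustration.
gtf_lines <- c(
  'chr1\tHAVANA\tgene\t1000\t2000\t.\t+\t.\tgene_id "ID_A"; gene_type "protein_coding"; gene_name "GENE_A";',
  'chr1\tHAVANA\ttranscript\t1000\t2000\t.\t+\t.\tgene_id "ID_A"; transcript_id "TX_A"; gene_type "protein_coding"; gene_name "GENE_A";',
  'chr2\tHAVANA\tgene\t5000\t6000\t.\t-\t.\tgene_id "ID_B"; gene_type "unprocessed_pseudogene"; gene_name "GENE_B";',
  'chr3\tHAVANA\tgene\t7000\t9000\t.\t+\t.\tgene_id "ID_C"; gene_type "lncRNA"; gene_name "GENE_C";'
)
# With a downloaded file, gtf_lines would come from readLines() on that file instead.

gtf <- read.delim(text = gtf_lines, header = FALSE, comment.char = "#",
                  quote = "", stringsAsFactors = FALSE)
genes <- gtf[gtf$V3 == "gene", ]   # one row per gene, transcripts dropped

biotypes <- data.frame(
  gene_name = get_attr(genes$V9, "gene_name"),
  gene_type = get_attr(genes$V9, "gene_type"),
  stringsAsFactors = FALSE
)
pseudogenes <- biotypes$gene_name[grepl("pseudogene", biotypes$gene_type)]
print(biotypes)
```

The resulting table can be merged with the probe annotation by gene symbol to flag pseudogene probe sets without dropping them from normalisation.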

I would not exclude these from the initial data processing of your microarray. Which commands have you used? Have you used the oligo package?

Does that help?

ADD REPLY • link 5.8 years ago by Kevin Blighe 88k

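# biocLite() has been retired; current Bioconductor installs go through BiocManager
BiocManager::install(c('affy', 'biomaRt'))
library(affy)
library(biomaRt)   # useMart() and useDataset() come from biomaRt, not affy
ensembl <- useMart("ensembl")
ensembl <- useDataset("hsapiens_gene_ensembl", mart = ensembl)
MyBioinfData <- ReadAffy()      # reads all CEL files in the working directory
eset.rma <- rma(MyBioinfData)   # RMA: background correction, quantile normalisation, log2 summarisation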

The affy package is used for normalisation, which gives the eset (ExpressionSet data). I directly link the expression values for each probe ID to their gene names from the Affymetrix support site, with no filtering at all.

ADD REPLY • link updated 5.8 years ago by Kevin Blighe 88k • written 5.8 years ago by sujasubramanian ▴ 70

That generally looks fine. I presume that you are using ensembl for a good reason? How was the box-and-whisker plot of the normalised, transformed data?
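For anyone unsure what that check looks like, here is a small sketch. The matrix is simulated as a stand-in for `exprs(eset.rma)`, so the numbers mean nothing; the point is the shape of the check. After RMA, the per-array boxes should line up closely, and an array whose box sits apart from the rest deserves a second look.

```r
# Simulated stand-in for exprs(eset.rma): 1000 probe sets x 4 arrays on the log2 scale.
set.seed(42)
expr <- matrix(rnorm(4000, mean = 7, sd = 2), nrow = 1000,
               dimnames = list(NULL, paste0("array", 1:4)))
# With real data this would be: expr <- Biobase::exprs(eset.rma)

# One box per array; las = 2 turns the sample labels vertical.
boxplot(expr, las = 2, ylab = "log2 expression", main = "Normalised arrays")

# Numeric version of the same check: spread of the per-array medians.
array_medians <- apply(expr, 2, median)
median_spread <- diff(range(array_medians))
print(round(array_medians, 2))
```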

ADD REPLY • link 5.8 years ago by Kevin Blighe 88k

IS THERE ANY WAY TO PRODUCE MAS5 S/W RESULT WITH BOOLEAN LOGIC, I.E. PRESENCE 'P', ABSENCE 'A', AND THE LIKE? I AM USING TWO DATASETS: ONE IS .CEL FILES THAT I NORMALIZED WITH RMA; THE OTHER GIVES EXPRESSION VALUES FROM MAS5 S/W INTENSITY VALUES, WITH BOOLEAN VALUES.

ADD REPLY • link 5.7 years ago by sujasubramanian ▴ 70

You left Caps Lock on.

ADD REPLY • link 5.7 years ago by Kevin Blighe 88k

Where have you seen this: MAS5 S/W RESULT WITH BOOLEAN LOGIC ?
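If this refers to MAS5 detection calls (Present 'P', Marginal 'M', Absent 'A'), one possible way to compute them from the .CEL files is `mas5calls()` in the affy package. The sketch below assumes that reading. The function names `detection_calls` and `present_fraction` are made up for illustration, and the toy matrix is invented so that the summary step can run without any CEL files.

```r
# Sketch: compute MAS5-style detection calls from the CEL files in a directory.
# detection_calls is a hypothetical wrapper; it is defined here but not run.
detection_calls <- function(cel_dir = ".") {
  if (!requireNamespace("affy", quietly = TRUE)) {
    stop("affy is required: BiocManager::install('affy')")
  }
  raw   <- affy::ReadAffy(celfile.path = cel_dir)  # raw probe-level data (AffyBatch)
  calls <- affy::mas5calls(raw)                    # ExpressionSet of detection calls
  Biobase::exprs(calls)                            # character matrix of "P", "M", "A"
}

# Fraction of arrays in which each probe set is called Present.
present_fraction <- function(call_matrix) {
  rowMeans(call_matrix == "P")
}

# Invented call matrix, only to show the summary step.
toy_calls <- matrix(c("P", "A", "P",
                      "P", "M", "A"),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("probeset_1", "probeset_2"),
                                    paste0("array", 1:3)))
frac <- present_fraction(toy_calls)
print(frac)
```

The calls need the raw probe-level data, so they have to be computed from the .CEL files rather than from the RMA-normalised values.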

ADD REPLY • link 5.7 years ago by Kevin Blighe 88k