Hi, I am doing GO (Gene Ontology) analysis in R.
I downloaded this annotation file and loaded it:
annot = read.csv(file = "HuGene-1_0-st.csv", header = TRUE)
dim(annot)
probes = names(datExpr)
> head(probes)
[1] "MKL2" "MAST2" "KAT5" "WWC2" "UBE2Z" "PHYHIPL"
probes2annot = match(probes, annot$transcript_cluster_id)
This gives me all NA.
sum(is.na(probes2annot))
This should return 0 but returns 7243.
What am I doing wrong?
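A quick check that may help narrow this down, sketched with toy values copied from the output in this post (the vectors `probes_toy` and `ids_toy` are made up for illustration and are not the real data). `match()` only returns a position when the two vectors contain identical values, so gene symbols compared against numeric cluster IDs come back as NA for every element.

```r
# Toy values copied from the post; not the real data.
probes_toy <- c("MKL2", "MAST2", "KAT5")
ids_toy <- c(7896738, 7896740, 7896742)

# match() coerces both sides to a common type and compares exact values.
idx <- match(probes_toy, ids_toy)
print(idx)

# Number of probes without a match.
n_missing <- sum(is.na(idx))
print(n_missing)

# Number of identifiers shared by the two sets.
n_shared <- length(intersect(probes_toy, as.character(ids_toy)))
print(n_shared)
```

If the same check on the real `probes` and `annot$transcript_cluster_id` shows no shared values, the two vectors are probably using different identifier types.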
> head(annot)
probeset_id seqname strand start stop probe_count
1 7896739 chr1 + 63033 63649 31
2 7896741 chr1 + 69109 70008 24
3 7896743 chr1 + 334144 334272 6
4 7896745 chr1 + 367693 368597 36
5 7896747 chr1 + 564951 565019 28
6 7896751 chr1 + 568069 568136 28
transcript_cluster_id exon_id psr_id
1 7896738 96595544 97686467
2 7896740 96595546 97686470
3 7896742 96595548 97686473
4 7896744 96595550 97686476
5 7896746 96595552 97686479
6 7896750 96595556 97686485
gene_assignment
1 ENST00000492842 // OR4G11P
2 BC136848 // OR4F17 /// NM_001005240 // OR4F17 /// NM_001004195 // OR4F4 /// ENST00000318050 // OR4F17
3 ---
4 NM_001005277 // OR4F16 /// NM_001005221 // OR4F29 /// NM_001005504 // OR4F21 /// ENST00000456475 // OR4F29 /// ENST00000456475 // OR4F16 /// ENST00000456475 // OR4F3
5 ---
6 ---
mrna_assignment
1 ENST00000492842 // chr1 // 100 // 31 // 31 // 0
2 BC136848 // chr1 // 100 // 24 // 24 // 0 /// NM_001005240 // chr1 // 100 // 24 // 24 // 0 /// NM_001004195 // chr1 // 100 // 24 // 24 // 0 /// ENST00000318050 // chr1 // 100 // 24 // 24 // 0
3 ENST00000455207 // chr1 // 100 // 6 // 6 // 0 /// TCONS_l2_00002387-XLOC_l2_000726 // chr1 // 100 // 6 // 6 // 0 /// TCONS_l2_00002388-XLOC_l2_000726 // chr1 // 100 // 6 // 6 // 0
4 NM_001005277 // chr1 // 100 // 36 // 36 // 0 /// NM_001005221 // chr1 // 100 // 36 // 36 // 0 /// NM_001005504 // chr1 // 89 // 32 // 36 // 0 /// ENST00000456475 // chr1 // 100 // 36 // 36 // 0
5 AK074482 // chr1 // 79 // 22 // 28 // 0
6 NC_001807 // chr1 // 100 // 24 // 24 // 0
crosshyb_type number_independent_probes number_cross_hyb_probes
1 3 0 0
2 3 0 0
3 3 0 0
4 3 0 0
5 3 0 0
6 3 0 0
number_nonoverlapping_probes level bounded noBoundedEvidence
1 4 --- 0 0
2 7 --- 0 0
3 0 --- 0 0
4 6 --- 0 0
5 0 --- 0 0
6 0 --- 0 0
has_cds fl mrna est vegaGene vegaPseudoGene ensGene sgpGene
1 0 0 0 0 0 0 1 0
2 0 1 0 0 0 0 1 0
3 0 0 0 0 0 0 1 0
4 0 3 0 0 0 0 1 0
5 0 0 0 0 0 0 1 0
6 0 0 0 0 0 0 1 0
exoniphy twinscan geneid genscan genscanSubopt mouse_fl
1 0 0 0 0 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 0
4 0 0 0 0 0 0
5 0 0 0 0 0 0
6 0 0 0 0 0 0
mouse_mrna rat_fl rat_mrna microRNAregistry rnaGene mitomap
1 0 0 0 0 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 0
4 0 0 0 0 0 0
5 0 0 0 0 0 0
6 0 0 0 0 0 0
probeset_type
1 main
2 main
3 main
4 main
5 main
6 main
>
Thank you. My data are from this platform:
GPL16791 Illumina HiSeq 2500 (Homo sapiens)
I also tried matching on gene_assignment, as you suggested, but that gives NA as well.
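For reference, a minimal sketch of how gene symbols could be pulled out of gene_assignment before matching. It assumes the layout printed above, where each entry is a list of "accession // symbol" pairs separated by " /// " and "---" means no assignment; the real file may carry more fields. The data frame `annot_toy` and the helper `first_symbol` are hypothetical names used only for illustration.

```r
# Toy annotation with the same column layout as head(annot) above.
annot_toy <- data.frame(
  transcript_cluster_id = c(7896738, 7896740, 7896742),
  gene_assignment = c(
    "ENST00000492842 // OR4G11P",
    "BC136848 // OR4F17 /// NM_001005240 // OR4F17 /// NM_001004195 // OR4F4",
    "---"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the first gene symbol of one entry, or NA.
first_symbol <- function(x) {
  if (is.na(x) || x == "---") return(NA_character_)
  # Keep only the first "accession // symbol" pair.
  first_entry <- strsplit(x, " /// ", fixed = TRUE)[[1]][1]
  fields <- strsplit(first_entry, " // ", fixed = TRUE)[[1]]
  if (length(fields) < 2) return(NA_character_)
  trimws(fields[2])
}

annot_toy$symbol <- vapply(annot_toy$gene_assignment, first_symbol,
                           character(1), USE.NAMES = FALSE)

# Match toy probe symbols against the extracted symbols.
probes_toy <- c("OR4F17", "MKL2")
probes2annot_toy <- match(probes_toy, annot_toy$symbol)
print(probes2annot_toy)
```

Matching against the whole gene_assignment string would be expected to give NA, because `match()` needs an exact match and each entry holds more than the bare symbol. This sketch keeps only the first symbol per row, so symbols listed later in an entry would still go unmatched.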