As a simple way of associating genes with ChIP-seq binding regions, I want to use the UCSC canonical gene set, so that each gene is represented by a single transcript rather than several.
Using the UCSC Table Browser, I am looking to extract the following information:
chr [ chrom from hg18.knownCanonical]
start [ chromStart from hg18.knownCanonical]
end [ chromEnd from hg18.knownCanonical]
strand [ strand from hg18.knownGene]
ucsc_id [ name from hg18.knownGene]
gene symbol [ best source? at present geneName from hg18.refFlat]
I can get these items by selecting the fields shown between [ and ] from the corresponding tables.
I can get the gene symbol from the RefSeq-related tables, but this restricts symbols to the RefSeq set of genes (possibly not a bad thing...).
How does the UCSC browser link a gene symbol (e.g. SYN3) to an entry in the knownGene track, so that it displays the symbol instead of the ID uc003amx.1?
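To make the intended output concrete, here is a minimal Python sketch of the join I have in mind. It assumes that the transcript column of knownCanonical holds the same IDs as the name column of knownGene, and it takes the ID-to-symbol lookup as a plain dict, since where that mapping should come from is the open question. The function name join_canonical, the second ID and all coordinates and strands are made-up placeholders, not real hg18 values.

```python
def join_canonical(canonical_rows, known_gene_rows, symbol_by_id):
    """Build one record per canonical gene.

    canonical_rows:  dicts with chrom, chromStart, chromEnd, transcript
    known_gene_rows: dicts with name, strand
    symbol_by_id:    dict mapping UCSC ID -> gene symbol (source left open)
    """
    # Index strand by UCSC ID so each canonical row is a single lookup.
    strand_by_id = {row["name"]: row["strand"] for row in known_gene_rows}

    records = []
    for row in canonical_rows:
        ucsc_id = row["transcript"]  # assumed to match knownGene.name
        records.append({
            "chr": row["chrom"],
            "start": int(row["chromStart"]),
            "end": int(row["chromEnd"]),
            "strand": strand_by_id.get(ucsc_id, "."),  # "." if no match
            "ucsc_id": ucsc_id,
            # Fall back to the UCSC ID when no symbol is known.
            "gene_symbol": symbol_by_id.get(ucsc_id, ucsc_id),
        })
    return records


# Toy rows: placeholder coordinates and strands, not real annotation.
canonical = [
    {"chrom": "chr22", "chromStart": "1000", "chromEnd": "2000",
     "transcript": "uc003amx.1"},
    {"chrom": "chr1", "chromStart": "500", "chromEnd": "900",
     "transcript": "uc000xyz.1"},  # hypothetical ID with no symbol
]
known_gene = [
    {"name": "uc003amx.1", "strand": "-"},
    {"name": "uc000xyz.1", "strand": "+"},
]
symbols = {"uc003amx.1": "SYN3"}

rows = join_canonical(canonical, known_gene, symbols)
for rec in rows:
    print("\t".join(str(rec[k]) for k in
                    ("chr", "start", "end", "strand", "ucsc_id", "gene_symbol")))
```

Falling back to the UCSC ID keeps every canonical gene in the output even when no symbol is available, which matters if the symbol source only covers RefSeq genes.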
For now I only need this for human (hg18).
Thanks!
You are welcome :)