Hi
I'm sorry for the basic question, but I am confused about the UCSC Table Browser and would really appreciate some clarification.
I believe the UCSC Table Browser is built on an underlying genomic database that is conceptually analogous to, say, the Ensembl database, in that it contains tables of information about genomic features.
I believe tables that have positional information can be displayed as tracks on the genome browser. These are the main tables. I believe that other tables, which contain descriptive information about the main tables, are auxiliary tables. Both types of table can be downloaded as text files from the browser, but information in auxiliary tables is obtained by linking from a main table.
I don't understand how the database handles basic biological models, such as the fact that a gene has many exons which can be arranged in different combinations to give different transcripts. For example, in the KnownGenes tables I was able to find a list of exons for a gene, but I didn't know how to find the transcripts for a gene. The best I could find was a table called all_mrna, which seemed to link back to the RefSeq gene table by a field called qname. But I couldn't tell from looking at an mRNA which of the gene's exons were in it.
But if you download the sequence for the known genes table you can get just the coding exons, so how does the browser know which exons are coding?
I don't understand how this is done. How would I use the Table Browser to get table data for all coding exons in a format such as chr, start, end? An example of an SQL query to achieve this against the underlying database would be much appreciated.
Thanks a lot.
Can anyone provide the SQL query?
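
For what it's worth, here is a minimal sketch of one way the query could look. It assumes the gene table uses the genePred-style layout, in which each row is one transcript (not one gene) and carries cdsStart, cdsEnd and comma-separated exonStarts and exonEnds lists. Under that layout the exons of a transcript are simply the ones listed in its row, and the coding part of each exon is whatever overlaps the cdsStart to cdsEnd interval. Because the exon lists are packed into a single column, plain SQL can select the rows but the splitting and clipping is easier in a script. The table below is a toy in-memory SQLite copy with made-up rows, and coding_exons is a hypothetical helper name, so treat this as an illustration and not as the browser's own method.

```python
import sqlite3

# Toy stand-in for a genePred-style gene table: one row per transcript,
# 0-based half-open coordinates, exon boundaries as comma-separated lists.
# The rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE knownGene (
           name TEXT, chrom TEXT, strand TEXT,
           txStart INTEGER, txEnd INTEGER,
           cdsStart INTEGER, cdsEnd INTEGER,
           exonCount INTEGER, exonStarts TEXT, exonEnds TEXT)"""
)
conn.executemany(
    "INSERT INTO knownGene VALUES (?,?,?,?,?,?,?,?,?,?)",
    [
        ("tx1", "chr1", "+", 100, 1000, 250, 850, 3, "100,400,800,", "300,500,1000,"),
        ("tx2", "chr1", "+", 100, 1000, 450, 850, 2, "100,400,", "300,1000,"),
        ("tx3", "chr2", "-", 50, 90, 90, 90, 1, "50,", "90,"),  # no CDS
    ],
)

# Transcripts with an empty CDS (cdsStart == cdsEnd) are skipped as non-coding.
QUERY = """SELECT name, chrom, cdsStart, cdsEnd, exonStarts, exonEnds
           FROM knownGene
           WHERE cdsStart < cdsEnd
           ORDER BY chrom, txStart, name"""


def coding_exons(connection):
    """Return (chrom, start, end, transcript) for the coding part of each exon."""
    rows = []
    for name, chrom, cds_start, cds_end, starts, ends in connection.execute(QUERY):
        # The lists end with a trailing comma, so drop empty fields.
        exon_starts = [int(s) for s in starts.split(",") if s]
        exon_ends = [int(e) for e in ends.split(",") if e]
        for start, end in zip(exon_starts, exon_ends):
            # Clip the exon to the coding region; drop exons that are pure UTR.
            start, end = max(start, cds_start), min(end, cds_end)
            if start < end:
                rows.append((chrom, start, end, name))
    return rows


for row in coding_exons(conn):
    print(*row, sep="\t")
```

Against a real MySQL copy of the table the exon list columns may come back as bytes instead of text, in which case they would need decoding before the split. In the Table Browser itself, choosing BED as the output format and then asking for one record per coding exon should give the same kind of chr, start, end listing without any SQL.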