Hello, I don't understand why the same gene appears several times with a different number of transcripts and, more surprisingly, a different start position. For example:
entrezgene  hgnc_symbol  ensembl_gene_id  chromosome_name  start_position  end_position  strand  gene_biotype    percentage_gc_content  transcript_count
2934                     ENSG00000283430  9                121300103       121332603     1       protein_coding  49.64                  1
2934                     ENSG00000283299  9                121282452       121332603     1       protein_coding  50.09                  1
2934        GSN          ENSG00000148180  9                121207794       121332843     1       protein_coding  46.87                  15
I downloaded all human genes that have an entrezgene ID using biomaRt, but now I am confused as to why this happens. I've got a couple of other examples (not that many) of multiple start positions for the same gene ID :(
thanks!
Right, if I also download the transcription start site and the transcript biotype it becomes evident. Thank you! I really liked the part about "The concept of gene is a bit an annoying definition"; it was a helpful reminder.
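In case it helps anyone who wants to list the other Entrez IDs that map to more than one Ensembl gene, here is a minimal sketch in Python. It only uses the three rows quoted in the question; the tuple layout and the function name `ensembl_ids_by_entrez` are my own choices for illustration, not anything biomaRt provides, and a real table exported from biomaRt would have to be loaded into the same shape first.

```python
from collections import defaultdict

# Rows copied from the biomaRt output in the question. The tuple layout
# (entrezgene, hgnc_symbol, ensembl_gene_id, start, end, transcript_count)
# is an assumption made for this sketch.
rows = [
    (2934, "", "ENSG00000283430", 121300103, 121332603, 1),
    (2934, "", "ENSG00000283299", 121282452, 121332603, 1),
    (2934, "GSN", "ENSG00000148180", 121207794, 121332843, 15),
]


def ensembl_ids_by_entrez(rows):
    """Group rows by Entrez ID and keep only IDs with more than one Ensembl gene."""
    grouped = defaultdict(list)
    for entrez, _symbol, ensembl_id, start, end, n_tx in rows:
        grouped[entrez].append((ensembl_id, start, end, n_tx))
    return {entrez: genes for entrez, genes in grouped.items() if len(genes) > 1}


multi = ensembl_ids_by_entrez(rows)
for entrez, genes in multi.items():
    print(entrez, "maps to", len(genes), "Ensembl gene IDs")
    # Sort by start position so the differing coordinates are easy to compare.
    for ensembl_id, start, end, n_tx in sorted(genes, key=lambda g: g[1]):
        print(f"  {ensembl_id}  {start}-{end}  transcripts={n_tx}")
```

Each Entrez ID that survives the filter is one of the "multiple start positions" cases, and the Ensembl gene IDs listed under it are the separate records behind the duplicated rows.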