Hi everyone,
I am wondering whether there is a standard or consensus on how to define promoter regions that fall within the gene body of another gene. To expand: say I define a promoter as the 1000 bp upstream of the TSS, and I want to intersect CpG methylation sites with annotated regions (gene body, promoter, etc.) to find regulatory regions. Now, let's assume that in one case the site falls within the gene body of gene "x", but it also falls within the promoter region of a nearby gene, "y". (I mention one case, but in reality we can observe many such cases across mammalian genomes.)
Given that the site directly intersects the gene body of gene "x", should we exclude it from the promoter region of the nearby gene "y"? My guess is that there is no real "right" or "wrong" here, but I am wondering whether there is a standard that the community generally follows. In the end, I am sure that only extensive validation, directly testing for a regulatory mechanism between this site and gene "x" and/or "y", would actually tell us something, but again, I am looking to see if there is a general standard we should follow in bioinformatics annotation workflows.
To complement, attached is an image which shows this example from UCSC. For the sake of a visual, assume that the site falls within the red rectangle I highlighted. In this case, the site falls directly within an intron of SPATA1, but also in the upstream region we may define as the promoter of GNG5 (in this particular case the two genes overlap a bit, but there are many other cases where the genes do not directly overlap while the promoter region of one still overlaps the other).
What do you all think?
Image from specific example below:
There's no community standard here, though in most cases the region you've marked in red would be kept as part of the promoter.
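To make the "keep it" option concrete, here is a minimal sketch of a non-exclusive annotation, where a CpG site simply receives every label it overlaps instead of being forced into one category. Everything in it is an assumption for illustration: the `Gene` class, the `promoter_interval` and `annotate_site` helpers, the 0-based half-open (BED-style) coordinates, and the made-up coordinates for the hypothetical genes `geneX` and `geneY`. It is not a standard workflow, just one way to express the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive (BED-style)
    strand: str  # "+" or "-"


def promoter_interval(gene, upstream=1000):
    """Return the (start, end) window `upstream` bp upstream of the TSS."""
    if gene.strand == "+":
        # TSS is the first base of the gene; the promoter lies to its left.
        return max(0, gene.start - upstream), gene.start
    # On the minus strand the TSS is the last base; the promoter lies to its right.
    return gene.end, gene.end + upstream


def annotate_site(chrom, pos, genes, upstream=1000):
    """List every (gene, feature) pair overlapping a 0-based CpG position."""
    labels = []
    for g in genes:
        if g.chrom != chrom:
            continue
        if g.start <= pos < g.end:
            labels.append((g.name, "gene_body"))
        p_start, p_end = promoter_interval(g, upstream)
        if p_start <= pos < p_end:
            labels.append((g.name, "promoter"))
    return labels


# Made-up coordinates: geneY (minus strand) ends inside geneX (plus strand),
# so geneY's promoter window lies within the body of geneX.
genes = [
    Gene("geneX", "chr1", 10_000, 50_000, "+"),
    Gene("geneY", "chr1", 5_000, 12_000, "-"),
]

print(annotate_site("chr1", 12_400, genes))
```

With these toy coordinates the site is reported both as gene body of `geneX` and as promoter of `geneY`. Keeping both labels leaves the decision about which gene is actually regulated to downstream analysis or validation, rather than baking it into the annotation step.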
In addition to Devon's comment: from what I understand, these situations (and variations of them) are quite common across the genome. I imagine that the 2 genes in the diagram would be expressed in different tissues and/or under different cellular states, i.e., as the chromatin responds to stimuli and opens/closes in such a way that transcription of one is favoured over the other. If both were transcribed at the same time, some form of polymerase 'blockage' could occur, or something along those lines...
Very true Kevin, thank you for that additional thought and thank you Devon.