2.2 years ago
bioinfo2345
I have a genome with the following two features:
- Genome coordinates for a gene taken from BLAST (1-based): 320790 to 320341.
- Genome coordinates for an intergenic region from bedtools (0-based): 326373 to 339263.
I know how to find the size of the gene in a 1-based system: 320790-320341+1 = 450 bp. I know how to find the size of the intergenic region in a 0-based system: 339263-326373 = 12890 bp.
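The two length calculations above can be sketched as a pair of small helpers. The function names are made up for illustration, and the 0-based case assumes the usual BED convention (0-based start, end not included).

```python
def length_1based_inclusive(a, b):
    """Length of a 1-based, inclusive interval; a and b may be given in either order."""
    return abs(b - a) + 1


def length_0based_half_open(start, end):
    """Length of a 0-based interval whose end is exclusive (assumed BED convention)."""
    return end - start


print(length_1based_inclusive(320790, 320341))  # 450
print(length_0based_half_open(326373, 339263))  # 12890
```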
But how far is the gene from the intergenic region? How do I find the distance between two features when one is 0-based and the other is 1-based?
My naive strategy is to convert the 0-based bedtools coordinates to 1-based coordinates by taking start-1 as the new start:
- Genome coordinates for an intergenic region from bedtools (0-based) but converted to 1-based: 326372 to 339263.
Then I do a 1-based subtraction: 326372-320790+1 = 5583 bp.
Two questions:
- Is this correct?
- Does the method I should use depend on whether the gene lies ahead of or behind the intergenic region of interest?
Hi! Maybe I am missing the point, but to get the distance from one feature (the gene) to the other (the intergenic region), is it not as simple as subtracting the end position of the upstream feature from the start position of the downstream feature?
As you mentioned, you should first convert both annotations to the same coordinate system (0-based or 1-based, whichever you prefer).
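A minimal sketch of that conversion and subtraction, under two assumptions: the BLAST hit is listed end-first only because it lies on the reverse strand, and the bedtools interval follows the BED convention (0-based start, exclusive end). The helper names are hypothetical. Both features are put into 0-based half-open form, where a 1-based inclusive start becomes start-1 and the end is unchanged; going the other way, a BED start becomes start+1 in 1-based coordinates.

```python
def to_half_open(a, b):
    """Convert a 1-based inclusive interval (either orientation) to 0-based half-open."""
    lo, hi = min(a, b), max(a, b)
    return lo - 1, hi


def gap(x, y):
    """Bases strictly between two 0-based half-open intervals; 0 if they touch or overlap."""
    return max(0, max(x[0], y[0]) - min(x[1], y[1]))


gene = to_half_open(320790, 320341)   # (320340, 320790)
intergenic = (326373, 339263)         # already 0-based half-open

print(gap(gene, intergenic))  # 5583
print(gap(intergenic, gene))  # 5583, the order of the arguments does not matter
```

With these assumptions, the same number falls out in 1-based inclusive coordinates as 326374 - 320790 - 1 = 5583, and because `gap` takes the larger start and the smaller end, it gives the same answer whether the gene lies before or after the intergenic region.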
If you use a 1-based system, do you not need to add +1 to such a subtraction for it to make sense?
Let us take a toy example: 1-GTCTCG-6 (1-based)
If the first G is part of the gene and the last G is part of the intergenic region, 6-1 = 5 but the distance is really 4 nucleotides?
Or does this depend on if you look for the inclusive or exclusive distance?
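The toy example can be checked directly. This sketch only illustrates how the three possible answers differ by one; which of them counts as "the distance" is a matter of definition, and the variable names are illustrative.

```python
seq = "GTCTCG"
first, last = 1, 6  # 1-based positions of the two G's

between = last - first - 1  # bases strictly between the two G's (exclusive)
offset = last - first       # steps needed to walk from the first G to the last
span = last - first + 1     # bases covered including both G's (inclusive)

print(between, offset, span)  # 4 5 6
print(seq[first:last - 1])    # TCTC, the four bases between the two G's
```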