4.4 years ago
rohitsatyam102
Hello Everyone!!
I have a file containing the genomic coordinates of enhancers and of their associated genes. I have been asked to check whether these coordinates fall within TAD boundaries. For that purpose, I have decided to use the TADs from Dixon et al. However, I have never done Hi-C analysis, so I am unsure how to achieve this.
The TAD matrix contains the same coordinates in both columns:
## chr1:770137-1250137 chr1 770137-1250137 *
## chr1:1250137-1850140 chr1 1250137-1850140 *
## chr1:1850140-2330140 chr1 1850140-2330140 *
## chr1:2330140-3650140 chr1 2330140-3650140 *
## chr1:4660140-6077413 chr1 4660140-6077413 *
## chr1:6077413-6277413 chr1 6077413-6277413 *
How should I interpret this? Any kind of help will be appreciated.
Hi, I plan to use bedtools; however, I don't understand why the same coordinates are repeated twice in the file. Any idea?
I don't see any coordinates repeated in your example.
I have a file that contains columns with the same coordinates, as shown below.
Why are they repeated twice? What does it mean? Also, I don't understand why the coordinates in the columns are contiguous, i.e. the end of each interval is the start of the next one.
Okay so I found some explanation on the UCSC genome browser here:
But what do "Start position of lower region" and "End position of upper region" mean?
If you understand what TADs are, you will understand why most of them are represented contiguously. If exactly the same coordinates are repeated, you can remove the duplicates.
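For anyone landing here with the same task, below is a minimal Python sketch of the check described in the question. It assumes the TAD rows look like the ones pasted above (first field `chrom:start-end`), that TADs on a chromosome do not overlap, and that intervals are half-open, so a shared boundary position belongs to the TAD that starts there. The function names and the enhancer/gene coordinates in the usage lines are hypothetical and only serve as illustration; they do not come from the Dixon et al. data.

```python
import re
from bisect import bisect_right
from collections import defaultdict

# The six rows pasted in the question, including the leading "## ".
TAD_LINES = """\
## chr1:770137-1250137 chr1 770137-1250137 *
## chr1:1250137-1850140 chr1 1250137-1850140 *
## chr1:1850140-2330140 chr1 1850140-2330140 *
## chr1:2330140-3650140 chr1 2330140-3650140 *
## chr1:4660140-6077413 chr1 4660140-6077413 *
## chr1:6077413-6277413 chr1 6077413-6277413 *
""".splitlines()


def parse_tads(lines):
    """Return {chrom: sorted list of (start, end)}; exact duplicates are dropped."""
    tads = defaultdict(set)
    for line in lines:
        fields = line.lstrip("# ").split()
        if not fields:
            continue
        match = re.fullmatch(r"(\S+):(\d+)-(\d+)", fields[0])
        if match is None:
            continue  # skip headers or malformed rows
        chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
        tads[chrom].add((start, end))
    return {chrom: sorted(intervals) for chrom, intervals in tads.items()}


def find_tad(tads, chrom, start, end):
    """Return the TAD that fully contains [start, end), or None if there is none."""
    intervals = tads.get(chrom, [])
    starts = [s for s, _ in intervals]
    i = bisect_right(starts, start) - 1  # last TAD starting at or before `start`
    if i >= 0 and end <= intervals[i][1]:
        return intervals[i]
    return None


def same_tad(tads, chrom, enhancer, gene):
    """True if the enhancer and the gene both lie inside the same TAD."""
    enhancer_tad = find_tad(tads, chrom, *enhancer)
    gene_tad = find_tad(tads, chrom, *gene)
    return enhancer_tad is not None and enhancer_tad == gene_tad


tads = parse_tads(TAD_LINES)
# Hypothetical enhancer and gene intervals, for illustration only.
print(same_tad(tads, "chr1", (800000, 801000), (1200000, 1210000)))
print(same_tad(tads, "chr1", (800000, 801000), (1300000, 1310000)))
```

A feature that spans a boundary, or that falls in a gap between TADs (for example between 3650140 and 4660140 above), is reported as not being inside any TAD. The same containment test can also be done with bedtools if you prefer to stay with BED files.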