Hello everyone,
I recently downloaded the chicken "refseq gene" file and the "gap" file from UCSC, and I found that some gaps overlap RefSeq genes, some even lying entirely inside a gene. What is going on here, and why? Can we discard the genes that overlap gaps or not? I couldn't find a similar question, so I need your help. Thanks a lot!
An example is shown below:
refseq gene
#bin name chrom strand txStart txEnd
74 NM_001199320 chr1 - 1371140 1821501
gaps
chrom chromStart chromEnd size
chr1 1581252 1614100 32848
chr1 1615830 1637787 21957
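The overlap in this example can be confirmed with a simple interval check. This is only a minimal sketch: it assumes both tables use UCSC's 0-based, half-open coordinates, and the names `overlaps` and `gaps_in_gene` are made up for illustration.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def gaps_in_gene(gene, gaps):
    """Return the gaps on the gene's chromosome that overlap its transcribed region."""
    name, chrom, tx_start, tx_end = gene
    return [g for g in gaps
            if g[0] == chrom and overlaps(tx_start, tx_end, g[1], g[2])]


# Values copied from the example rows above.
gene = ("NM_001199320", "chr1", 1371140, 1821501)
gaps = [("chr1", 1581252, 1614100), ("chr1", 1615830, 1637787)]

for chrom, start, end in gaps_in_gene(gene, gaps):
    print(gene[0], "overlaps gap", chrom, start, end, "size", end - start)
```

Both gaps fall inside txStart..txEnd, so both are reported. Note that this only tests the transcribed region as a whole; it does not say whether a gap hits an exon or an intron.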
I don't know the chicken sequence specifically, but in principle a "gap" can be anywhere: it indicates that the underlying sequence is unknown for some reason, and it should be filled with Ns. Edit: I think you can check this by looking at the FASTA sequence at the gap's positions.
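A rough way to do that check in plain Python is sketched below. It assumes the gap coordinates are 0-based and half-open (so they map directly onto a Python slice), and the helpers `read_fasta` and `n_fraction` are hypothetical names, not part of any library. The toy sequence stands in for the real chromosome.

```python
def read_fasta(lines):
    """Parse FASTA lines into a dict mapping sequence name to sequence string."""
    seqs = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def n_fraction(seq, start, end):
    """Fraction of N/n bases in seq[start:end] (0-based, half-open)."""
    region = seq[start:end].upper()
    if not region:
        return 0.0
    return region.count("N") / len(region)


# Toy data; for a real check, pass an open file handle for the genome FASTA.
toy = [">chr1 toy example", "ACGTNNNN", "NNACGT"]
seqs = read_fasta(toy)
print(n_fraction(seqs["chr1"], 4, 10))  # region covering the run of Ns
print(n_fraction(seqs["chr1"], 0, 4))   # region with known bases only
```

If the gap annotation and the assembly agree, the fraction over each gap interval should be 1.0. This sketch holds whole sequences in memory, which is fine for one chromosome but an indexed FASTA reader would be a better fit for a full genome.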
You can also align the RefSeq sequences to the chicken genome and inspect the alignment.