4.3 years ago
Alewa
Hi all,
I downloaded a few Ensembl genes with coordinates (Human genes, GRCh38.p13) from BioMart and want to convert them to UCSC hg38 coordinates. Some contigs are not in the hg38 reference dictionary.
See this example (full file below): CHR_HSCHR5_1_CTG1_1 70376388 70411404 ENSG00000276910 GTF2H2
I didn't see an option for this in hgLiftOver: https://genome.ucsc.edu/cgi-bin/hgLiftOver
Any suggestions on how to accomplish this?
Thanks, Sam
5 87318416 87412930 CCNH ENSG00000134480 q14.3
19 4090321 4124122 MAP2K2 ENSG00000126934 p13.3
17 28346633 28357527 POLDIP2 ENSG00000004142 q11.2
CHR_HSCHR6_MHC_MANN_CTG1 30952827 30958749 ENSG00000233149 GTF2H4
15 80152490 80186946 FAH ENSG00000103876 q25.1
2 216107464 216206303 XRCC5 ENSG00000079246 q35
1 241497603 241519755 FH ENSG00000091483 q43
5 157142933 157255185 ITK ENSG00000113263 q33.3
6 111299028 111483715 REV3L ENSG00000009413 q21
10 70597348 70602759 PRF1 ENSG00000180644 q22.1
CHR_HSCHR5_1_CTG1_1 70376388 70411404 ENSG00000276910 GTF2H2
Are these genes specifically in patches? If so, they may not have been present in the original hg38 release. The reference sequence (as released by the GRC) used by all annotators should be identical. The CCNH gene is in UCSC.
Thanks @genomax for the explanation. Yes, it seems these genes are also in the patches. What is usually done in the community? This is a list of DNA repair genes that I want to intersect with my VCFs. Any suggestions on how to go about this correctly?
Sorry if I was not clear. These genes do not seem to be ONLY in patches, i.e. they were present in the original assembly release. UCSC and Ensembl may annotate genes differently. For example, UCSC has many entries for the CCNH gene (one of them overlaps the one in your list, though the stop nucleotide in UCSC is different).
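For the renaming part of the question, here is a minimal Python sketch, not a definitive solution. It assumes the only difference on the primary chromosomes is naming (Ensembl `5` versus UCSC `chr5`, and `MT` versus `chrM`), and it sets aside rows on patch or alternate scaffolds such as `CHR_HSCHR5_1_CTG1_1` instead of guessing a UCSC name for them. The function names and the optional `alias` dictionary are hypothetical; `alias` would have to be filled from a mapping table you trust.

```python
# Primary chromosome names as they appear in Ensembl/BioMart output.
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y"}


def ensembl_to_ucsc(name, alias=None):
    """Return a UCSC-style sequence name, or None if no mapping is known.

    `alias` is an optional, user-supplied dict (hypothetical here) that maps
    Ensembl patch/alt scaffold names to UCSC names.
    """
    if name in PRIMARY:
        return "chr" + name
    if name == "MT":
        return "chrM"
    if alias and name in alias:
        return alias[name]
    return None


def split_rows(lines, alias=None):
    """Split whitespace-delimited rows into (converted, unmapped) lists."""
    converted, unmapped = [], []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        ucsc = ensembl_to_ucsc(fields[0], alias)
        if ucsc is None:
            unmapped.append(fields)  # keep for manual review
        else:
            converted.append([ucsc] + fields[1:])
    return converted, unmapped


# Two rows taken from the list posted above.
rows = [
    "5 87318416 87412930 CCNH ENSG00000134480 q14.3",
    "CHR_HSCHR5_1_CTG1_1 70376388 70411404 ENSG00000276910 GTF2H2",
]
converted, unmapped = split_rows(rows)
print(converted)
print(unmapped)
```

Rows that end up in `unmapped` are the ones worth checking by hand: if the same gene also has an entry on a primary chromosome, that entry may be the one to intersect with the VCFs, provided the VCFs were called against the primary assembly only.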