4.6 years ago
blazer9131
Hi All,
I have a list of 4000 genes (Transcript IDs/HUGO symbols) which I want to annotate with the following canonical information:
1 1190583 1190867 - UBE2J2_NM_194315_exon8_544-828_182-276
1 1191425 1191505 - UBE2J2_NM_194315_exon7_463-543_155-181
1 1192372 1192510 - UBE2J2_NM_194315_exon6_324-462_108-154
1 1192588 1192690 - UBE2J2_NM_194315_exon5_221-323_74-108
1 1198726 1198766 - UBE2J2_NM_194315_exon4_180-220_60-74
1 1200163 1200210 - UBE2J2_NM_194315_exon3_132-179_44-60
1 1203242 1203372 - UBE2J2_NM_194315_exon2_1-131_1-44
I have tried downloading various formats of the refFlat database from UCSC (hg19), but I can't get a format similar to this, and I can't find a way to convert their formats because I can't find the exon and codon/AA information (for example, the 1-131 and 1-44 parts for exon 2). I have all the exon start/stop info and the exon numbers, but not the other data.
Any help would be amazing.
Thank you!
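For what it's worth, the extra fields in the rows above look derivable from refFlat itself. Below is a minimal Python sketch, not a definitive converter. It assumes the usual refFlat column order (geneName, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds) with 0-based starts. It also assumes, judging only from the seven sample rows, that each row is the coding part of an exon in 1-based inclusive coordinates, that the first range is the position within the coding sequence, and that the second range is the codon number of the first and last base. The function name `refflat_to_cds_rows` is hypothetical, and the example record uses made-up UTR and exon 1 coordinates around the coding intervals from the post.

```python
def refflat_to_cds_rows(line):
    """Turn one refFlat record into (chrom, start, end, strand, name) rows.

    Assumes refFlat columns: geneName, name, chrom, strand, txStart, txEnd,
    cdsStart, cdsEnd, exonCount, exonStarts, exonEnds (0-based, half-open).
    """
    f = line.rstrip("\n").split("\t")
    gene, tx, chrom, strand = f[0], f[1], f[2], f[3]
    cds_start, cds_end = int(f[6]), int(f[7])
    starts = [int(x) for x in f[9].rstrip(",").split(",")]
    ends = [int(x) for x in f[10].rstrip(",").split(",")]

    # refFlat lists exons in ascending genomic order; on the minus strand
    # exon 1 is the one with the highest coordinates.
    exons = list(zip(starts, ends))
    if strand == "-":
        exons.reverse()

    chrom = chrom[3:] if chrom.startswith("chr") else chrom
    rows, offset = [], 0  # offset = coding bases consumed so far
    for number, (start, end) in enumerate(exons, start=1):
        lo, hi = max(start, cds_start), min(end, cds_end)
        if lo >= hi:
            continue  # exon is entirely UTR
        first, last = offset + 1, offset + (hi - lo)
        offset = last
        aa_first = (first - 1) // 3 + 1  # codon holding the first base
        aa_last = (last - 1) // 3 + 1    # codon holding the last base
        name = f"{gene}_{tx}_exon{number}_{first}-{last}_{aa_first}-{aa_last}"
        rows.append((chrom, lo + 1, hi, strand, name))  # 1-based start
    return sorted(rows, key=lambda r: r[1])


# Hypothetical record: coding intervals taken from the rows in the post,
# UTR extents and exon 1 invented purely for illustration.
EXAMPLE = "\t".join([
    "UBE2J2", "NM_194315", "chr1", "-",
    "1190000", "1209200", "1190582", "1203372", "8",
    "1190000,1191424,1192371,1192587,1198725,1200162,1203241,1209000,",
    "1190867,1191505,1192510,1192690,1198766,1200210,1203400,1209200,",
])

for row in refflat_to_cds_rows(EXAMPLE):
    print(*row, sep="\t")
```

With that illustrative record, the first and last printed rows match the exon 8 and exon 2 rows in the post. Genes with several transcripts have several refFlat records, so picking the canonical one is a separate step that this sketch does not handle.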
Hello blazer9131!
It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/13026/how-to-get-refflat-file-which-includes-exon-codon-lengths-example-in-post
This is typically not recommended as it runs the risk of annoying people in both communities.
Ah, I thought it would be nice to reach out to both communities to see whether I could get an answer from either one.
Will keep that in mind for the future :)