How can I curate all exon start-end positions for 50 genes (to create a GTF file)?
1 day ago
Esra • 0

Hi, I'm trying to create a GTF file to use in the Bionano Access platform. When I create a BED file, I can't see exon positions; I can only see the whole-gene position. When I checked the hg38 genes.gtf file, I saw that exons are included in that format. So I decided to create a GTF file for my genes, including exon positions. How can I easily curate exon start-end positions for each gene, without doing it manually, gene by gene? Thank you in advance.

gtf curation gene • 203 views

Thanks for your reply. First of all, I’m sorry if my question is unclear; I’m still a beginner in working with these file formats. From what I understand, the GTF and GFF file formats are different, and because of that I couldn’t directly convert my BED file (which only contains gene start-end positions) into a GTF file that includes exon start-end positions. I’m not sure how to retrieve all exon start-end positions efficiently. I checked Ensembl and NCBI RefSeq but couldn’t find an easy way to extract this information for my genes. Then I explored the UCSC Table Browser, which seems more useful than the others, but it’s still quite complicated for me. Could you provide more details on how AGAT can help with this specific task? Or is there another straightforward way to curate exon positions for multiple genes? Thank you in advance.


"I checked Ensembl and NCBI RefSeq but couldn’t find an easy way to extract this information for my genes."

If you are trying to select annotation for a subset of genes from one of these GTF files then you could use a different tool in AGAT: https://agat.readthedocs.io/en/latest/tools/agat_sp_extract_attributes.html

As @LauferVA noted, there will be more than one isoform for many of the genes, so you will need to keep that in mind as you subset. If you need only one transcript entry per gene, then using MANE Select may be the best option.
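If you would rather see what such a subset involves without installing anything, here is a minimal Python sketch, not an AGAT replacement. It assumes an Ensembl/GENCODE-style GTF in which exon records carry `gene_name "..."` and, for the MANE Select transcript, `tag "MANE_Select"`; other sources (for example RefSeq) may name these attributes differently, so check your own file first. The function names and the toy records are made up for illustration.

```python
import re

# Matches key "value" pairs in the GTF attribute column (column 9).
# Unquoted values (e.g. some exon_number fields) are ignored, which is
# fine here because only gene_name and tag are inspected.
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(field):
    """Return a dict mapping each attribute key to the list of its values."""
    attrs = {}
    for key, value in ATTR_RE.findall(field):
        attrs.setdefault(key, []).append(value)
    return attrs


def select_exons(gtf_lines, gene_names, required_tag=None):
    """Yield exon lines for the wanted genes, optionally requiring a tag."""
    wanted = set(gene_names)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header/comment line
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue  # keep exon features only
        attrs = parse_attributes(cols[8])
        if not wanted.intersection(attrs.get("gene_name", [])):
            continue  # not one of the requested genes
        if required_tag and required_tag not in attrs.get("tag", []):
            continue  # e.g. skip non-MANE transcripts
        yield line.rstrip("\n")


# Toy records (invented): GENEA has two isoforms, T1 tagged MANE_Select.
TOY_GTF = [
    "#!genome-build toy",
    'chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene_name "GENEA";',
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENEA"; tag "MANE_Select";',
    'chr1\ttoy\texon\t400\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENEA"; tag "MANE_Select";',
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; gene_name "GENEA";',
    'chr1\ttoy\texon\t700\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; gene_name "GENEA";',
    'chr2\ttoy\texon\t50\t80\t.\t-\t.\tgene_id "G2"; transcript_id "T3"; gene_name "GENEB";',
]

for exon_line in select_exons(TOY_GTF, ["GENEA"], required_tag="MANE_Select"):
    print(exon_line)
```

On a real file you would pass an open file handle instead of `TOY_GTF` and your 50 gene symbols instead of `["GENEA"]`, then write the yielded lines to a new GTF. Leaving `required_tag` as `None` keeps the exons of every isoform.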


First, "genes" as such do not really have start and end positions; transcripts do.

I'm not trying to be pedantic or annoying; this is true and fairly important.

To answer your question, you must first select a transcript isoform for each gene.

If you don't understand this, please do some background reading or post here for guidance. If you pick, say, MANE Select for all of them, the question becomes a lot easier to answer.
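To make the isoform point concrete, here is a small Python sketch that derives a start and end for each transcript from its exon records. It assumes Ensembl/GENCODE-style `gene_name` and `transcript_id` attributes; the function name and the toy records are invented for illustration.

```python
import re

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def transcript_spans(gtf_lines):
    """Map (gene_name, transcript_id) -> (start, end) using exon records."""
    spans = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header/comment line
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue  # only exons define the span here
        attrs = dict(ATTR_RE.findall(cols[8]))
        key = (attrs.get("gene_name"), attrs.get("transcript_id"))
        start, end = int(cols[3]), int(cols[4])
        if key in spans:
            old_start, old_end = spans[key]
            start, end = min(start, old_start), max(end, old_end)
        spans[key] = (start, end)
    return spans


# Toy records (invented): two isoforms of the same gene.
TOY_GTF = [
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENEA";',
    'chr1\ttoy\texon\t400\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENEA";',
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; gene_name "GENEA";',
    'chr1\ttoy\texon\t700\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; gene_name "GENEA";',
]

for (gene, transcript), (start, end) in sorted(transcript_spans(TOY_GTF).items()):
    print(gene, transcript, start, end)
```

In the toy data the two isoforms of the same gene end at different coordinates, so "the" gene end depends on which transcript you choose. That is why a transcript has to be picked before exon positions can be listed.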
