I'll use the GENCODE human GTF (release 36, the latest at the time of writing) as an example.
curl ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_36/gencode.v36.annotation.gtf.gz | \
gunzip > human_gencode_36.gtf
You can import the GTF file into R using rtracklayer::import
and keep only the columns you want.
library("tidyverse")
library("rtracklayer")
gtf <- import("human_gencode_36.gtf") %>%
  as_tibble() %>%
  distinct(gene_id, gene_name, gene_type)
> gtf
# A tibble: 60,660 x 3
   gene_id           gene_type                          gene_name
   <chr>             <chr>                              <chr>
 1 ENSG00000223972.5 transcribed_unprocessed_pseudogene DDX11L1
 2 ENSG00000227232.5 unprocessed_pseudogene             WASH7P
 3 ENSG00000278267.1 miRNA                              MIR6859-1
 4 ENSG00000243485.5 lncRNA                             MIR1302-2HG
 5 ENSG00000284332.1 miRNA                              MIR1302-2
 6 ENSG00000237613.2 lncRNA                             FAM138A
 7 ENSG00000268020.3 unprocessed_pseudogene             OR4G4P
 8 ENSG00000240361.2 transcribed_unprocessed_pseudogene OR4G11P
 9 ENSG00000186092.6 protein_coding                     OR4F5
10 ENSG00000238009.6 lncRNA                             AL627309.1
# … with 60,650 more rows
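Note that the gene_id values in this table carry a version suffix (.5, .1, and so on). If the IDs you want to look up are unversioned, an exact match against gene_id will find nothing. Here is a minimal sketch of one way around that, using a small toy table in place of the real import. The names gtf_example, gene_id_unversioned and unversioned_query are made up for illustration, and the sketch assumes that everything from the first dot onwards is the version.

```r
library(dplyr)

# Toy stand-in for the imported table (rows copied from the output above).
gtf_example <- tibble(
  gene_id   = c("ENSG00000223972.5", "ENSG00000227232.5", "ENSG00000186092.6"),
  gene_type = c("transcribed_unprocessed_pseudogene",
                "unprocessed_pseudogene",
                "protein_coding"),
  gene_name = c("DDX11L1", "WASH7P", "OR4F5")
)

# Drop everything from the first dot onwards to get an unversioned ID.
gtf_example <- gtf_example %>%
  mutate(gene_id_unversioned = sub("\\..*$", "", gene_id))

# Hypothetical query IDs without version suffixes.
unversioned_query <- c("ENSG00000186092", "ENSG00000227232")

# Match on the unversioned column instead of gene_id.
hits <- filter(gtf_example, gene_id_unversioned %in% unversioned_query)
hits
```

The original gene_id column is kept, so you can still see which version the annotation refers to.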
Let's say that you have a vector of gene IDs that you want to get the information for.
genes <- sample(gtf$gene_id, 5)
> genes
[1] "ENSG00000287105.1" "ENSG00000254060.1" "ENSG00000271538.6"
[4] "ENSG00000148399.13" "ENSG00000234648.1"
You can simply filter the imported data using this vector.
> filter(gtf, gene_id %in% genes)
# A tibble: 5 x 3
  gene_id            gene_type            gene_name
  <chr>              <chr>                <chr>
1 ENSG00000271538.6  lncRNA               LINC02427
2 ENSG00000287105.1  lncRNA               AC090577.1
3 ENSG00000254060.1  lncRNA               AC022778.1
4 ENSG00000148399.13 protein_coding       DPH7
5 ENSG00000234648.1  processed_pseudogene AL162151.2
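The filter returns rows in the order they appear in the annotation, not in the order of your vector, and it silently drops IDs that are not found. If either of those matters, a left join from the query vector is one alternative. The sketch below uses a toy table again; gtf_example, query, annotated and missing_ids are illustrative names, and the last query ID is deliberately made up so that it has no match.

```r
library(dplyr)

# Toy stand-in for the imported table (rows copied from the output above).
gtf_example <- tibble(
  gene_id   = c("ENSG00000271538.6", "ENSG00000148399.13", "ENSG00000234648.1"),
  gene_type = c("lncRNA", "protein_coding", "processed_pseudogene"),
  gene_name = c("LINC02427", "DPH7", "AL162151.2")
)

# Query vector; the last ID is invented to demonstrate a miss.
query <- c("ENSG00000148399.13", "ENSG00000271538.6", "ENSG00000000000.0")

# left_join keeps one row per query ID, in query order;
# IDs without a match get NA in the annotation columns.
annotated <- tibble(gene_id = query) %>%
  left_join(gtf_example, by = "gene_id")

# IDs that were not found in the annotation.
missing_ids <- annotated %>%
  filter(is.na(gene_name)) %>%
  pull(gene_id)

annotated
missing_ids
```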
Since you have not provided any example IDs, I can't check, but I suggest taking a look at RNAcentral.
Thanks a lot!! I'll check it out :)