I need to index a GTF (gene transfer format) annotation file
9 weeks ago
doodle • 0

I created a GTF file for HLA alleles to be used as a resource for GATK Funcotator. Running Funcotator without indexing the GTF gives this error:

A USER ERROR has occurred: Input funcotator_dataSources.v1.7.20200521s/gencode/hla/hla.annotation.gtf must support random access to enable queries by interval. If it's a file, please index it using the bundled tool IndexFeatureFile

The first few lines of the GTF file:

hla_a_01_01_01_01   IMGHLA  gene    1   3503    .   +   .   gene_id "hla_a_01_01_01_01"; gene_name "hla_a_01_01_01_01"; source "IMGHLA";
hla_a_01_01_01_01   IMGHLA  transcript  1   3503    .   +   .   gene_id "hla_a_01_01_01_01"; transcript_id "hla_a_01_01_01_01.1"; gene_name "hla_a_01_01_01_01"; transcript_name "hla_a_01_01_01_01.1";
hla_a_01_01_01_01   IMGHLA  exon    301 373 .   +   .   gene_id "hla_a_01_01_01_01"; transcript_id "hla_a_01_01_01_01.1"; gene_name "hla_a_01_01_01_01"; transcript_name "hla_a_01_01_01_01.1"; exon_number "1"; exon_id "hla_a_01_01_01_01_e_1";
hla_a_01_01_01_01   IMGHLA  exon    504 773 .   +   .   gene_id "hla_a_01_01_01_01"; transcript_id "hla_a_01_01_01_01.1"; gene_name "hla_a_01_01_01_01"; transcript_name "hla_a_01_01_01_01.1"; exon_number "2"; exon_id "hla_a_01_01_01_01_e_2";
hla_a_01_01_01_01   IMGHLA  exon    1015    1290    .   +   .   gene_id "hla_a_01_01_01_01"; transcript_id "hla_a_01_01_01_01.1"; gene_name "hla_a_01_01_01_01"; transcript_name "hla_a_01_01_01_01.1"; exon_number "3"; exon_id "hla_a_01_01_01_01_e_3";
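
To sanity-check records like the ones above, here is a minimal Python sketch that splits one line into the nine GTF columns and collects the attributes into a dict. It assumes the real file is tab-separated (the spacing shown here may just be how the post renders), and `parse_gtf_line` is a hypothetical helper name, not part of GATK. It only checks basic structure and does not reproduce whatever validation IndexFeatureFile applies.

```python
import re

# The nine standard GTF columns, in order.
GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attributes"]

# One attribute: a key, then a quoted or bare value, then a semicolon.
ATTR_RE = re.compile(r'\s*(\S+)\s+(?:"([^"]*)"|([^";\s]+))\s*;')


def parse_gtf_line(line):
    """Split one GTF record into its columns and an attribute dict.

    Hypothetical helper for illustration; raises ValueError on
    lines that do not have nine tab-separated columns.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(
            f"expected 9 tab-separated columns, found {len(fields)}")

    record = dict(zip(GTF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    if record["start"] > record["end"]:
        raise ValueError("start is greater than end")

    # Collect key/value pairs; quoted and unquoted values both work.
    attrs = {}
    for match in ATTR_RE.finditer(record["attributes"]):
        key = match.group(1)
        quoted, bare = match.group(2), match.group(3)
        attrs[key] = quoted if quoted is not None else bare
    record["attributes"] = attrs
    return record
```

Running this over every non-comment line shows whether any record has the wrong number of columns (for example, spaces where tabs are expected) and which attribute keys each feature actually carries.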

I need to index this file before running Funcotator. I tried the GATK IndexFeatureFile tool, as the Funcotator message suggests, but it fails with this error:

A USER ERROR has occurred: Unknown file is malformed: Decoded feature is not valid: hla_a_01_01_01_01   IMGHLA  gene    1   3503    .   +   .   gene_id "hla_a_01_01_01_01"; gene_name "hla_a_01_01_01_01";  source "IMGHLA";
hla_a_01_01_01_01   IMGHLA  transcript  1   3503    .   +   .   gene_id "hla_a_01_01_01_01"; transcript_id "hla_a_01_01_01_01.1"; gene_name "hla_a_01_01_01_01"; transcript_name "hla_a_01_01_01_01.1";
hla_a_01_01_01_01   IMGHLA  exon    301 373 .   +   .   gene_id "hla_a_01_01_01_01"; transcript_id "hla_a_01_01_01_01.1"; gene_name "hla_a_01_01_01_01"; transcript_name "hla_a_01_01_01_01.1"; exon_number 1; exon_id "hla_a_01_01_01_01_e_1";
hla_a_01_01_01_01   IMGHLA  exon    504 773 .   +   .   gene_id "hla_a_01_01_01_01"; transcript_id "hla_a_01_01_01_01.1"; gene_name "hla_a_01_01_01_01"; transcript_name "hla_a_01_01_01_01.1"; exon_number 2; exon_id "hla_a_01_01_01_01_e_2";
hla_a_01_01_01_01   IMGHLA  exon    1015    1290    .   +   .   gene_id "hla_a_01_01_01_01"; transcript_id "hla_a_01_01_01_01.1"; gene_name "hla_a_01_01_01_01"; transcript_name "hla_a_01_01_01_01.1"; exon_number 3; exon_id "hla_a_01_01_01_01_e_3";

Can someone suggest a solution for this or an alternate tool to index the GTF file?

GATK Funcotator GTF HLA-typing • 510 views