Replace gene names in a GFF file
5.0 years ago

I have a GFF file generated by BRAKER. It assigns default gene names like the following:

# start gene g1
CC151   AUGUSTUS    gene    5487 15014 0.36 -   .   g1
CC151   AUGUSTUS    transcript  5487 15014 0.36 -   .   g1.t1
CC151   AUGUSTUS    terminal    5487 5697 1 -   1   transcript_id "g1.t1"; gene_id "g1";
CC151   AUGUSTUS    intron  6385 6467 1 -   2   transcript_id "g1.t1"; gene_id "g1";
CC151   AUGUSTUS    intron  6550 6622 1 -   0   transcript_id "g1.t1"; gene_id "g1";
CC151   AUGUSTUS    intron  6714 6854 1 -   0   transcript_id "g1.t1"; gene_id "g1";
CC151   AUGUSTUS    CDS 6998 7110 1 -   2   transcript_id "g1.t1"; gene_id "g1";
CC151   AUGUSTUS    CDS 7888 7941 1 -   0   transcript_id "g1.t1"; gene_id "g1";
# end gene g1

I wish to change the default g1 to a custom name, for example "CC151_gene1". I created a list of all gene IDs and their replacement names (gene.replacement.txt) and tried the following:

g1   CC151_gene1
g2   CC151_gene2

grep -f gene.replacement.txt mygfffile.gff > replaced.gfffile.gff

However, my original file was not modified. Can anyone suggest a better method?

Thanks in advance.

annotation gff augustus genome gene • 3.2k views
5.0 years ago
Juke34 8.9k

I would suggest using agat_sp_manage_IDs.pl from AGAT. At the same time, it will standardize your output file, which is not correctly formatted (the 9th column should contain 'tag value' attributes, and that is not the case for the gene and transcript lines).
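
As a side note on the attempt in the question: grep only prints the lines that match a pattern and never rewrites text, so it cannot rename anything. If a plain renaming is all that is needed, a minimal Python sketch could look like the one below. It is not part of AGAT, and the function names are made up for illustration. It assumes the file is tab-separated with the IDs in column 9, that the IDs follow the g<number> and g<number>.t<number> pattern shown in the question, and that the mapping file has two whitespace-separated columns (old name, new name).

```python
import re

# Assumed ID shape: a gene ID such as g12, optionally followed by a
# transcript suffix such as .t1 (pattern taken from the example in the question).
ID_RE = re.compile(r'\b(g\d+)(\.t\d+)?\b')


def load_mapping(lines):
    """Parse 'old new' pairs, e.g. 'g1   CC151_gene1', into a dict."""
    mapping = {}
    for line in lines:
        fields = line.split()
        if len(fields) == 2:
            old, new = fields
            mapping[old] = new
    return mapping


def rename_ids(text, mapping):
    """Swap every whole gene ID in text, keeping any .tN suffix."""
    def swap(match):
        gene, suffix = match.group(1), match.group(2) or ''
        # IDs missing from the mapping are left unchanged.
        return mapping.get(gene, gene) + suffix
    return ID_RE.sub(swap, text)


def rename_line(line, mapping):
    """Rename IDs in comment lines and in column 9 of feature lines."""
    if line.startswith('#'):
        return rename_ids(line, mapping)
    newline = '\n' if line.endswith('\n') else ''
    cols = line.rstrip('\n').split('\t')
    if len(cols) < 9:
        return line  # not a feature line; leave it untouched
    cols[8] = rename_ids(cols[8], mapping)
    return '\t'.join(cols) + newline


def rename_file(mapping_path, gff_path, out_path):
    """Write a renamed copy of gff_path; the input file is never modified."""
    with open(mapping_path) as fh:
        mapping = load_mapping(fh)
    with open(gff_path) as src, open(out_path, 'w') as dst:
        for line in src:
            dst.write(rename_line(line, mapping))
```

Matching the whole ID before looking it up means g1 does not accidentally rewrite the start of g10 or g100, which is the usual trap with a naive search and replace. Note that this only renames; it does not fix the malformed gene and transcript lines.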


Hi Juke, thanks for suggesting AGAT. It does work, but the problem is that the names are too long. For instance, I have 28540 genes in total, but I get a gene name like M000000000001, which has 12 digits.
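
To illustrate the length I have in mind: with 28540 genes, 5 digits are enough. Below is a rough Python sketch that builds such a list of short names. It assumes the original IDs run consecutively from g1 to gN; the prefix "CC151_gene" and the function name are just my own choices for illustration.

```python
def make_names(n_genes, prefix="CC151_gene"):
    """Map g1..gN to prefixed names, zero-padded to the width N needs."""
    width = len(str(n_genes))  # 28540 genes -> 5 digits
    return {
        f"g{i}": f"{prefix}{i:0{width}d}"
        for i in range(1, n_genes + 1)
    }


names = make_names(28540)
print(names["g1"])      # CC151_gene00001
print(names["g28540"])  # CC151_gene28540
```

The zero padding is optional. It only keeps the names the same length so they sort in order; dropping it gives plain CC151_gene1 style names.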


Yes, I implemented it that way to follow what Ensembl does. Now that your file has been standardized by AGAT, you could use agat_sq_manage_ID.pl. (Do not use this script on your original file, because it expects a properly formatted GFF3. All scripts with the sq prefix need a proper GFF3 file.)


I have fixed this in AGAT version 0.1.0.
