Extracting START and STOP codon positions from Augustus GFF
5.1 years ago
mhfk2901 ▴ 20

Hello everyone. This is my first time posting a question here. I am new to bioinformatics and not yet comfortable with the command line. I am trying to identify the positions of the START and STOP codons in my AUGUSTUS predictions, but every attempt to do so with grep has failed so far. I went through related questions, but I don't really understand the commands given in them.

Example of my GFF output:

# Predicted genes for sequence number 73 on both strands
# start gene g3
254 AUGUSTUS    gene    1   491 0.98    -   .   g3
254 AUGUSTUS    transcript  1   491 0.98    -   .   g3.t1
254 AUGUSTUS    intron  109 168 1   -   .   transcript_id "g3.t1"; gene_id "g3";
254 AUGUSTUS    CDS 1   108 0.98    -   1   transcript_id "g3.t1"; gene_id "g3";
254 AUGUSTUS    CDS 169 491 1   -   0   transcript_id "g3.t1"; gene_id "g3";
254 AUGUSTUS    start_codon 489 491 .   -   0   transcript_id "g3.t1"; gene_id "g3";
# protein sequence = [MSSRSLAALAVVGAVALCARSASASGVTSDTSGIAGQTYDYIVVGAGLAGTTVAARLAENSAISILLIEAGGDDRGNS
# QVYDIYEYAQAFNGPLDWAWQSDRGKVLHGGKTLGGSSSINGGHWTRGLNAQYDAMSSLLEDSEQ]
# end gene g3
###
#

If I could just extract the following line, that would be good enough for me:

254 AUGUSTUS    start_codon 489 491 .   -   0   transcript_id "g3.t1"; gene_id "g3";

augustus grep gff gff3 gene prediction • 2.2k views

You can simply use the grep command to extract the line:

grep "start_codon" file
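If you also want the STOP codons, or only a few columns, matching on the feature column is a bit stricter than a plain text search. Here is a minimal sketch with awk, assuming the GFF is tab-separated (as Augustus output normally is) and that the feature type sits in column 3. The file names augustus_sample.gff and codons.tsv are made up for illustration, and the two sample rows are copied from the question.

```shell
# Hypothetical sample file built from two rows of the question (tab-separated).
printf '254\tAUGUSTUS\tCDS\t169\t491\t1\t-\t0\ttranscript_id "g3.t1"; gene_id "g3";\n' > augustus_sample.gff
printf '254\tAUGUSTUS\tstart_codon\t489\t491\t.\t-\t0\ttranscript_id "g3.t1"; gene_id "g3";\n' >> augustus_sample.gff

# Skip comment lines, keep rows whose feature column (3) is start_codon or
# stop_codon, and print sequence, feature, start, end and strand.
awk -F'\t' -v OFS='\t' \
    '!/^#/ && ($3 == "start_codon" || $3 == "stop_codon") { print $1, $3, $4, $5, $7 }' \
    augustus_sample.gff > codons.tsv

cat codons.tsv
```

Note that gene g3 in the question has no stop_codon row at all, so only the start codon comes back for it. That may be because the prediction runs off the end of the sequence (the gene starts at position 1), but that is a guess from the coordinates.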

Yes, Prakash. I just realized how easy it was. Thank you!


This problem has been solved. Thank you.
