Need help with AUGUSTUS and AGAT for gene prediction and feature extraction, respectively.
12 weeks ago
Vijith ▴ 90

I have two queries: one is regarding AUGUSTUS, and the second is about extracting sequences from the *.gff file for downstream BLASTx homology-based annotation.

  1. I ran AUGUSTUS using the command
    augustus [parameters] --species=SPECIES queryfilename > output.gff

without explicitly setting --alternatives-from-sampling=true. Will this affect the completeness of the downstream annotation?

  2. After completing the de novo gene prediction, I want to run BLASTx for homology-based annotation, alongside the evidence-based approach. However, I’m confused about extracting features from the *.gff output. In a previous faulty AUGUSTUS run, I used AGAT's **agat_sp_extract_sequences.pl** to extract sequences with the command:
agat_sp_extract_sequences.pl -g infile.gff -f infile.fasta -t gene

However, after reading more about its suitability for BLASTx, I realized this approach might include introns, UTRs, intergenic regions, etc. Therefore, I’m considering using the alternative command:

agat_sp_extract_sequences.pl -g infile.gff -f infile.fasta --mrna

or

agat_sp_extract_sequences.pl -g infile.gff -f infile.fasta -t cds

If you want details on what these commands extract, please take a look at this image.
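
Before choosing a value for -t, it may help to check which feature types the AUGUSTUS output actually contains, since -t refers to the feature type stored in the third column of the GFF. A minimal sketch, assuming a tab-separated GFF with nine columns; the function name count_feature_types is made up for illustration:

```shell
# Count how often each feature type (GFF column 3) occurs in a file.
# Comment lines (starting with '#') and lines with fewer than 9 columns are skipped.
count_feature_types() {
  awk -F'\t' '
    !/^#/ && NF >= 9 { n[$3]++ }                # tally by feature type
    END { for (t in n) print n[t] "\t" t }      # print "count<TAB>type"
  ' "$1" | LC_ALL=C sort -k2,2                  # stable ordering by type name
}
```

Calling `count_feature_types infile.gff` prints one line per feature type with its count, which shows at a glance whether CDS features are present.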

12 weeks ago
Juke34 9.0k

Since blastx translates the query sequence in all six reading frames and searches the translations against a protein database, it makes sense to extract only the sequence that is meant to be translated, i.e. the CDS.

So use the last command, i.e.:

agat_sp_extract_sequences.pl -g infile.gff -f infile.fasta -t cds
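
As a quick sanity check before running blastx, one could confirm that the extracted CDS sequences have lengths divisible by three. This is only a sketch: it assumes the AGAT output was saved as a FASTA file (cds.fasta in the usage line is a placeholder name), and the function name list_out_of_frame is made up. A length that is not a multiple of three is not necessarily an error, because AUGUSTUS can predict partial genes.

```shell
# Print the ID and length of every FASTA record whose sequence length
# is not a multiple of 3. Sequences may be wrapped over several lines.
list_out_of_frame() {
  awk '
    function report() { if (len % 3 != 0) print id "\t" len }
    /^>/ {                                   # header line: close the previous record
      if (id != "") report()
      id = substr($1, 2); len = 0; next      # ID = first word without ">"
    }
    { gsub(/[ \t\r]/, ""); len += length($0) }   # accumulate sequence length
    END { if (id != "") report() }           # do not forget the last record
  ' "$1"
}
```

Usage: `list_out_of_frame cds.fasta`. An empty result means every extracted sequence is a whole number of codons.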
Juke34, thank you so much for your valuable response. This helps.
