minimap2 alignment for maker
3.8 years ago
gilsorek12 ▴ 10

I used minimap2 to align a de novo transcriptome file to a reference genome.

With samtools I converted the minimap2 output to bed, then wrote my own script to create the gff that is passed to maker through est_gff.

According to this maker-devel topic (https://groups.google.com/g/maker-devel/c/2j9NWwl-4xY), the alignment gff from minimap2 needs to follow the alignment format used by GFF3 (i.e. match/match_part).

I ran three test runs of maker to check the gff I created from minimap2:

  1. I included protein sequences (fasta) and mRNA sequences, without the transcriptome, as a baseline for test no. 2.
  2. I included both protein and mRNA sequences and provided the est_gff that was created from minimap2.
  3. I included all sequences (proteins, mRNA, transcriptome) and let maker use BLAST for all alignments.

When I compared the final gff files from tests 1 and 2, the results were identical. I checked whether the est_gff input was present in the test 2 output, and the file did contain alignments from minimap2:

scaffold15014-5 est_gff:minimap2    expressed_sequence_match    275244  275456  1000    +   .   ID=scaffold15014-5:hit:10067:3.12.0.2;Name=TRINITY_DN110156_c0_g2_i1;score=1000
scaffold15014-5 est_gff:minimap2    match_part  275244  275456  1000    +   .   ID=scaffold15014-5:hsp:16084:3.12.0.2;Parent=scaffold15014-5:hit:10067:3.12.0.2;Target=TRINITY_DN110156_c0_g2_i1 1 213 +;Gap=M213
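
A check like the one described above can be sketched in Python by counting features per source and type (GFF columns 2 and 3). This is only a minimal sketch: the function name `count_features` and the file name in the usage comment are hypothetical, and it assumes a tab-separated GFF3 file in which any embedded sequence follows a `##FASTA` line.

```python
from collections import Counter


def count_features(gff_lines):
    """Count GFF3 features by (source, type).

    Comment lines, blank lines and anything after a ##FASTA directive
    are ignored. Lines without 9 tab-separated columns are skipped.
    """
    counts = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # the rest of the file is sequence, not features
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        counts[(cols[1], cols[2])] += 1
    return counts


# Hypothetical usage (the file name is an assumption):
# with open("test2.all.gff") as handle:
#     for (source, ftype), n in sorted(count_features(handle).items()):
#         print(source, ftype, n, sep="\t")
```

If rows with the source `est_gff:minimap2` are counted here but the gene models do not change, the alignments were read and passed through, which is a separate question from whether they were used as evidence.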

I think this means that maker did not reject the format I provided, but for some reason it did not use the alignments as hints for its annotation predictions. The minimap2 gff I provided to maker through est_gff looks like this:

scaffold15014-5 minimap2    expressed_sequence_match    103440  103740  1000    +   .   ID=scaffold15014-5:TRINITY_DN55863_c2_g1_i1:hit:103440-103740;Name=TRINITY_DN55863_c2_g1_i1
scaffold15014-5 minimap2    match_part  103440  103595  1000    +   .   ID=scaffold15014-5:TRINITY_DN55863_c2_g1_i1:hit:103440-103740:hsp:1;Parent=scaffold15014-5:TRINITY_DN55863_c2_g1_i1:hit:103440-103740;Target=TRINITY_DN55863_c2_g1_i1 1 156;
scaffold15014-5 minimap2    match_part  103635  103740  1000    +   .   ID=scaffold15014-5:TRINITY_DN55863_c2_g1_i1:hit:103440-103740:hsp:2;Parent=scaffold15014-5:TRINITY_DN55863_c2_g1_i1:hit:103440-103740;Target=TRINITY_DN55863_c2_g1_i1 157 262;

Thanks for consideration and help.

3.8 years ago
Juke34 8.9k

Your GFF format is wrong: there are problems in the Parent/ID relationships. I advise you to use agat_convert_minimap2_bam2gff.pl from AGAT to create your gff file.
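
For anyone who wants to inspect Parent/ID relationships in a hand-made gff before running maker, a rough Python sketch follows. The function names are hypothetical and this does not reproduce whatever AGAT or maker check internally. It only reports Parent values that point at no ID in the file, and IDs that occur on more than one line. Repeated IDs are legal in GFF3 for discontinuous features, so treat them as something to look at rather than as errors.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict of key -> raw value."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def check_parent_ids(gff_lines):
    """Return (repeated_ids, orphans) for an iterable of GFF3 lines.

    repeated_ids: sorted list of IDs seen on more than one feature line.
    orphans: list of (line_number, parent_id) whose Parent matches no ID.
    """
    seen_ids = set()
    repeated_ids = set()
    parents = []
    for number, line in enumerate(gff_lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence section, no more features
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        attrs = parse_attributes(cols[8])
        feature_id = attrs.get("ID")
        if feature_id is not None:
            if feature_id in seen_ids:
                repeated_ids.add(feature_id)
            seen_ids.add(feature_id)
        # A feature may list several parents separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                parents.append((number, parent))
    orphans = [(n, p) for n, p in parents if p not in seen_ids]
    return sorted(repeated_ids), orphans
```

Both return values being empty only shows that the file is internally consistent on this one point; it does not guarantee that maker will use the alignments.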


Thank you! The results were much better using AGAT.
