Are there any alternatives to Liftoff - Mapping annotations (GFF/GTF) between assemblies
3.2 years ago
VenGeno ▴ 100

Hi,

I am annotating closely related accessions (varieties) using a reference assembly (please note that I am using only one region, which is why you don't see chromosome info). I really liked Liftoff (ver 1.6.1, bioconda installation); however, it only picked up gene features from the reference GFF3 file, which contains many other features, including CDS, mRNA, etc. The command I used:

liftoff -g SALTOL_Nip.gff -o BG94_1.gff3 BG94_1.fasta SALTOL_Nip.fasta

My original GFF3 file looks as follows:

##gff-version 3
##source-version geneious 2021.2.2
##sequence-region   SALTOL  1   5800001
SALTOL  Geneious    source  1   5800001 .   +   .   Name=source Oryza sativa Japonica Group
SALTOL  Geneious    misc_feature    1   1   .   +   .   Name=misc
SALTOL  Geneious    misc_feature    2   100001  .   +   .   Name=misc
SALTOL  Geneious    misc_feature    100002  200001  .   +   .   Name=misc
SALTOL  Geneious    gene    1   1349    .   +   .   Name=Os01g0293800 gene
SALTOL  Geneious    gene    114140  118531  .   -   .   Name=Os01g0295900 gene
SALTOL  Geneious    gene    105528  108078  .   -   .   Name=Os01g0295700 gene
SALTOL  Geneious    gene    102152  104528  .   -   .   Name=Os01g0295600 gene

and the output I am getting is:

Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    1   1345    .   +   .   ID=gene_1;Name=Os01g0293800 gene;coverage=0.997;sequence_ID=0.982;extra_copy_number=0;copy_num_ID=gene_1_0
Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    1623    3128    .   -   .   ID=gene_6;Name=Os01g0293900 gene;coverage=0.999;sequence_ID=0.968;extra_copy_number=0;copy_num_ID=gene_6_0
Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    20379   21605   .   -   .   ID=gene_7;Name=Os01g0294500 gene;coverage=0.999;sequence_ID=0.995;extra_copy_number=0;copy_num_ID=gene_7_0
Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    48673   50214   .   -   .   ID=gene_5;Name=Os01g0294700 gene;coverage=1.0;sequence_ID=0.995;extra_copy_number=0;copy_num_ID=gene_5_0
Bg_94-1_CX35|chr01_10700000_16500000    Liftoff gene    102125  104501  .   -   .   ID=gene_4;Name=Os01g0295600 gene;coverage=1.0;sequence_ID=0.992;extra_copy_number=0;copy_num_ID=gene_4_0

I raised the issue on GitHub a few days ago and am still waiting for an answer. Are there any other tools you would recommend for this purpose? Thanks in advance!

annotations mapping GTF GFF3 • 3.9k views
ADD COMMENT

Looking at the liftoff documentation:

By default, 'gene' features and all child features of genes (i.e. transcripts, mRNA, exons, CDS, UTRs) will be lifted over

it seems you should get what you expect. I wonder: Is your original GFF correctly formatted? In particular, are the children of a gene correctly linked to their parents?
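A quick way to check the parent/child linking is to scan the attribute column for `ID` and `Parent` tags. The following is a minimal sketch in plain Python, not part of Liftoff; it assumes a tab-separated GFF3 file, and the `mRNA` line in the example is hypothetical (only the `gene` line comes from the question). The function names are made up for illustration.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict of key -> value."""
    attributes = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attributes[key.strip()] = value.strip()
    return attributes


def check_parent_links(gff_lines):
    """Return (types of features lacking an ID, (type, parent) pairs whose parent ID is unknown)."""
    rows = []
    known_ids = set()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and directives/comments
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        attributes = parse_attributes(columns[8])
        rows.append((columns[2], attributes))
        if "ID" in attributes:
            known_ids.add(attributes["ID"])

    missing_id = [ftype for ftype, attrs in rows if "ID" not in attrs]
    orphans = []
    for ftype, attrs in rows:
        # A feature may list several parents separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent and parent not in known_ids:
                orphans.append((ftype, parent))
    return missing_id, orphans


example = [
    "##gff-version 3",
    # gene line as posted in the question: Name only, no ID
    "SALTOL\tGeneious\tgene\t1\t1349\t.\t+\t.\tName=Os01g0293800 gene",
    # hypothetical child line pointing at an ID that no feature declares
    "SALTOL\tGeneious\tmRNA\t1\t1349\t.\t+\t.\tID=mrna1;Parent=gene1",
]
missing_id, orphans = check_parent_links(example)
print("features without ID:", missing_id)
print("children with unknown parent:", orphans)
```

In the GFF3 excerpt posted in the question, the gene lines carry only a `Name` attribute, so a check like this would flag them as lacking an `ID` that child features could refer to.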

ADD REPLY
3.2 years ago

You could try the original tool that inspired it: liftOver https://genome.ucsc.edu/cgi-bin/hgLiftOver

ADD COMMENT
3.2 years ago
Juke34 8.9k

Be careful: Liftoff is not an annotation tool!
Actually, in version 1.6 they added an option for polishing exons/CDS, so it is fine now!

See here https://github.com/agshumate/Liftoff/issues/7 In that thread I talk about other alternatives.
MAKER is also a nice one to perform liftover annotation.

What you might do using AGAT is copy the information from the 3rd column into an attribute (9th column), and then modify the 3rd column so that it uses only feature types accepted by Liftoff. Once the liftover is done, you can paste the info back into the 3rd column.
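The stash-and-restore idea can be sketched in plain Python as follows. This is not AGAT and not the author's method, only an illustration of the round trip; the attribute key `original_feature_type`, the placeholder type `gene`, and both function names are assumptions, and it also assumes the lifted file carries the extra attribute through unchanged.

```python
def stash_feature_type(line, accepted_type="gene", key="original_feature_type"):
    """Copy column 3 into a column-9 attribute and overwrite column 3."""
    columns = line.rstrip("\n").split("\t")
    if line.startswith("#") or len(columns) != 9:
        return line.rstrip("\n")  # leave directives and odd lines untouched
    tag = key + "=" + columns[2]
    if columns[8] in ("", "."):
        columns[8] = tag
    else:
        columns[8] = columns[8].rstrip(";") + ";" + tag
    columns[2] = accepted_type
    return "\t".join(columns)


def restore_feature_type(line, key="original_feature_type"):
    """Move the stashed type back into column 3 and drop the attribute."""
    columns = line.rstrip("\n").split("\t")
    if line.startswith("#") or len(columns) != 9:
        return line.rstrip("\n")
    kept = []
    for field in columns[8].split(";"):
        if field.startswith(key + "="):
            columns[2] = field.split("=", 1)[1]
        elif field:
            kept.append(field)
    columns[8] = ";".join(kept) if kept else "."
    return "\t".join(columns)


# Feature line taken from the question, written with tab separators.
original = "SALTOL\tGeneious\tmisc_feature\t2\t100001\t.\t+\t.\tName=misc"
stashed = stash_feature_type(original)
restored = restore_feature_type(stashed)
print(stashed)
print(restored)
```

Applying `stash_feature_type` to every line before the liftover and `restore_feature_type` to every line of the lifted file afterwards gives the round trip described above.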

ADD COMMENT

Hi Jacques,

A small clarification: you mean using agat_sp_manage_attributes.pl from AGAT, right? Just making sure. Thank you 🙏

ADD REPLY

I think it is not possible with the script you mention.

It will be a bit more complex. Something like this: use agat_convert_sp_gff2tsv.pl to create a TSV. Copy the ID column along with the 3rd column from the output TSV file into a new TSV that will be used as the attribute input, and add the header "ID original_feature_type". Then use agat_sq_add_attributes_from_tsv.pl to add the information from the attribute input TSV to your original GFF file.
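For illustration, the intermediate attribute table can also be built directly from the GFF in Python. This is a sketch under assumptions, not the AGAT route itself: it assumes every feature of interest has an `ID` attribute (lines without one are skipped), the example lines and their IDs are hypothetical, and the tab-separated layout with an `ID` header column is a guess at what agat_sq_add_attributes_from_tsv.pl expects, so check the script's help before relying on it.

```python
import csv
import io


def feature_type_rows(gff_lines):
    """Collect (ID, feature type) pairs from tab-separated GFF3 lines."""
    rows = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue
        for field in columns[8].split(";"):
            if field.startswith("ID="):
                rows.append((field[3:], columns[2]))
                break  # one ID per feature line
    return rows


def write_attribute_tsv(rows, handle):
    """Write the header and rows as a tab-separated table."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["ID", "original_feature_type"])
    writer.writerows(rows)


# Hypothetical feature lines with IDs added.
gff = [
    "##gff-version 3",
    "SALTOL\tGeneious\tgene\t1\t1349\t.\t+\t.\tID=gene1;Name=Os01g0293800 gene",
    "SALTOL\tGeneious\tmisc_feature\t2\t100001\t.\t+\t.\tID=misc1;Name=misc",
]
buffer = io.StringIO()
write_attribute_tsv(feature_type_rows(gff), buffer)
tsv_text = buffer.getvalue()
print(tsv_text, end="")
```

To write a real file, pass a handle opened with `open(path, "w", newline="")` instead of the `io.StringIO` buffer.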

ADD REPLY

Thank you, Jacques! I will do that.

ADD REPLY
3.2 years ago
Juke34 8.9k

I found the holy grail of liftover tools:

nf-LO: A Scalable, Containerized Workflow for Genome-to-Genome Lift Over

Edit GenoMax (Jan 2024): Original link provided by Juke34 https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-6569-1 is pointing to a different paper. Replaced with correct link.

ADD COMMENT

GenoMax Thank you!

ADD REPLY

Thanks, Jacques! I will check it out. I got the issue resolved. Got a little busy and I will post it here as well.

ADD REPLY

Hi VenGeno,

When you use Liftoff with the option -mm2_options ="-a --end-bonus 5 --eqx -N 50 -p 0.9", do you get an error? I get

ERROR: failed to open file '=-a': No such file or directory

I have tried installing with conda as well as with pip, and both times I get the same error. The author seems to be unresponsive. Did you or anyone else face the same problem? If so, how did you fix it? I am using v1.6.3, the latest version.

I found the defaults of -mm2_options hardcoded in run_liftoff.py, but I can neither change them nor find where the code is breaking, so I am reaching out to a wider audience.

Thanks
Abhijit

ADD REPLY
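One possible cause of the error above is the space before `=` in the command line. The sketch below uses Python's `shlex` to show how a POSIX shell would tokenize the option as typed versus with the space removed. The `-g ref.gff` part and the file name are hypothetical, and the last step assumes that Liftoff splits the option string on whitespace before passing it to minimap2, which has not been verified here.

```python
import shlex

# As typed in the post: note the space between -mm2_options and =
typed = 'liftoff -mm2_options ="-a --end-bonus 5 --eqx -N 50 -p 0.9" -g ref.gff'
# Same command with the space removed
fixed = 'liftoff -mm2_options="-a --end-bonus 5 --eqx -N 50 -p 0.9" -g ref.gff'

typed_args = shlex.split(typed)
fixed_args = shlex.split(fixed)

# With the space, the option value is a separate token that begins with "=".
value = typed_args[2]
# Assumption: the value is later split on whitespace and handed to minimap2,
# so its first piece would look like a file name rather than a flag.
first_piece = value.split()[0]

print(typed_args)
print(fixed_args)
print(first_piece)
```

Here `first_piece` is `=-a`, which matches the name in the reported `failed to open file '=-a'` message, so writing the option without a space before `=` is worth trying.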
