Here's my data:
sample_A: Canonical assembly with gene models (sample_A.fasta, sample_A.gff3)
sample_B: Mutant, de novo assembly with no gene models (sample_B.fasta)
I want to transfer the gene models from sample_A to sample_B.
I thought this would be straightforward, but it's definitely not. There are cases where exon_2 maps before exon_1, and cases where a particular exon maps to multiple locations on the de novo assembly.
Is there a tool that will do this? Ideally, I would like a tool that does the following:
program --ref_assembly sample_A.fasta --ref_annotations sample_A.gff3 --query_assembly sample_B.fasta --percent_identity 0.98 > sample_B.gff3
Here is an example of an edge case from mapping the exons of transcript FUN_000463-T1 (from sample_A.gff3 and sample_A.fasta) to the new assembly (sample_B.fasta). Notice the exon ordering:
Here's the left side zoomed in:
Here's the right side zoomed in:
Notice the exon ordering.
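The two problem cases described above (exons out of order, and an exon mapping more than once) can be flagged automatically once the exon hits are in hand. Below is a minimal sketch, assuming the hits have already been parsed from whatever aligner was used into simple records. The `ExonHit` record, the `check_transcript` function, the contig name, and all coordinates are hypothetical and only for illustration; this is not the output format of any particular tool.

```python
from collections import defaultdict
from typing import NamedTuple


class ExonHit(NamedTuple):
    """One alignment of a reference exon onto the new assembly (hypothetical record)."""
    transcript: str   # e.g. "FUN_000463-T1"
    exon_number: int  # 1-based rank of the exon in the reference transcript
    contig: str       # sequence name in sample_B.fasta
    start: int        # 1-based start of the hit
    end: int          # inclusive end of the hit
    strand: str       # "+" or "-"


def check_transcript(hits):
    """Return a list of problems found among the hits of one transcript."""
    problems = []

    # Group hits by exon so multi-mapping exons are easy to spot.
    by_exon = defaultdict(list)
    for hit in hits:
        by_exon[hit.exon_number].append(hit)

    for number in sorted(by_exon):
        if len(by_exon[number]) > 1:
            problems.append(f"exon_{number} maps {len(by_exon[number])} times")

    # Judge ordering only from uniquely mapped exons, in reference order.
    unique = [by_exon[n][0] for n in sorted(by_exon) if len(by_exon[n]) == 1]
    if len({(h.contig, h.strand) for h in unique}) > 1:
        problems.append("exons split across contigs or strands")
        return problems

    if unique:
        starts = [h.start for h in unique]
        # On the minus strand, exon_1 is expected at the highest coordinate.
        expected = sorted(starts, reverse=(unique[0].strand == "-"))
        if starts != expected:
            problems.append("exon order differs from reference")

    return problems


# Made-up coordinates: exon_2 lands upstream of exon_1 on the plus strand.
example_hits = [
    ExonHit("FUN_000463-T1", 1, "contig_1", 5000, 5200, "+"),
    ExonHit("FUN_000463-T1", 2, "contig_1", 1000, 1150, "+"),
]
for problem in check_transcript(example_hits):
    print(problem)
```

A check like this does not transfer the gene models; it only separates transcripts that mapped cleanly from those that need manual inspection.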
You can try RATT. Success will depend on the quality of your assemblies.
Thank you. I'm looking at it right now and it's pretty confusing to run: https://vcru.wisc.edu/simonlab/bioinformatics/programs/ratt/Documentation.html. I installed it with conda, but it appears a lot of the files aren't there. I also found this tutorial: http://avrilomics.blogspot.com/2013/02/using-ratt-to-transfer-gene-predictions.html
Do you know of any other tools for this? I've heard of liftover, but there is little documentation on using it with a new organism.
I've updated my question a bit to be more specific.