6.9 years ago
kirannbishwa01
I am trying to lift over a GFF/GTF file using a chain file generated from a VCF. I am using CrossMap as well as liftOver from UCSC.
liftOver isn't working. CrossMap gives me an output, but it reports failures for some features. Looking at those positions in the chain file and the original GFF file, I cannot see any pattern that explains why the fail message is raised.
Using liftOver:
$ /home/priyanka/liftOver/liftOver short-GFF.gff BcfOut/MA625-left.chain BcfOut/MA625-left02.gff -errorHelp
Deleted in new:
Sequence intersects no chains
Partially deleted in new:
Sequence insufficiently intersects one chain
Split in new:
Sequence insufficiently intersects multiple chains
Duplicated in new:
Sequence sufficiently intersects multiple chains
Boundary problem:
Missing start or end base in an exon
Using CrossMap:
python /home/priyanka/CrossMap-0.2.7/bin/CrossMap.py gff BcfOut/MA625-left.chain short-GFF.gff > BcfOut/MA625-left.gff
Observed Output:
1 version-2 gene 1 2541 0.43 - . ID=AL1G10010;Name=AL1G10010;Note=Protein_Coding_gene fail (multpile match to target assembly)
1 version-2 transcript 1 2541 0.43 - . ID=AL1G10010.t1;Parent=AL1G10010 fail (multpile match to target assembly)
1 version-2 intron 62 144 0.96 - . ID=AL1G10010.t1;Parent=AL1G10010 -> 1 version-2 intron 62 144 0.96 - . ID=AL1G10010.t1;Parent=AL1G10010
1 version-2 intron 253 406 1 - . ID=AL1G10010.t1;Parent=AL1G10010 -> 1 version-2 intron 253 406 1 - . ID=AL1G10010.t1;Parent=AL1G10010
1 version-2 intron 783 1422 1 - . ID=AL1G10010.t1;Parent=AL1G10010 fail (multpile match to target assembly)
1 version-2 intron 1643 1802 1 - . ID=AL1G10010.t1;Parent=AL1G10010 -> 1 version-2 intron 1641 1800 1 - . ID=AL1G10010.t1;Parent=AL1G10010
1 version-2 intron 2036 2123 1 - . ID=AL1G10010.t1;Parent=AL1G10010 -> 1 version-2 intron 2034 2121 1 - . ID=AL1G10010.t1;Parent=AL1G10010
1 version-2 intron 2348 2443 1 - . ID=AL1G10010.t1;Parent=AL1G10010 -> 1 version-2 intron 2346 2441 1 - . ID=AL1G10010.t1;Parent=AL1G10010
1 version-2 CDS 1 61 0.75 - 0 ID=AL1G10010.t1;Parent=AL1G10010 -> 1 version-2 CDS 1 61 0.75 - 0 ID=AL1G10010.t1;Parent=AL1G10010
1 version-2 exon 1 61 . - . ID=AL1G10010.t1;Parent=AL1G10010 -> 1 version-2 exon 1 61 . - . ID=AL1G10010.t1;Parent=AL1G10010
Any idea why this is failing?
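To look for a pattern, one option is to list which ungapped chain blocks each GFF feature overlaps, since a feature that spans more than one block cannot be mapped as a single interval. Below is a minimal sketch, assuming the standard UCSC chain layout (a `chain` header line followed by `size dt dq` lines). The function names `chain_blocks` and `overlapping_blocks` and the toy chain are hypothetical and only for illustration; the toy chain is not the content of MA625-left.chain.

```python
def chain_blocks(lines):
    """Return ungapped blocks as (src_name, src_start, src_end,
    dst_name, dst_start, dst_end), 0-based half-open.
    Coordinates are reported as stored in the file; for a '-' query
    strand they refer to the reverse-complemented sequence."""
    blocks = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # blank line separates chains
        if fields[0] == "chain":
            # header: chain score tName tSize tStrand tStart tEnd
            #         qName qSize qStrand qStart qEnd id
            src_name, src_pos = fields[2], int(fields[5])
            dst_name, dst_pos = fields[7], int(fields[10])
            continue
        size = int(fields[0])
        blocks.append((src_name, src_pos, src_pos + size,
                       dst_name, dst_pos, dst_pos + size))
        if len(fields) == 3:
            # skip the gap on each side before the next block
            src_pos += size + int(fields[1])
            dst_pos += size + int(fields[2])
    return blocks


def overlapping_blocks(blocks, chrom, start, end):
    """Blocks overlapping a GFF feature (1-based, inclusive coordinates)."""
    s, e = start - 1, end  # convert to 0-based half-open
    return [b for b in blocks if b[0] == chrom and b[1] < e and b[2] > s]


# Toy chain (invented): a 2 bp gap in the source after position 1500.
toy_chain = [
    "chain 1000 1 3000 + 0 3000 1 2998 + 0 2998 1",
    "1500 2 0",
    "1498",
]
blocks = chain_blocks(toy_chain)
for feature in [(62, 144), (1, 2541)]:
    hits = overlapping_blocks(blocks, "1", *feature)
    print(feature, "overlaps", len(hits), "block(s)")
```

Running the same two functions on the real chain file (for example `chain_blocks(open("BcfOut/MA625-left.chain"))`) and on the failing GFF intervals would show whether each failing feature spans a gap between blocks.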