Another way to sort a GFF is to use GenomeTools ("genometools") and run:
gt gff3 -sortlines input.gff > output.gff
This command also has a useful -tidy option that can clean up and validate very messy GFFs. It will rename the ID attributes though, unless -retainids is used.
This is similar to the method from http://www.htslib.org/doc/tabix.html, but it properly sets the tab delimiter on the sort command and avoids subshells.
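A minimal sketch of a sort with an explicit tab delimiter, not necessarily the exact command meant above. The function name sort_gff is made up for illustration, and in.gff / sorted.gff.gz are placeholder file names. Without -t, sort splits fields on runs of blanks, so a space inside one of the first columns could shift the start coordinate out of field 4.

```shell
# sort_gff FILE (hypothetical helper): print comment/header lines first,
# then feature lines sorted by sequence name (column 1) and numeric
# start position (column 4), splitting fields on tabs only.
sort_gff() {
    tab=$(printf '\t')     # a literal tab, portable across POSIX shells
    grep '^#' "$1"         # header lines, kept in their original order
    grep -v '^#' "$1" | LC_ALL=C sort -t "$tab" -k1,1 -k4,4n
}

# Usage (needs bgzip and tabix from htslib on the PATH):
#   sort_gff in.gff | bgzip > sorted.gff.gz
#   tabix -p gff sorted.gff.gz
```

LC_ALL=C is only there to make the ordering of sequence names independent of the locale; drop it if you prefer locale-aware sorting.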
Hey guys!
Sorry about the simplicity of the question! But I've been trying for hours to open a GFF in Artemis to check some gene models and I cannot!! The fasta and sorted bam files work well, but I cannot get the GFF to load at all!!
So, I tried to sort and index it with tabix, but it's not working!
The sort with the line below seems to go fine:
(grep ^"#" in.gff; grep -v ^"#" in.gff | sort -k1,1 -k4,4n) | bgzip > sorted.gff.gz
But then, when I try
tabix -p sorted.gff.gz
It does not work! It says:
[E::get_intv] failed to parse TBX_GENERIC, was wrong -p [type] used? The offending line was: "itr6_2569_ AUGUSTUS CDS 1416 1422 . + 0 ID=itr6_2569_pilon_pilon_pilon_pi.g1.t1.cds;Parent=itr6_2569_pilon_pilon_pilon_pi.g1.t1" Segmentation fault (core dumped)
My gff is a PASA update output:
file: # original
itr6_6049_ AUGUSTUS gene 18202 46612 . - . ID=itr6_6049_pi.g765;Name=itr6_6049_pi.g765.t1
itr6_6049_ AUGUSTUS mRNA 18202 46612 . - . ID=itr6_6049_pi.g765.t1;Parent=itr6_6049_pi.g765;Name=itr6_6049_ pi.g765.t1
itr6_6049_ AUGUSTUS exon 46565 46612 . - . ID=itr6_6049_pi.g765.t1.exon1;Parent=itr6_6049_pi.g765.t1
I would appreciate any help. This is really annoying me because it seems simple, I know =(
Thank you!
This is a new question, not a reply. Please open a new question.
Oh sorry, Pierre! I'll do that!
You should use "tabix -p gff filename.gff.gz", not just "tabix -p filename.gff.gz". The -p option takes a preset name (gff, bed, sam or vcf) before the file name.