Hi, I was wondering if anyone knows how to merge two or more GFF files into a consensus GFF file? I have a GTF file from the TopHat/Cufflinks pipeline and a GFF file from a Velvet assembly, and I would like to merge these two with the already annotated GFF file. The idea is to have one GFF file instead of three, so that I can load it as a track in my genome browser.
Thanks in advance
One way to do this is to start by converting them to sorted UCSC BED with the BEDOPS tools gtf2bed and sort-bed. Here's a quick way to convert a bunch of files if you use a bash shell:
$ for i in *.gtf; \
do gtf2bed < "$i" > "$i.converted.bed"; \
done
Then do a multiset union set operation with BEDOPS bedops (the --everything option) on the converted files to make a single BED file called answer.bed that can be loaded into your genome browser instance.
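A minimal sketch of that union step, assuming the converted files from the loop above sit in the current directory and are sorted (gtf2bed sorts its output by default). The function name union_beds is made up for illustration, and the plain sort fallback is only an approximation for machines without BEDOPS installed:

```shell
# Multiset union of already-sorted BED files into a single BED file.
# Usage: union_beds OUTPUT.bed INPUT1.bed [INPUT2.bed ...]
# (union_beds is a hypothetical helper name, not part of BEDOPS.)
union_beds() {
    out=$1
    shift
    if command -v bedops >/dev/null 2>&1; then
        # --everything (short form: -u) emits every element of every input, in sorted order
        bedops --everything "$@" > "$out"
    else
        # Assumption: without BEDOPS, approximate the same result by sorting
        # all lines on chromosome, then numerically on start and end
        LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n -k3,3n "$@" > "$out"
    fi
}
```

It could be called as union_beds answer.bed *.converted.bed. Keep in mind that a multiset union keeps every element from every input, duplicates included, so this stacks the annotations into one track rather than resolving them into a single set of gene models.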
I have done a genome-guided assembly using StringTie, which also generated a GFF file. I also have a previously reported CDS GFF file. I want to merge both GFF files and extract consensus sequences from the genome.
Any reason to not simply load three separate files? Alternatively, you can simply concatenate them and sort by chromosome location.
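A rough sketch of the concatenate-and-sort route, assuming all inputs are tab-delimited and in the same flavour (all GFF3 or all GTF, since their attribute columns differ). The function name merge_gff and the file names in the usage line are placeholders:

```shell
# Concatenate GFF/GTF files and sort by sequence name, then start, then end.
# Usage: merge_gff OUTPUT.gff INPUT1.gff [INPUT2.gff ...]
# (merge_gff is a hypothetical helper name.)
merge_gff() {
    out=$1
    shift
    # Drop header/comment lines and blank lines, then sort on
    # column 1 (seqid), column 4 (start) and column 5 (end).
    # Any "##gff-version" header is dropped too and may need re-adding.
    cat "$@" \
        | grep -v -e '^#' -e '^$' \
        | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k4,4n -k5,5n > "$out"
}
```

For example: merge_gff merged.gff annotated.gff cufflinks.gff velvet.gff. This only interleaves the records by position; overlapping features from different sources are all kept as they are.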
No particular reason. I want to load all three files separately, along with a fourth file containing the consensus annotation, which users might find more convenient to work with.
Programs such as MAKER combine evidence for gene models.