How to identify unique genes between two GFF files
14 months ago
nikhil ▴ 20

I have two GFF files of the same species obtained from different annotation methods, and I want to identify unique genes by comparing both GFF files.

Thank you

Genes Uniq Annotation GFF • 1.3k views
14 months ago

Extract the genes from both files (e.g. awk '($3=="gene") {printf("%s\t%d\t%s\n",$1,int($4)-1,$5);}' ), sort them, and use bedtools intersect with the option -f.
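A minimal sketch of the extraction step, assuming tab-separated GFF input with the feature type in column 3. The function name gff_genes_to_bed and the file names a.gff and b.gff are placeholders, not something from the original answer. It converts the 1-based GFF start to a 0-based BED start and sorts by sequence name and start position.

```shell
# Hypothetical helper (the name is an assumption): print the "gene" features
# of a GFF file as sorted, 0-based, half-open BED intervals (chrom, start, end).
gff_genes_to_bed() {
    awk -F '\t' '($3 == "gene") { printf("%s\t%d\t%s\n", $1, int($4) - 1, $5); }' "$1" |
        sort -k1,1 -k2,2n
}

# Assumed usage, with a.gff and b.gff standing in for the two annotations:
# gff_genes_to_bed a.gff > afile.bed
# gff_genes_to_bed b.gff > bfile.bed
```

Only three columns are kept here, so gene IDs and strand are dropped. If they are needed downstream, extra columns could be printed from the GFF attributes and strand fields.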


Hi Pierre Lindenbaum,

After extracting the genes, I have 5846 genes in afile.bed and 4456 genes in bfile.bed. I then ran bedtools intersect -a afile.bed -b bfile.bed -wa -wb -f 0.50 > genes.bed

and got a count of 5800 genes. Can you explain this?

thank you


Can you explain this?

Yes. You should use an option like -u:

Write original A entry once if any overlaps found in B. In other words, just report the fact at least one overlap was found in B. Restricted by -f and -r.
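To illustrate the difference: with -wa -wb, bedtools intersect writes one line per overlapping A/B pair, so the line count is a count of pairs rather than of genes. A sketch of counting each gene once, and of listing the genes with no counterpart, could look like the following. The function names, the output file names, and the 0.50 default are assumptions. The -u, -v, and -f options are standard bedtools intersect options, and -f is measured as a fraction of the A interval.

```shell
# Hypothetical helpers; the names and the 0.50 default are assumptions.

# Genes in A with at least one qualifying overlap in B, each reported once.
shared_genes() {
    bedtools intersect -a "$1" -b "$2" -u -f "${3:-0.50}"
}

# Genes in A with no qualifying overlap in B, i.e. genes unique to A.
unique_genes() {
    bedtools intersect -a "$1" -b "$2" -v -f "${3:-0.50}"
}

# Assumed usage with the BED files from this thread:
# shared_genes afile.bed bfile.bed | wc -l
# unique_genes afile.bed bfile.bed > unique_to_a.bed
# unique_genes bfile.bed afile.bed > unique_to_b.bed
```

Because -f is relative to A, swapping the two files applies the threshold to the genes of the other annotation, so the two "unique" lists are not simply complements of one shared set.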


Yes, got it. Thank you very much.

14 months ago
Juke34 8.9k

You can run agat_sp_compare_two_annotations.pl from AGAT; it should give you the information you are looking for. This method also catches genes nested in the introns of other genes.


Yes Juke34, I ran agat_sp_compare_two_annotations.pl on both GFF files and got the comparison as a table.

Thank you very much.


Do you get a result similar to the one from bedtools?


Yes Juke34, the two results are almost the same.
