Hi,
I have a GFF file that contains only gene features.
NZ_CP009361.1 RefSeq gene 544 1905 . + . ID=gene0;Name=KQ76_RS00005;gbkey=Gene;gene_biotype=protein_coding;locus_tag=KQ76_RS00005;old_locus_tag=KQ76_00005
NZ_CP009361.1 RefSeq gene 2183 3316 . + . ID=gene1;Name=KQ76_RS00010;gbkey=Gene;gene_biotype=protein_coding;locus_tag=KQ76_RS00010;old_locus_tag=KQ76_00010
I want to extract the intergenic coordinates together with strand information. The output should look something like this:
NZ_CP009361.1 1906 2182 + KQ76_RS00005-KQ76_RS00010
I have written code that extracts the coordinates, but I could not find a way to handle the strand when one gene is on the + strand and the other is on the - strand. Is there a tool or method that would give me the desired result?
Note: I just want the coordinates, not the sequences.
Thanks
I think the output format has a problem in requiring strand information: gaps don't have a direction, they are just gaps. A gap doesn't become a '+ gap' just because its neighbors are on the + strand. Therefore, I don't see a biologically meaningful way of assigning a strand to a gap, if that is what you are asking.
I wanted the strand for each intergenic region just to know whether the neighboring genes are on the same strand or on different strands.
Do you want to do operon prediction? It is possible to extract the information about the gap neighbors after the gaps have been computed and then give each gap a '+', '-', or '.' depending on the strands of the enclosing genes. See help('nearest-methods') and use the methods precede and follow to get the flanking genes for each gap, then iterate and check if strand precede == strand follow.
Thank you for the suggestions. I will go through them.
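For readers who only need the coordinates, the idea discussed in this thread (compute each gap, look at the two flanking genes, and report '+', '-', or '.') can also be sketched without R. The following is a minimal plain-Python illustration, not the precede/follow approach suggested above. The function names parse_genes and intergenic are hypothetical, and the choices to label gaps by locus_tag and to use '.' for mixed-strand neighbors are assumptions based on the example output in the question.

```python
def parse_genes(gff_lines):
    """Return (seqid, start, end, strand, name) for every 'gene' feature."""
    genes = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        # Split into the 9 GFF columns; works for tab- or space-separated input.
        fields = line.rstrip("\n").split(None, 8)
        if len(fields) < 9 or fields[2] != "gene":
            continue
        seqid, start, end, strand, attrs = (
            fields[0], int(fields[3]), int(fields[4]), fields[6], fields[8])
        attr = dict(kv.split("=", 1) for kv in attrs.strip().split(";") if "=" in kv)
        # Assumption: prefer locus_tag as the label, as in the question's output.
        name = attr.get("locus_tag") or attr.get("Name") or attr.get("ID", "?")
        genes.append((seqid, start, end, strand, name))
    return genes


def intergenic(genes):
    """Return (seqid, start, end, strand, label) for each gap between genes.

    The strand is that of the flanking genes if they agree, otherwise '.'.
    """
    regions = []
    prev = None  # gene reaching furthest to the right so far on this sequence
    for gene in sorted(genes, key=lambda g: (g[0], g[1], g[2])):
        if prev is not None and prev[0] == gene[0]:
            gap_start, gap_end = prev[2] + 1, gene[1] - 1
            if gap_start <= gap_end:  # no gap if genes overlap or abut
                strand = prev[3] if prev[3] == gene[3] else "."
                regions.append(
                    (gene[0], gap_start, gap_end, strand, f"{prev[4]}-{gene[4]}"))
        # Keep the gene with the furthest end so nested genes do not create
        # spurious gaps.
        if prev is None or prev[0] != gene[0] or gene[2] > prev[2]:
            prev = gene
    return regions


if __name__ == "__main__":
    example = [
        "NZ_CP009361.1 RefSeq gene 544 1905 . + . ID=gene0;Name=KQ76_RS00005;"
        "gbkey=Gene;gene_biotype=protein_coding;locus_tag=KQ76_RS00005;"
        "old_locus_tag=KQ76_00005",
        "NZ_CP009361.1 RefSeq gene 2183 3316 . + . ID=gene1;Name=KQ76_RS00010;"
        "gbkey=Gene;gene_biotype=protein_coding;locus_tag=KQ76_RS00010;"
        "old_locus_tag=KQ76_00010",
    ]
    for region in intergenic(parse_genes(example)):
        print(*region)
```

With the two genes from the question this prints the line requested there (NZ_CP009361.1 1906 2182 + KQ76_RS00005-KQ76_RS00010). If the flanking genes were on opposite strands, the fourth column would be '.' instead.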