Hello,
I have a GFF3 file. How can I get the intergenic regions and their strand (positive or negative) information?
Thanks.
Short answer: parse the regions between all the 'gene' features from the GFF3 file.
Asking for the strand of an intergenic region (I assume that's what you're asking?) is pointless: strandedness is a property of 'genic' regions (proteins, RNA genes, motifs, ...); it has no meaning in an intergenic context.
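As a rough illustration of "parse the regions between all the 'gene' features", here is a minimal Python sketch. The function name `intergenic_from_gff3` is made up for this example, and it assumes a standard tab-separated GFF3 with the feature type in column 3 and start/end in columns 4 and 5. It only reports gaps between genes, not the stretches before the first or after the last gene, since those need the chromosome lengths.

```python
from collections import defaultdict


def intergenic_from_gff3(lines):
    """Return {seqid: [(start, end), ...]} for the gaps between 'gene' features.

    Coordinates are 1-based and inclusive, as in GFF3. The strand column is
    deliberately ignored: a gene on either strand occupies the region.
    """
    genes = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue  # keep only well-formed 'gene' rows
        genes[cols[0]].append((int(cols[3]), int(cols[4])))

    gaps = {}
    for seqid, intervals in genes.items():
        intervals.sort()
        out = []
        covered_end = intervals[0][1]
        for start, end in intervals[1:]:
            if start > covered_end + 1:
                # at least one base lies between the covered part and this gene
                out.append((covered_end + 1, start - 1))
            covered_end = max(covered_end, end)  # handles overlapping genes
        gaps[seqid] = out
    return gaps
```

With genes at 100-200 (+), 150-300 (-) and 500-600 (+) on the same sequence, this yields the single gap 301-499, regardless of which strands the flanking genes are on.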
You can use subtractBed from bedtools: get the chromosome sizes as a BED file and subtract the annotation from it. The only thing you might need to do is convert your GFF3 file into GFF2 or BED format.
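To make the "chromosome sizes minus annotation" idea concrete, here is a small Python sketch of the same operation. It is not bedtools itself, just an illustration; the names `gff_to_bed_interval` and `complement` are hypothetical, and the inputs are assumed to be already parsed into plain tuples. It also shows the one coordinate change needed when going from GFF3 (1-based, inclusive) to BED (0-based, half-open).

```python
def gff_to_bed_interval(start, end):
    """Convert a 1-based inclusive GFF3 interval to 0-based half-open BED."""
    return start - 1, end


def complement(chrom_sizes, features):
    """Return the parts of each chromosome not covered by any feature.

    chrom_sizes: {chrom: length}
    features:    iterable of (chrom, start, end) in BED coordinates
    Strand is not part of the input: the result is strand-independent.
    """
    by_chrom = {}
    for chrom, start, end in features:
        by_chrom.setdefault(chrom, []).append((start, end))

    result = []
    for chrom, size in chrom_sizes.items():
        pos = 0  # first position not yet known to be covered
        for start, end in sorted(by_chrom.get(chrom, [])):
            if start > pos:
                result.append((chrom, pos, start))
            pos = max(pos, end)  # overlapping features are merged implicitly
        if pos < size:
            result.append((chrom, pos, size))  # tail after the last feature
    return result
```

Unlike the gap-between-genes approach, this also returns the regions before the first and after the last feature on each chromosome, which is why the chromosome sizes are needed.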
As lieven.sterck wrote, strandedness is meaningless here. The neighbouring features could both be on the same strand, or on different ones; but what information would that imply for the region between them?
Take a look at my recent post on how to get things like introns and intergenic regions from a given GFF file.
None of these tools or approaches will get you the strand of an intergenic region. As said several times before, that does not make sense!
Your CLIP-seq reads will have a strand, yes, and that will remain the same! It just indicates which strand each CLIP-seq read maps to.
If you know which CLIP-seq reads map to intergenic regions and you know which strand each read maps to, you can simply combine those two.
Yes, you are right. I am asking about the strand of an intergenic region. I have CLIP-seq reads with strand information, and I want to know which reads fall in intergenic regions. So I need to know the strand of the intergenic region.
Yes, your CLIP-seq reads can (will) have a specific strand and that's OK, but the intergenic region cannot have a strand! Whether a CLIP-seq read lies in an intergenic region is strand-independent!
Use bedtools to intersect the intergenic regions with the CLIP-seq locations; for the reads that fall in intergenic regions you can then check which strand they align to.
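The logic of that intersect step can be sketched in a few lines of Python. This is only an illustration of the idea, not a replacement for bedtools; the function name `reads_in_intergenic` and the tuple layouts are assumptions made for this example, with all coordinates taken to be 0-based half-open (BED style). The point is that the overlap test never looks at strand, while the strand of each read is simply carried through to the output.

```python
def reads_in_intergenic(reads, intergenic):
    """Return the reads that overlap any intergenic region by at least 1 bp.

    reads:      iterable of (chrom, start, end, strand)
    intergenic: iterable of (chrom, start, end), no strand
    """
    by_chrom = {}
    for chrom, start, end in intergenic:
        by_chrom.setdefault(chrom, []).append((start, end))

    hits = []
    for chrom, start, end, strand in reads:
        for region_start, region_end in by_chrom.get(chrom, []):
            # half-open intervals overlap when each starts before the other ends
            if start < region_end and end > region_start:
                hits.append((chrom, start, end, strand))  # strand comes from the read
                break  # report each read once
    return hits
```

Note that a read straddling a gene boundary counts as a hit here; if only reads lying entirely within an intergenic region are wanted, the condition would need to be `start >= region_start and end <= region_end` instead. The linear scan is fine for small inputs but would be slow for millions of reads.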
So what you are saying is: I first use bedtools complement to extract the intergenic regions (without considering the positive and negative strands), and then use bedtools intersect to pick out the overlap between the CLIP-seq reads and the intergenic regions? Thanks.