8.2 years ago
pavep
How can I find the intergenic regions in a BED file using an awk script, reporting the + and - strands separately?
1) You need to know the size of each chromosome to find all intergenic regions; without it, the region between the last gene and the chromosome end cannot be reported.
2) Do you have to limit yourself to awk? It is usually easier to use an existing tool such as BEDTools:
bedtools complement -i genes.bed -g genome.txt > intergenic.bed
where genome.txt looks something like this:
chr1 249250621
chr2 243199373
...
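If you do want an awk-only route that keeps the strands apart, here is a rough sketch rather than a tested pipeline. It assumes a BED6 file with the strand in column 6 and the same two-column genome file as above. The file names (toy_genes.bed, toy_genome.txt, toy_intergenic.*.bed), the toy coordinates, and the helper function intergenic_for_strand are all made up for illustration. The idea: filter one strand, sort by chromosome and start, then walk the genes while remembering the furthest end seen so far, so overlapping genes are merged implicitly.

```shell
#!/bin/sh
set -eu
export LC_ALL=C   # stable sort order

# Hypothetical toy inputs (tab-separated); geneA and geneB overlap on purpose.
printf 'chr1\t100\t200\tgeneA\t0\t+\nchr1\t150\t300\tgeneB\t0\t+\n' >  toy_genes.bed
printf 'chr1\t500\t600\tgeneC\t0\t-\nchr2\t50\t80\tgeneD\t0\t-\n'   >> toy_genes.bed
printf 'chr1\t1000\nchr2\t400\n' > toy_genome.txt

# usage: intergenic_for_strand STRAND GENES.bed GENOME.txt
intergenic_for_strand() {
  # Keep genes on one strand, sorted by chromosome then start.
  awk -v s="$1" '$6 == s' "$2" | sort -k1,1 -k2,2n |
  awk -v OFS='\t' -v strand="$1" '
    # First file (genome): remember each chromosome size.
    NR == FNR { size[$1] = $2 + 0; next }
    # New chromosome: close the previous one up to its end.
    $1 != chrom {
      if (chrom != "" && end < size[chrom]) print chrom, end, size[chrom], strand
      chrom = $1; end = 0; seen[chrom] = 1
    }
    # Gap between the covered region so far and this gene.
    $2 + 0 > end { print chrom, end, $2, strand }
    # Extend the covered region; overlapping genes never open a gap.
    $3 + 0 > end { end = $3 + 0 }
    END {
      if (chrom != "" && end < size[chrom]) print chrom, end, size[chrom], strand
      # Chromosomes with no gene on this strand are intergenic end to end.
      for (c in size) if (!(c in seen)) print c, 0, size[c], strand
    }
  ' "$3" - | sort -k1,1 -k2,2n
}

intergenic_for_strand + toy_genes.bed toy_genome.txt > toy_intergenic.plus.bed
intergenic_for_strand - toy_genes.bed toy_genome.txt > toy_intergenic.minus.bed
```

The output has four columns (chromosome, start, end, strand), which is not full BED6. Whether "intergenic" should be defined per strand like this, or across both strands at once, depends on your question; for the second case, drop the strand filter and run the walk once.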
Does this take care of overlapping genes?