problem with bedtools complement: how to extract strand information?
10.1 years ago
biolab ★ 1.4k

Hi everyone,

I have a minor problem with bedtools complement. My gff3 file contains chromosome, start, end, strand and other information. I want to use bedtools complement to extract the remaining regions. However, the command bedtools complement -i gff_file -g genome_file gives me only three columns: Chr, start, end. The output does not include strand information.

Thank you very much

bedtools • 3.3k views

This tool gives you the regions of your genome_file that are not covered by your gff_file. What strand do you expect in the output?

10.1 years ago
iraun 6.2k

You cannot get additional information about the complement of your regions of interest. The purpose of the bedtools complement command is to report all off-target regions of your genome, not to "annotate" those regions.

Thanks a lot, Manu and airan. I understand now that the command only outputs region coordinates.

4.1 years ago

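Taking the complement of each strand separately worked for me. Note that the '+$' pattern only matches when the strand is the last column of each line, and that every line not ending in + is treated as minus strand.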

grep -v '+$' sorted_gff_file | bedtools complement -i stdin -g genome_file | sed 's/$/\t-/' > complement_file
grep '+$' sorted_gff_file | bedtools complement -i stdin -g genome_file | sed 's/$/\t+/' >> complement_file

You can sort the combined complement file after this.
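
If your file is standard GFF3, the strand sits in column 7 rather than at the end of the line, so the grep patterns above would not match. Below is a minimal sketch of the same per-strand idea that filters on column 7 with awk instead. The function name strand_complement is hypothetical. The sketch assumes genome_file lists chromosomes in the same order that sort produces (bedtools complement expects sorted input consistent with the genome file) and that each strand has at least one feature.

```sh
# Hypothetical helper: per-strand complement for a GFF3 file.
# Usage: strand_complement GFF_FILE GENOME_FILE
strand_complement() {
    gff=$1
    genome=$2
    tab=$(printf '\t')
    for strand in + -; do
        # Keep only features on this strand (GFF3 column 7), skipping comment lines.
        awk -F'\t' -v s="$strand" '!/^#/ && $7 == s' "$gff" |
            # Sort by chromosome, then by start coordinate (GFF3 column 4).
            sort -t "$tab" -k1,1 -k4,4n |
            # Report the intervals not covered by features on this strand.
            bedtools complement -i stdin -g "$genome" |
            # Append the strand as a fourth column.
            awk -v s="$strand" 'BEGIN { OFS = "\t" } { print $0, s }'
    done
}
```

You could then run strand_complement gff_file genome_file > complement_file. The result has four columns (Chr, start, end, strand), with BED-style coordinates as written by bedtools complement.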
