ERROR: illegal character '.' when running bedtools closest command
7.6 years ago
Matus ▴ 70

Hello everyone,

I ran into a problem while trying to find the closest TSS to each peak with this command:

bedtools closest -a file_peaks.narrowPeak -b path/genes.tss.bed  > file_closestTSS.txt

The error says:

* ERROR: illegal character '.' found in integer conversion of string "3216969.". Exiting...

I generated the genes.tss.bed file from the genes.gtf file that I found in the Annotation folder of the iGenomes mm10 build:

awk 'BEGIN {FS=OFS="\t"} { if($7=="+"){tss=$4-1} else { tss=$5} print $1,tss, tss+1 ".", ".", $7, $9}' path/genes.gtf > path/genes.tss.bed

Could anyone help me please? Thank you

ChIP-Seq gene • 5.6k views

It seems that a dot '.' is in the wrong place (where an integer is expected).

Show what your BED file looks like; that may make it clear where the dot is.
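
One way to locate such a record is to check that the start and end columns contain only digits. The following is a minimal sketch, assuming a tab-separated BED file; the two sample rows are written to a temporary file for illustration only, and the malformed value mimics the string quoted in the error message.

```shell
# Illustrative two-row BED file; the first row has a stray "." on the end coordinate.
bed=$(mktemp)
printf 'chr1\t3216968\t3216969.\t.\t-\n' > "$bed"
printf 'chr1\t4293012\t4293013\t.\t-\n' >> "$bed"

# Report the line number and coordinates of every row whose start (column 2)
# or end (column 3) is not a plain non-negative integer.
bad=$(awk -F'\t' '$2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { print NR ": " $2 " " $3 }' "$bed")

printf '%s\n' "$bad"
rm -f "$bed"
```

Running the same awk filter on the real BED file, in place of the temporary one, should list the rows that bedtools cannot parse.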


I'm not sure, but don't you need a comma after `tss+1`?


What is the output of head path/genes.tss.bed?

head /home/s1469622/dstore/Reference_genomes/Mus_musculus/UCSC/mm10/Annotation/Archives/archive-2015-07-17-14-33-26/Genes/genes.tss.bed
chr1    3216968 3216969.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3216024 3216025.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3216968 3216969.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3421901 3421902.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3421901 3421902.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3671348 3671349.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3671498 3671499.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3671348 3671349.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    4293012 4293013.        .       -       gene_id "Rp1"; gene_name "Rp1"; p_id "P17361"; transcript_id "NM_001195662"; tss_id "TSS6138";
chr1    4292983 4292984.        .       -       gene_id "Rp1"; gene_name "Rp1"; p_id "P17361"; transcript_id "NM_001195662"; tss_id "TSS6138";

Remove the '.' attached to the end coordinates.

Try the following:

awk 'BEGIN {FS=OFS="\t"} { if($7=="+"){tss=$4-1} else { tss=$5} print $1,tss, tss+1, ".", $7, $9}' path/genes.gtf > path/genes.tss.bed
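
The reason the missing comma matters is that awk concatenates adjacent expressions, so `tss+1 "."` becomes a single field such as `3216969.`. A small sketch of the difference, using one coordinate from the output above as an illustrative value:

```shell
# Illustrative TSS value taken from the first row of the head output in the thread.
tss=3216968

# No comma after tss+1: awk concatenates the number and "." into one field.
broken=$(awk -v tss="$tss" 'BEGIN { OFS = "\t"; print tss, tss+1 ".", "." }')

# Comma after tss+1: the end coordinate and "." become separate fields.
fixed=$(awk -v tss="$tss" 'BEGIN { OFS = "\t"; print tss, tss+1, ".", "." }')

printf 'broken: %s\n' "$broken"
printf 'fixed:  %s\n' "$fixed"
```

Note that the command in the question had two "." placeholders, while the command above keeps one. With one placeholder the strand lands in column 5; BED expects it in column 6, which may matter if strand-aware bedtools options are used later.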


I get this error when I use bedtools afterwards:

Error: Sorted input specified, but the file /home/s1469622/dstore/Reference_genomes/Mus_musculus/UCSC/mm10/Annotation/Archives/archive-2015-07-17-14-33-26/Genes/genes.tss.bed has the following out of order record
chr1    3216024 3216025 .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";

You need to run bedtools sort on this file first.
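
As a rough sketch of what that sorting does: records are ordered by chromosome, then by start position. The example below uses plain `sort` on three rows taken from the output above so that it runs without bedtools; the commented commands at the end show how this might look with bedtools itself, where the `*.sorted.*` output names are hypothetical.

```shell
# Three unsorted rows (coordinates from the thread), written to a temporary file.
tmp=$(mktemp)
printf 'chr1\t3216968\t3216969\t.\t-\n' > "$tmp"
printf 'chr1\t3216024\t3216025\t.\t-\n' >> "$tmp"
printf 'chr1\t3421901\t3421902\t.\t-\n' >> "$tmp"

# Sort by chromosome (column 1), then numerically by start (column 2).
sorted=$(LC_ALL=C sort -k1,1 -k2,2n "$tmp")

printf '%s\n' "$sorted"
rm -f "$tmp"

# With bedtools, assuming both inputs need sorting (output names are hypothetical):
#   bedtools sort -i path/genes.tss.bed > path/genes.tss.sorted.bed
#   bedtools sort -i file_peaks.narrowPeak > file_peaks.sorted.narrowPeak
#   bedtools closest -a file_peaks.sorted.narrowPeak -b path/genes.tss.sorted.bed > file_closestTSS.txt
```

The record reported as out of order in the error (start 3216024) comes first after sorting.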
