Failing to convert GTF to BED with BEDOPS
9.0 years ago
zizigolu ★ 4.3k

Hi,

I am using Fedora. I have a merged.gtf produced by cuffdiff that I want to convert to BED. I need something like the line below, which is the first row of a BED file I obtained after mapping with bowtie2 and converting the result with a downstream tool:

YAL001C    0    31    SRR1944914.13670510    42    +

This time, however, I used TopHat and Cufflinks, so I now have a GTF file. This is its first row:

1    Cufflinks    exon    3631    3913    .    +    .    gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1"; gene_name "NAC001"; oId "AT1G01010.1"; nearest_ref "AT1G01010.1"; class_code "="; tss_id

I used this command to convert it to BED format, but I got an error:

[izadi@lbox161 bin]$ gtf2bed < merged.gtf | sort-bed - > file.bed
Segmentation fault
[izadi@lbox161 bin]$

Thank you for your help


First, why do you need sort-bed if gtf2bed already sorts the output?

Tip: By default, all conversion scripts now output sorted BED data ready for use with BEDOPS utilities. If you do not want to sort converted output, use the --do-not-sort option. Run the script with the --help option for more details.

Second, if you get an error, break apart your multiple commands and find out which one is giving you the error.
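A minimal sketch of that advice, assuming bash: each stage of the pipeline runs on its own and reports its exit status, so the stage that crashes is easy to spot. `run_stage` and the intermediate file `unsorted.bed` are hypothetical names introduced here for illustration; `--do-not-sort` is the option mentioned in the tip above.

```shell
#!/usr/bin/env bash
# run_stage is a hypothetical helper, not part of BEDOPS.
# Usage: run_stage LABEL INPUT OUTPUT COMMAND [ARGS...]
# It runs COMMAND with INPUT on stdin and OUTPUT on stdout,
# then prints the exit status to stderr.
run_stage() {
    local label=$1 input=$2 output=$3
    shift 3
    "$@" < "$input" > "$output"
    local status=$?
    echo "$label: exit status $status" >&2
    return "$status"
}

# Only attempt the real conversion when the tool and the input exist.
if command -v gtf2bed >/dev/null 2>&1 && [ -f merged.gtf ]; then
    # Stage 1: convert without sorting. Stage 2: sort separately.
    run_stage gtf2bed  merged.gtf   unsorted.bed gtf2bed --do-not-sort &&
    run_stage sort-bed unsorted.bed file.bed     sort-bed -
fi
```

In bash, a stage killed by a segmentation fault typically reports exit status 139 (128 plus signal number 11), which shows which of the two programs crashed.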


But I used only one command, so how can I break it apart?


Look at the | characters: each pipe separates two commands, so you can run them one at a time.


Fereshteh, as noted in my comment, the pre-compiled binaries may not work on all Linux systems. You can compile and install by hand with a few commands - instructions are posted here.

9.0 years ago
A. Domingues ★ 2.7k

I think the issue might be in the way you are piping things. This appeared to work for me:

cat merged.gtf | gtf2bed - | sort-bed - > file.bed

Though the GTF used in my test was not the output of cufflinks.


Thank you, but:

[izadi@lbox161 bin]$ cat merged.gtf | gtf2bed | sort-bed - > file.bed
Segmentation fault
[izadi@lbox161 bin]$

and when I just use

cat merged.gtf | gtf2bed > file.bed

I get neither an error nor readable output.
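One way to make "unreadable output" more concrete is to check whether the file looks like BED at all. The sketch below is an assumption-laden sanity check, not a full validator: `check_bed` is a hypothetical helper that only tests for at least three tab-separated columns with non-negative integer start and end coordinates.

```shell
#!/usr/bin/env bash
# check_bed is a hypothetical helper: it flags lines that do not look like BED.
# It returns 0 when every line passes, and 1 when the file is empty or
# any line fails.
check_bed() {
    awk 'BEGIN { FS = "\t"; bad = 0 }
    (NF < 3) || ($2 !~ /^[0-9]+$/) || ($3 !~ /^[0-9]+$/) || ($2 + 0 > $3 + 0) {
        printf "line %d does not look like BED: %s\n", NR, $0
        bad++
    }
    END {
        if (NR == 0) {
            print "file is empty"
            exit 1
        }
        exit (bad > 0)
    }' "$1"
}

# Example: report on file.bed if it exists.
if [ -f file.bed ]; then
    check_bed file.bed | head
fi
```

If every line is flagged, the output is probably not plain-text BED, which points at the conversion step rather than at the sorting step.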

9.0 years ago
cyril-cros ▴ 950

For reasons unknown to me, Bedtools/BEDOPS have never liked the output of gtf2bed with Cufflinks-style input. More precisely, the first six columns (up to the strand column) are OK, but the rest (with fields separated by ";") are not. Just use:

cat merged.gtf | gtf2bed  | cut -f-6 | someCommand

gtf2bed will use the gene_id field as the output bed 4th column (unique identifiers for each interval).

You can then use commands like awk and join if you need to merge the rest of your information. It is a dirty workaround, but it works.
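If gtf2bed itself will not run, the same six columns can be approximated with awk alone. This is a rough sketch, not gtf2bed's implementation: `gtf_to_bed6` is a hypothetical function, it assumes tab-delimited GTF with a `gene_id "..."` attribute in column 9, and it drops every other attribute. GTF coordinates are 1-based and inclusive while BED starts are 0-based, so the start is reduced by one.

```shell
#!/usr/bin/env bash
# gtf_to_bed6 is a hypothetical stand-in, not part of BEDOPS.
# Reads GTF from the named files (or stdin) and writes six BED columns:
# chromosome, start (0-based), end, gene_id, score, strand.
gtf_to_bed6() {
    awk 'BEGIN { FS = OFS = "\t" }
    /^#/ { next }                      # skip comment lines
    NF >= 9 {
        id = "."
        # The prefix gene_id " is 9 characters long and the closing
        # quote is 1 more, so the value spans RLENGTH - 10 characters.
        if (match($9, /gene_id "[^"]+"/)) {
            id = substr($9, RSTART + 9, RLENGTH - 10)
        }
        print $1, $4 - 1, $5, id, $6, $7
    }' "$@"
}

# Example, guarded so it only runs when the input exists.
if [ -f merged.gtf ]; then
    gtf_to_bed6 merged.gtf | sort -k1,1 -k2,2n > file1.bed
fi
```

For the first record of the question's GTF this should give `1`, `3630`, `3913`, `XLOC_000001`, `.`, `+`. The `sort -k1,1 -k2,2n` call is only a plain lexical-then-numeric ordering and is not claimed to match sort-bed exactly.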


Sorry, I ran the command below, but the output is unreadable:

cat merged.gtf | gtf2bed | cut -f-6 > file1.bed

I tried one of my own Cufflinks files. You are right: gtf2bed and convert2bed no longer seem to work. I had a really detailed and tested workflow which now fails at this step. I am currently checking whether some recent change introduced a bug.


Anyway, thank you.


I would be interested in fixing any problems with this tool. If you can post part of your file somewhere, and describe the platform you are running this on and what error comes up, that would help me do some debugging on my end.


Input (Cufflinks, from the sup. material of a paper on olfactory receptors):

1    reconstructed_transcript    exon    92571567    92571659    .    +    .    gene_id "CUFFORG_0001"; transcript_id "CUFFORT_0001"; exon_number "1"; gene_name "Olfr1413";
1    reconstructed_transcript    exon    92573464    92574374    .    +    .    gene_id "CUFFORG_0001"; transcript_id "CUFFORT_0001"; exon_number "2"; gene_name "Olfr1413"; ensembl_id "ENSMUSG00000058904";
1    ENSEMBL_transcript    exon    92573125    92574206    .    +    .    gene_id "CUFFORG_0001"; transcript_id "ENSMUST00000074859"; exon_number "1"; gene_name "Olfr1413"; ensembl_id "ENSMUSG00000058904";

I am using Arch Linux; the PKGBUILD is https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=bedops. That is, it downloads the 2.4.14 GitHub version of BEDOPS, uses sed to replace every explicit mention of /usr/bin/env python with /usr/bin/env python2, and installs the binaries in /usr/bin.

Running gtf2bed or convert2bed fails silently (no output or message, but I do get the usage display with a bad command).
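When a tool fails silently, the exit status and the size of the output file are often the only clues. The sketch below assumes bash; `describe_status` and `report_run` are hypothetical helpers introduced for illustration.

```shell
#!/usr/bin/env bash
# describe_status is a hypothetical helper that explains a shell exit status.
# By shell convention, a status above 128 means "killed by signal (status - 128)".
describe_status() {
    local status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "failed with exit status $status"
    fi
}

# report_run INPUT OUTPUT COMMAND [ARGS...]
# Runs COMMAND with INPUT on stdin and OUTPUT on stdout, then reports
# the status and whether any output was written.
report_run() {
    local input=$1 output=$2
    shift 2
    "$@" < "$input" > "$output"
    local status=$?
    echo "status: $(describe_status "$status")"
    if [ -s "$output" ]; then
        echo "output: not empty"
    else
        echo "output: empty"
    fi
}

# Example, guarded so it only runs when the tool and input exist.
if command -v gtf2bed >/dev/null 2>&1 && [ -f merged.gtf ]; then
    report_run merged.gtf file.bed gtf2bed
fi
```

A status of "ok" together with "output: empty" would match the silent failure described above and is worth including in a bug report.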


Thanks. Do you have a link to the paper or doi? I can grab the file from the supplementary material in there.


Unfortunately, I'm not able to reproduce this bug with the given file under OS X 10.11 running BEDOPS 2.4.14 tools. I get a complete BED output, with all the same records. I'll see if I can set up an Archlinux VM at work sometime in the next few days to try to reproduce it there.

In the meantime, instead of using that third-party PKGBUILD script, you (and others) could also compile the toolkit manually, which isn't too hard to do. The pre-compiled binaries may require a newer kernel than what is on the variety of Linux systems out there, and compiling by hand addresses that issue.


I don't know if it helps with the debugging, but I downloaded the paper's GTF and

cat journal.pgen.1004593.s017.GTF | gtf2bed | sort-bed - | head

runs without a hitch.

bedops version:

gtf2bed --version
convert2bed -i gtf
  version: 2.4.12
  author:  Alex Reynolds

System:

Linux 3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.7-ckt11-1~bpo70+1 (2015-06-08) x86_64

I installed the current version of Archlinux:

# uname -a
Linux archlinux-retina 4.2.5-1-ARCH #1 SMP PREEMPT Tue Oct 27 08:13:28 CET 2015 x86_64 GNU/Linux

After installing git and downloading BEDOPS source, I compiled it with the GCC 5.2.0 distribution that is packaged with the base-devel package in Archlinux.

I then converted the PLoS GTF file to BED via gtf2bed without any errors. I get the same number of records and data are in the correct columnar order.
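The "same number of records" check can be scripted. This is a minimal sketch under the assumption that the conversion emits one BED line per non-comment GTF line; `count_records` and `same_record_count` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# count_records is a hypothetical helper: it counts lines that are neither
# blank nor comments (starting with '#').
count_records() {
    awk '!/^#/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# same_record_count succeeds when both files hold the same number of records.
same_record_count() {
    [ "$(count_records "$1")" -eq "$(count_records "$2")" ]
}

# Example, guarded so it only runs when both files exist.
if [ -f merged.gtf ] && [ -f file.bed ]; then
    if same_record_count merged.gtf file.bed; then
        echo "record counts match"
    else
        echo "record counts differ"
    fi
fi
```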

If you have problems, my advice would be to download the source and recompile/reinstall binaries via the instructions on the BEDOPS docs site.


Sorry, I did as you suggested, but:

/bin/ld: cannot find -lstdc++
collect2: error: ld returned 1 exit status
Makefile:40: recipe for target 'build' failed
make[3]: *** [build] Error 1
make[3]: Leaving directory '/usr/data/nfs6/izadi/bedtools2/bin/bedops/applications/bed/bedmap/src'
/usr/data/nfs6/izadi/bedtools2/bin/bedops/system.mk/Makefile.linux:29: recipe for target 'applications/bed/bedmap/src' failed
make[2]: *** [applications/bed/bedmap/src] Error 2
make[2]: Leaving directory '/usr/data/nfs6/izadi/bedtools2/bin/bedops'
system.mk/Makefile.linux:20: recipe for target 'default' failed
make[1]: *** [default] Error 2
make[1]: Leaving directory '/usr/data/nfs6/izadi/bedtools2/bin/bedops'
Makefile:14: recipe for target 'default' failed
make: *** [default] Error 2
[izadi@lbox161 bedops]$

You may need to install a newer version of GCC tools. See the compile instructions for more details.


If you are running a Red Hat-like Linux, you may also want to install static versions of common libraries:

$ sudo yum install libstdc++-static
$ sudo yum install glibc-static

This seems to fix some cases of cannot find -lXYZ error messages.
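Before re-running the full build, it may help to confirm that the compiler can link a trivial program the same way. The sketch below assumes the build links statically, which the static-library suggestion above implies but which is not confirmed here. `can_link` is a hypothetical helper, and the compiler defaults to `g++` unless the `CXX` environment variable says otherwise.

```shell
#!/usr/bin/env bash
# can_link is a hypothetical helper: it tries to build a trivial C++ program
# with the extra flags given as arguments and returns the compiler's status.
can_link() {
    local cxx=${CXX:-g++} tmpdir status
    tmpdir=$(mktemp -d) || return 2
    printf 'int main() { return 0; }\n' > "$tmpdir/probe.cpp"
    "$cxx" "$tmpdir/probe.cpp" -o "$tmpdir/probe" "$@" 2> "$tmpdir/errors.txt"
    status=$?
    # Show the compiler or linker messages only when the build failed.
    [ "$status" -eq 0 ] || cat "$tmpdir/errors.txt" >&2
    rm -rf "$tmpdir"
    return "$status"
}

if can_link -static; then
    echo "static linking works"
else
    echo "static linking fails; static libraries may be missing"
fi
```

If the plain `can_link` call succeeds while `can_link -static` fails with `cannot find -lstdc++`, the missing static libraries are the likely cause.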
