Bedtools sort -faidx producing empty output
22 months ago
jamie.pike ▴ 80

Hello,

I am trying to sort a GFF file using the FASTA index output from samtools faidx. Both of my input files have content, but when I use the command below to sort my GFF, I get no output to the screen or file.

bedtools sort -faidx F._oxysporum_f._sp._cubense_UK0001.fna.fai -i ../Mimps/F._oxysporum_f._sp._cubense_UK0001.fna_mimp_hits.gff

I have searched for similar posts but can't find anything. It's tricky to solve because there are no error messages and I can't find any mistakes in the input files. I have also tried using a .txt file containing only the contig headers instead, but that gives empty output too. I'm not sure what to try next. Does anyone have any suggestions, or can you spot any mistakes?

Thank you : )

I am using bedtools version v2.25.0.

Input .fai file:

VMNF01000001.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_1,_whole_genome_shotgun_sequence   44428   119 80  81
VMNF01000010.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_10,_whole_genome_shotgun_sequence  34672   45223   80  81
VMNF01000011.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_11,_whole_genome_shotgun_sequence  3303011 80449   80  81
VMNF01000012.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_12,_whole_genome_shotgun_sequence  3488972 3424868 80  81
VMNF01000013.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_13,_whole_genome_shotgun_sequence  1244651 6957573 80  81
VMNF01000014.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_14,_whole_genome_shotgun_sequence  3744637 8217903 80  81
VMNF01000015.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_15,_whole_genome_shotgun_sequence  2630465 12009468    80  81
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   115573  14672933    80  81
VMNF01000003.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_3,_whole_genome_shotgun_sequence   4819209 14790070    80  81
VMNF01000004.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_4,_whole_genome_shotgun_sequence   5233353 19669639    80  81
VMNF01000005.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_5,_whole_genome_shotgun_sequence   5758795 24968528    80  81
VMNF01000006.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_6,_whole_genome_shotgun_sequence   4265398 30799427    80  81
VMNF01000007.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_7,_whole_genome_shotgun_sequence   6580744 35118262    80  81
VMNF01000008.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_8,_whole_genome_shotgun_sequence   2830195 41781385    80  81
VMNF01000009.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_9,_whole_genome_shotgun_sequence   4494293 44647077    80  81 

Input GFF:

VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16238   21460   .   SHORT_ID=mimps1;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16239   21460   .   SHORT_ID=mimps2;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16537   21612   .   SHORT_ID=mimps3;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16546   21613   .   SHORT_ID=mimps4;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16868   21890   .   SHORT_ID=mimps5;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16869   21890   .   SHORT_ID=mimps6;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   93675   98697   .   SHORT_ID=mimps7;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   93675   98696   .   SHORT_ID=mimps8;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   93951   99018   .   SHORT_ID=mimps9;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   93952   99027   .   SHORT_ID=mimps10;   ID=.;
22 months ago

Works on my machine. Check the delimiters and whether the files have CRLF line endings.

$ echo fai && head -n 5 jeter.fa.fai && echo gff && head -n 5 jeter.gff && echo sort && bedtools sort -faidx jeter.fa.fai -i jeter.gff
fai
VMNF01000001.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_1,_whole_genome_shotgun_sequence   44428   119 80  81
VMNF01000010.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_10,_whole_genome_shotgun_sequence  34672   45223   80  81
VMNF01000011.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_11,_whole_genome_shotgun_sequence  3303011 80449   80  81
VMNF01000012.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_12,_whole_genome_shotgun_sequence  3488972 3424868 80  81
VMNF01000013.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_13,_whole_genome_shotgun_sequence  1244651 6957573 80  81
gff
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16238   21460   .   SHORT_ID=mimps1;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16239   21460   .   SHORT_ID=mimps2;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16537   21612   .   SHORT_ID=mimps3;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16546   21613   .   SHORT_ID=mimps4;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16868   21890   .   SHORT_ID=mimps5;    ID=.;
sort
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16238   21460   .   SHORT_ID=mimps1;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16239   21460   .   SHORT_ID=mimps2;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16537   21612   .   SHORT_ID=mimps3;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16546   21613   .   SHORT_ID=mimps4;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16868   21890   .   SHORT_ID=mimps5;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   16869   21890   .   SHORT_ID=mimps6;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   93675   98697   .   SHORT_ID=mimps7;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   93675   98696   .   SHORT_ID=mimps8;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   93951   99018   .   SHORT_ID=mimps9;    ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimps   93952   99027   .   SHORT_ID=mimps10;   ID=.;
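
For anyone who wants to script the line-ending check suggested above, here is a minimal sketch. The helper names count_crlf and strip_crlf are made up for illustration; they only rely on standard awk and tr, not on anything in bedtools.

```shell
# count_crlf FILE: print how many lines of FILE end in a carriage return.
# A result of 0 means the file uses plain Unix LF line endings.
count_crlf() {
    awk '/\r$/ { n++ } END { print n + 0 }' "$1"
}

# strip_crlf FILE: write FILE to stdout with all carriage returns removed.
strip_crlf() {
    tr -d '\r' < "$1"
}

# Example (file names are placeholders):
#   count_crlf my.gff
#   strip_crlf my.gff > my.unix.gff
```

If count_crlf prints anything other than 0, converting the file with strip_crlf (or a tool such as dos2unix, if installed) before sorting is worth a try.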

Thank you for your help with this : )

Both seem to have standard Unix LF line endings, if I have understood CRLF status correctly, and are tab-delimited. Do the same endings appear on your machine?

(hmmerEnv) u1983390@vettel:Mimps$ cat -te F._oxysporum_f._sp._cubense_UK0001.fna_mimp_hits.gff | head
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I16238^I21460^I.^I+^I.^ISHORT_ID=mimp_region1;^IID=.;$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I16239^I21460^I.^I-^I.^ISHORT_ID=mimp_region2;^IID=.;$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I16537^I21612^I.^I+^I.^ISHORT_ID=mimp_region3;^IID=.;$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I16546^I21613^I.^I-^I.^ISHORT_ID=mimp_region4;^IID=.;$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I16868^I21890^I.^I+^I.^ISHORT_ID=mimp_region5;^IID=.;$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I16869^I21890^I.^I-^I.^ISHORT_ID=mimp_region6;^IID=.;$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I93675^I98697^I.^I-^I.^ISHORT_ID=mimp_region7;^IID=.;$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I93675^I98696^I.^I+^I.^ISHORT_ID=mimp_region8;^IID=.;$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I93951^I99018^I.^I+^I.^ISHORT_ID=mimp_region9;^IID=.;$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^Ibed2gff^Imimp_region^I93952^I99027^I.^I-^I.^ISHORT_ID=mimp_region10;^IID=.;$


(hmmerEnv) u1983390@vettel:Mimps$ cat -te ../Index/F._oxysporum_f._sp._cubense_UK0001.fna.fai 
VMNF01000001.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_1,_whole_genome_shotgun_sequence^I44428^I119^I80^I81$
VMNF01000010.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_10,_whole_genome_shotgun_sequence^I34672^I45223^I80^I81$
VMNF01000011.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_11,_whole_genome_shotgun_sequence^I3303011^I80449^I80^I81$
VMNF01000012.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_12,_whole_genome_shotgun_sequence^I3488972^I3424868^I80^I81$
VMNF01000013.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_13,_whole_genome_shotgun_sequence^I1244651^I6957573^I80^I81$
VMNF01000014.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_14,_whole_genome_shotgun_sequence^I3744637^I8217903^I80^I81$
VMNF01000015.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_15,_whole_genome_shotgun_sequence^I2630465^I12009468^I80^I81$
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence^I115573^I14672933^I80^I81$
VMNF01000003.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_3,_whole_genome_shotgun_sequence^I4819209^I14790070^I80^I81$
VMNF01000004.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_4,_whole_genome_shotgun_sequence^I5233353^I19669639^I80^I81$
VMNF01000005.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_5,_whole_genome_shotgun_sequence^I5758795^I24968528^I80^I81$
VMNF01000006.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_6,_whole_genome_shotgun_sequence^I4265398^I30799427^I80^I81$
VMNF01000007.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_7,_whole_genome_shotgun_sequence^I6580744^I35118262^I80^I81$
VMNF01000008.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_8,_whole_genome_shotgun_sequence^I2830195^I41781385^I80^I81$
VMNF01000009.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_9,_whole_genome_shotgun_sequence^I4494293^I44647077^I80^I81$
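
The cat -te output above can also be summarised by counting tab-separated fields per line, which is easier to read than counting ^I markers by eye. A minimal sketch follows; field_counts is a hypothetical helper name, and the expected column counts (nine for GFF, five for a samtools .fai index of a FASTA file) come from the format definitions, not from bedtools.

```shell
# field_counts FILE: print each distinct number of tab-separated fields
# found in FILE, one per line, ignoring lines that start with '#'.
field_counts() {
    awk -F'\t' '!/^#/ { print NF }' "$1" | sort -un
}

# Example (file names are placeholders):
#   field_counts my.gff       # a well-formed GFF should print only 9
#   field_counts my.fna.fai   # a FASTA .fai index should print only 5
```

Counting the ^I markers in the ten GFF lines shown above gives nine tabs per line, so this check would report 10 fields for them, one more than the nine columns GFF defines.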

I solved it! I'll leave the post here as a lesson: double-check that the input file is in the correct format! As Pierre Lindenbaum highlighted, it was related to delimiters, but not the CRLF status. There was an extra tab in the attributes section of the GFF file, which I think was confusing bedtools. I have changed the GFF and it now works.

Previous line in GFF:

VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimp_region   16238   21460   .   SHORT_ID=mimp_region1;    ID=.;

Updated GFF:

VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence   bed2gff mimp_region 16238   21460    SHORT_ID=mimp_region1;ID=.;
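
The fix described above (removing the tab inside the attributes column) could be scripted along these lines. This is only a sketch, under the assumption that any fields beyond the ninth belong to the attributes column; merge_extra_gff_fields is a hypothetical name, not part of bedtools, and it is not necessarily how the file was actually edited.

```shell
# merge_extra_gff_fields FILE: write FILE to stdout, joining any fields
# after the 9th back onto the 9th (attributes) column without a separator.
# Lines with 9 or fewer fields and '#' comment lines pass through unchanged.
merge_extra_gff_fields() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/   { print; next }
         NF > 9 {
             line = $1
             for (i = 2; i <= 9; i++)   line = line OFS $i   # keep columns 1 to 9
             for (i = 10; i <= NF; i++) line = line $i       # append the extras
             print line
             next
         }
         { print }' "$1"
}

# Example (file names are placeholders):
#   merge_extra_gff_fields my.gff > my.fixed.gff
```

With this, an attributes value split as "SHORT_ID=mimp_region1;" and "ID=.;" across two tab-separated fields becomes the single field "SHORT_ID=mimp_region1;ID=.;".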