Hello,
I am trying to sort a GFF file using the FASTA index output from samtools faidx. Both of my input files have content, but when I use the command below to sort my GFF, I get no output to the screen or file.
bedtools sort -faidx F._oxysporum_f._sp._cubense_UK0001.fna.fai -i ../Mimps/F._oxysporum_f._sp._cubense_UK0001.fna_mimp_hits.gff
I have tried searching for similar posts but can't find anything. It's tricky to solve because there are no error messages and I can't find any mistakes in the input files. I have also tried using a .txt file containing only the contig headers instead, but that gives empty output too. I'm not sure what to try next. Does anyone have any suggestions, or can you spot any mistakes?
Thank you : )
I am using bedtools version v2.25.0
Input .fai file:
VMNF01000001.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_1,_whole_genome_shotgun_sequence 44428 119 80 81
VMNF01000010.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_10,_whole_genome_shotgun_sequence 34672 45223 80 81
VMNF01000011.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_11,_whole_genome_shotgun_sequence 3303011 80449 80 81
VMNF01000012.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_12,_whole_genome_shotgun_sequence 3488972 3424868 80 81
VMNF01000013.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_13,_whole_genome_shotgun_sequence 1244651 6957573 80 81
VMNF01000014.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_14,_whole_genome_shotgun_sequence 3744637 8217903 80 81
VMNF01000015.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_15,_whole_genome_shotgun_sequence 2630465 12009468 80 81
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence 115573 14672933 80 81
VMNF01000003.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_3,_whole_genome_shotgun_sequence 4819209 14790070 80 81
VMNF01000004.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_4,_whole_genome_shotgun_sequence 5233353 19669639 80 81
VMNF01000005.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_5,_whole_genome_shotgun_sequence 5758795 24968528 80 81
VMNF01000006.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_6,_whole_genome_shotgun_sequence 4265398 30799427 80 81
VMNF01000007.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_7,_whole_genome_shotgun_sequence 6580744 35118262 80 81
VMNF01000008.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_8,_whole_genome_shotgun_sequence 2830195 41781385 80 81
VMNF01000009.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_9,_whole_genome_shotgun_sequence 4494293 44647077 80 81
Input GFF:
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 16238 21460 . SHORT_ID=mimps1; ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 16239 21460 . SHORT_ID=mimps2; ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 16537 21612 . SHORT_ID=mimps3; ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 16546 21613 . SHORT_ID=mimps4; ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 16868 21890 . SHORT_ID=mimps5; ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 16869 21890 . SHORT_ID=mimps6; ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 93675 98697 . SHORT_ID=mimps7; ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 93675 98696 . SHORT_ID=mimps8; ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 93951 99018 . SHORT_ID=mimps9; ID=.;
VMNF01000002.1_Fusarium_oxysporum_f._sp._cubense_strain_TR4_isolate_UK0001_scf_28419_2,_whole_genome_shotgun_sequence bed2gff mimps 93952 99027 . SHORT_ID=mimps10; ID=.;
Thank you for your help with this : )
Both seem to have the standard Unix LF line endings - if I have understood CRLF status correctly - and are tab-delimited. Do the same endings appear on your machine?
I solved it! I'll leave the post here as a lesson: double-check that the input file is in the correct format! As Pierre Lindenbaum highlighted, it was related to delimiters, but not the CRLF status. There was an extra tab in the attributes section of the GFF file, which I think was confusing bedtools. I have changed the GFF and it now works.
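For anyone who hits the same silent failure, a stray tab like the one described above can be found by counting tab-separated fields per line. The following is only a minimal sketch, assuming the file should have the usual 9 GFF columns; the function name `find_bad_gff_lines` and the two example records are made up for illustration and are not taken from the real file.

```python
import io

# Assumption: standard GFF layout of 9 tab-separated columns
# (seqid, source, type, start, end, score, strand, phase, attributes).
EXPECTED_GFF_FIELDS = 9


def find_bad_gff_lines(handle, expected=EXPECTED_GFF_FIELDS):
    """Return (line_number, field_count) for each record line whose
    tab-separated field count differs from `expected`."""
    bad = []
    for line_number, line in enumerate(handle, start=1):
        stripped = line.rstrip("\r\n")
        # Skip blank lines and comment/directive lines such as "##gff-version 3".
        if not stripped or stripped.startswith("#"):
            continue
        field_count = len(stripped.split("\t"))
        if field_count != expected:
            bad.append((line_number, field_count))
    return bad


# Hypothetical records: the second one has an extra tab inside the attributes.
example = io.StringIO(
    "contig_2\tbed2gff\tmimps\t16238\t21460\t.\t+\t.\tSHORT_ID=mimps1;ID=.;\n"
    "contig_2\tbed2gff\tmimps\t16239\t21460\t.\t+\t.\tSHORT_ID=mimps2;\tID=.;\n"
)
print(find_bad_gff_lines(example))  # prints [(2, 10)]
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` example, e.g. `find_bad_gff_lines(open("my.gff"))`. Any line it reports is worth opening in an editor that shows tab characters.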
Previous line in GFF:
Updated GFF: