Samtools (faidx) - Signal 11 Termination Error
11.3 years ago
IsmailM ▴ 110

Hi, I am trying to use samtools faidx to extract certain sequences from a FASTA file (actually a .txt file), using sequence IDs listed in a different file. However, it gives the following error:

[fai_load] build FASTA index.
xargs: samtools: terminated by signal 11

This is the shell command I am using:

cut -c 2- output.txt | xargs -n 1 samtools faidx motif_search.txt

The files:

output.txt - containing specific sequence IDs only (example text)

>CATH_MOUSE
>CATH_HUMAN

motif_search.txt - containing sequence IDs and the sequence (example text)

>CATH_RAT
MWTALPLLCAGAWLLSAGATAELTVNAIEKFHFTSWMKQHQKTYSSREYSHRLQVFANNWRKIQAHNQRN
HTFKMGLNQFSDMSFAEIKHKYLWSEPQNCSATKSNYLRGTGPYPSSMDWRKKGNVVSPVKNQGACGSCW
>CATH_MOUSE
TFSTTGALESAVAIASGKMMTLAEQQLVDCAQNFNNHGCQGGLPSQAFEYILYNKGIMGEDSYPYIGKNG
QCKFNPEKAVAFVKNVVNITLNDEAAMVEAVALYNPVSFAFEVTEDFMMYKSGVYSSNSCHKTPDKVNHA
>CATH_HUMAN
MNPTLILAAFCLGIASATLTFDHSLEAQWTKWKAMHNRLYGMNEEGWRRAVWEKNMKMIELHNQEYREGK
HSFTMAMNAFGDMTSEEFRQVMNGFQNRKPRKGKVFQEPLFYEAPRSVDWREKGYVTPVKNQGQCGSCWA
>CATH_MONKEY
FSATGALEGQMFRKTGRLISLSEQNLVDCSGPQGNEGCNGGLMDYAFQYVQDNGGLDSEESYPYEATEES
CKYNPKYSVANDTGFVDIPKQEKALMKAVATVGPISVAIDAGHESFLFYKEGIYFEPDCSSEDMDHGVLV

Could you please let me know where I am going wrong? I have read the samtools documentation and looked at similar posts, but haven't been able to get it to work.

Any help would be most appreciated

Thank You

samtools • 5.2k views

This works for me with the sample files in your question, with both samtools 0.1.19 and the current development version. Does it work for you with these sample files? If so, you may need to post the actual problematic files you're using, or extend these samples until they too fail.


Thank You,

I have now got samtools to work. The problem was that there was a space between the > and the sequence ID in both files. So I changed the script that generates motif_search.txt to remove the space, and it now works.

Many thanks for all your help
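
For anyone hitting the same problem, the fix described above can be sketched with sed. This is a minimal sketch, not the script the poster used: it assumes the headers look like `> CATH_MOUSE` (the real files were never posted), and the file names `example_with_spaces.txt` and `example_fixed.txt` are hypothetical.

```shell
# Build a small example whose headers have a space after '>' (assumed form of the problem).
printf '> CATH_MOUSE\nTFSTTGALES\n> CATH_HUMAN\nMNPTLILAAF\n' > example_with_spaces.txt

# On header lines only, remove any spaces or tabs that directly follow '>'.
sed 's/^>[[:space:]]*/>/' example_with_spaces.txt > example_fixed.txt

cat example_fixed.txt
```

Sequence lines are left untouched because the pattern is anchored to a leading `>`. The same command would need to be applied to the ID list as well, since the space was reported in both files.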

11.3 years ago

You first have to index the FASTA file:

samtools faidx motif_search.txt

and then query each region:

cut -c 2- output.txt | while read -r M; do samtools faidx motif_search.txt "${M}"; done
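
As a possible alternative to the loop, `samtools faidx` accepts several region names in one invocation, so the IDs can be passed together rather than one call per ID. The sketch below is hedged accordingly: it assumes samtools is on the PATH and that motif_search.txt is in the current directory, and it skips the query otherwise. The file names `example_output.txt` and `example_ids.txt` are hypothetical stand-ins for the ID list in the question.

```shell
# Recreate a small ID list in the question's format (IDs prefixed with '>').
printf '>CATH_MOUSE\n>CATH_HUMAN\n' > example_output.txt

# Strip the leading '>' so only the bare IDs remain, one per line.
cut -c 2- example_output.txt > example_ids.txt
cat example_ids.txt

# Pass all IDs to a single faidx call (xargs without -n 1).
if command -v samtools >/dev/null 2>&1 && [ -f motif_search.txt ]; then
    xargs samtools faidx motif_search.txt < example_ids.txt
else
    echo "samtools or motif_search.txt not found; skipping the query" >&2
fi
```

With a very long ID list, xargs may still split the arguments across more than one samtools call, which is harmless here because each call prints its own records.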

Many thanks for your help.

Unfortunately, I still can't get it to work.

When indexing the fasta file, I get the following error:

zsh: segmentation fault  samtools faidx motif_search.txt

And when querying each region, I get a number of lines of:

[fai_load] build FASTA index.

Does the FASTA file have to be in wrapped FASTA format? Currently it has no word wrap (i.e. the sequence ID is on one line and the whole sequence is on the next). I suppose I could write a script to convert it if necessary...

Or does the problem lie somewhere else altogether?

Many Thanks once again - I'm a total beginner in bioinformatics and really appreciate this help.

EDIT:

I have now reinstalled samtools. Typing samtools into the terminal prints the usual half page of text showing the installed version and the available commands. Nevertheless, when I try to index the files I still get the same errors.


You can have different line lengths for different sequences, just not for different lines of the same sequence. It sounds like this is a bug in the version of samtools that you're using. If you're familiar enough with compiling and debuggers, it would be useful to recompile samtools with debugging symbols and then run faidx inside gdb to find out where the problem is. Alternatively, post your motif_search.txt file somewhere and someone can submit a bug report for you.
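
The line-length rule mentioned above can be checked before indexing. The following is a rough sketch with awk, not a samtools feature: it flags records in which a sequence line is longer than the first line of that record, or in which any line follows a shorter one (only the final line of a record may be shorter). The file names `example_wrap.txt` and `wrap_report.txt` and the sequences in them are made up for illustration.

```shell
# Example file: the second record has inconsistent line lengths (hypothetical data).
printf '>seq1\nACGTACGT\nACGTACGT\nACG\n>seq2\nACGT\nACGTACGT\nAC\n' > example_wrap.txt

# Within each record, every sequence line except the last must match the first line's length.
awk '
BEGIN { width = -1 }
/^>/ { name = substr($0, 2); width = -1; short = 0; next }   # new record: reset state
{
    len = length($0)
    if (width < 0) {
        width = len                      # first sequence line sets the expected width
    } else if (short || len > width) {   # a line follows a shorter one, or is too long
        if (!(name in bad)) { bad[name] = 1; print "inconsistent line length in: " name }
    }
    if (len < width) short = 1           # only the last line may be shorter
}
' example_wrap.txt > wrap_report.txt

cat wrap_report.txt
```

A file with each sequence on a single unwrapped line, as described in the question, passes this check, which is consistent with the advice that wrapping itself is not required.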
