Hi, I am trying to use samtools faidx to extract certain sequences from a FASTA file (actually a .txt file) using sequence IDs listed in a different file. However, it gives the following error:
[fai_load] build FASTA index.
xargs: samtools: terminated by signal 11
This is the shell command I am using:
cut -c 2- output.txt | xargs -n 1 samtools faidx motif_search.txt
The files:
output.txt - containing specific sequence IDs only (example text)
>CATH_MOUSE
>CATH_HUMAN
motif_search.txt - containing sequence IDs and the sequence (example text)
>CATH_RAT
MWTALPLLCAGAWLLSAGATAELTVNAIEKFHFTSWMKQHQKTYSSREYSHRLQVFANNWRKIQAHNQRN
HTFKMGLNQFSDMSFAEIKHKYLWSEPQNCSATKSNYLRGTGPYPSSMDWRKKGNVVSPVKNQGACGSCW
>CATH_MOUSE
TFSTTGALESAVAIASGKMMTLAEQQLVDCAQNFNNHGCQGGLPSQAFEYILYNKGIMGEDSYPYIGKNG
QCKFNPEKAVAFVKNVVNITLNDEAAMVEAVALYNPVSFAFEVTEDFMMYKSGVYSSNSCHKTPDKVNHA
>CATH_HUMAN
MNPTLILAAFCLGIASATLTFDHSLEAQWTKWKAMHNRLYGMNEEGWRRAVWEKNMKMIELHNQEYREGK
HSFTMAMNAFGDMTSEEFRQVMNGFQNRKPRKGKVFQEPLFYEAPRSVDWREKGYVTPVKNQGQCGSCWA
>CATH_MONKEY
FSATGALEGQMFRKTGRLISLSEQNLVDCSGPQGNEGCNGGLMDYAFQYVQDNGGLDSEESYPYEATEES
CKYNPKYSVANDTGFVDIPKQEKALMKAVATVGPISVAIDAGHESFLFYKEGIYFEPDCSSEDMDHGVLV
Could you please let me know where I am going wrong? I have looked at the samtools documentation and at similar posts, but haven't been able to get it to work.
Any help would be most appreciated
Thank You
This works for me with the sample files in your question, with both samtools 0.1.19 and the current development version. Does it work for you with these sample files? If so, you may need to post the actual problematic files you're using, or extend these samples until they too fail.
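If the real files are too large to post, a quick look at the header lines may help narrow things down. The sketch below assumes the usual FASTA convention that the sequence ID follows the > directly; check_headers is only an illustrative name, not part of samtools. It prints, with line numbers, any header line where > is followed by whitespace or by nothing at all:

```shell
# Hypothetical helper (not part of samtools): list header lines where '>'
# is followed by whitespace or end of line instead of a sequence ID.
check_headers() {
    grep -n -E '^>([[:space:]]|$)' "$1"
}

# Example: check_headers motif_search.txt
```

No output means every header line starts with > followed immediately by a non-whitespace character.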
Thank You,
I have now got samtools to work. The problem was that there was a space between the > and the sequence ID in both files. I changed the script that generates motif_search.txt to remove the space, and it now works.
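For anyone who hits the same crash and cannot change the script that generates the file, the headers can probably be cleaned up after the fact instead. This is a minimal sketch, assuming the only problem is whitespace directly after the >; fix_headers and the output file name are made up for illustration:

```shell
# Hypothetical helper: remove any whitespace directly after '>' on header
# lines. Sequence lines do not start with '>' and are left untouched.
fix_headers() {
    sed 's/^>[[:space:]]*/>/' "$1"
}

# Example (the output file name is arbitrary):
#   fix_headers motif_search.txt > motif_search.fixed.txt
#   cut -c 2- output.txt | xargs -n 1 samtools faidx motif_search.fixed.txt
```

Writing to a new file rather than editing in place also means samtools builds a fresh index for the cleaned file.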
Many thanks for all your help