6.6 years ago
Janey
Hi
I previously used the following command to extract sequences from a FASTA file using a list of IDs:
cut -c 2- ID.text | xargs -n 1 samtools faidx in.fasta > out.fasta
But now I get this error:
xargs: samtools: No such file or directory
I tried to use seqkit, but seqkit does not work on my Unix system.
I also tried the following commands:
perl -e 'open(F,"File1.txt");while(<F>){/(\S+)/; $k{$1}++}; while(<>){if(/>\s*(\S+?)(\.| )/){if($k{$1}){$k=1}else{$k=0}; } print if $k==1;}' File2.fa
awk -F'[ .]' 'NR==FNR{a[$0]; next}/^>/{p=$2 in a}p' file1 file2
grep -x -F -A 1 -f 'File 2' 'File 1'
cat IDs.txt | awk '{gsub("_","\\_",$0);$0="(?s)^>"$0".*?(?=\\n(\\z|>))"}1' | pcregrep -oM -f - f1.fasta
alias FASTAgrep="awk '{gsub(\"_\",\"\\\_\",\$0);\$0=\"(?s)^>\"\$0\".*?(?=\\\n(\\\z|>))\"}1' | pcregrep -oM -f -"
cat IDs.txt | FASTAgrep f1.fasta
and ........
But in all of these cases, the output file was either empty or contained the full input file. I'm confused; please help me.
Install samtools and/or set your PATH: https://stackoverflow.com/questions/14637979/
Do you suggest downloading and installing samtools again?
If samtools is installed, then the error "xargs: samtools: No such file or directory" means that it's not in your PATH. You'll need to update your $PATH variable.
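To check whether that is the case, something like the following might help. It is only a sketch: the helper name have_tool is made up for illustration, and the directory in the commented export line is a hypothetical location, not where samtools actually lives on your system.

```shell
#!/bin/sh
# Report whether a program can be found on the current PATH.
# (have_tool is an illustrative helper name, not a standard command.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool samtools; then
    echo "samtools found at: $(command -v samtools)"
else
    echo "samtools is not on PATH"
    echo "current PATH: $PATH"
    # If samtools is installed in some other directory, add that directory:
    #   export PATH="$HOME/tools/samtools/bin:$PATH"   # hypothetical location
fi
```

If the first branch prints a path, the original xargs command should be able to find samtools again. If not, either the install location has to be added to PATH or samtools has to be installed again.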
samtools was installed successfully, and this command "cut -c 2- ID.text | xargs -n 1 samtools faidx in.fasta > out.fasta" worked. My unis system was changed.
Assuming you mean here "the computer system of my university was changed" then indeed it looks like you need to reinstall samtools. We don't know what has changed on your system though since you don't provide a lot of information.
What is this supposed to do?
I wanted to use other methods (commands) to extract sequences from a FASTA file using a list of IDs. I should also say that I am a biologist, so I'm not familiar with programming. Thanks for helping me with more details about the commands.
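For what it's worth, here is a minimal awk sketch that needs neither samtools nor seqkit. It assumes that each ID in the list equals the first word of a FASTA header (the text right after ">" up to the first space), and it tolerates two things that commonly break ID matching: a leading ">" in the ID list and Windows line endings. I can't tell from the post whether either applies here. The file names ids_demo.txt, in_demo.fasta and out_demo.fasta and the function name extract_by_id are made up so that the example runs on its own.

```shell
#!/bin/sh
# Hypothetical example files, created only so the sketch is self-contained.
printf '>seq2\nseq3\n' > ids_demo.txt
printf '>seq1 first\nACGT\nACGT\n>seq2 second\nGGCC\nTTAA\n>seq3\nCCCC\n' > in_demo.fasta

extract_by_id() {
    # $1 = file with one ID per line, $2 = FASTA file
    awk '
        { sub(/\r$/, "") }              # strip a trailing carriage return
        NR == FNR {                     # first file: the ID list
            sub(/^>/, "")               # drop a leading ">" if present
            if ($1 != "") wanted[$1] = 1
            next
        }
        /^>/ {                          # FASTA header line
            id = substr($1, 2)          # first word without the ">"
            keep = (id in wanted)
        }
        keep                            # print header and sequence lines while keep is set
    ' "$1" "$2"
}

extract_by_id ids_demo.txt in_demo.fasta > out_demo.fasta
cat out_demo.fasta
```

With your own data you would call extract_by_id ID.text in.fasta > out.fasta. Note that the NR == FNR idiom misbehaves if the ID file is completely empty, and that partial IDs will not match, since the comparison is on the whole first word of the header.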
How To Extract A Sequence From A Big (6Gb) Multifasta File ?
Extract sequence with header from a fasta file with specific ID given in another file
extract sequences from multi fasta using partial ID
Extract A Group Of Fasta Sequences From A File
Extract Sequence From Fasta File Using Ids From A Separate File
....
@OP: I think you tried too many approaches. Please post a few entries from the ID list and the matching records from the FASTA file. There are easy and established methods for what you are trying to do. It is as simple as:
seqtk subseq <input.fa> <ids.list>
.output:
input: