5.0 years ago
shelley.w.peterson
I have a list of sequence names (A1, A2, A3, A1, A1, A2 etc) and a fasta file with the names and sequences, and I am trying to find a way to replace each item on the list with the corresponding sequence from the fasta file.
I've used:
test <- sequences[names(sequences) %in% list]
which just extracts A1, A2, A3 and doesn't give me the remaining ones. What am I missing?
list of sequence names:
A1
A2
A3
A1
A1
A2
fasta file:
>A1
ATCATC
>A2
CCCGGG
>A3
GTGTGT
>A4
TCTATC
>A5
ATCTAC
output:
>A1
ATCATC
>A2
CCCGGG
>A3
GTGTGT
Desired output:
ATCATC
CCCGGG
GTGTGT
ATCATC
ATCATC
CCCGGG
Please give representative in/output.
list of sequence names:
fasta file:
output:
Desired output:
Is there an instruction segment for how to properly format a post? I was proud enough of myself for thinking to put in "< br >" when pressing the enter button didn't work. I'm a biologist not a programmer, so I don't know these things.
Apologies, we do not have a manual for the formatting bar yet. You did a great job with the <br> tags, but the formatting bar is your toolbelt for most tasks.
Try dedicated fasta/fastq manipulation tools such as seqtk or seqkit, @shelley.w.peterson. R code as follows:
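A minimal base-R sketch of one way to do this (not necessarily the code originally posted in this reply). The reason `names(sequences) %in% list` falls short is that it returns one TRUE/FALSE per FASTA record, so each record is kept at most once and in file order. Looking up each list entry with `match()` instead gives one position per entry, which preserves both the order and the repeats. The names `read_fasta_lines`, `fasta_lines`, `ids`, `idx` and `result` are made up for illustration, and the data is the example from the question typed inline.

```r
# Example data from the question, inlined so the sketch is self-contained.
fasta_lines <- c(">A1", "ATCATC", ">A2", "CCCGGG", ">A3", "GTGTGT",
                 ">A4", "TCTATC", ">A5", "ATCTAC")
ids <- c("A1", "A2", "A3", "A1", "A1", "A2")

# Hypothetical helper: turn FASTA lines into a named character vector.
# Assumes every header is followed by at least one sequence line.
read_fasta_lines <- function(lines) {
  lines <- lines[nzchar(lines)]               # drop empty lines
  is_header <- startsWith(lines, ">")
  record <- cumsum(is_header)                 # record number for every line
  # Join the (possibly multi-line) sequence of each record into one string.
  seqs <- vapply(split(lines[!is_header], record[!is_header]),
                 paste, character(1), collapse = "")
  # Use the first word after ">" as the sequence name.
  names(seqs) <- sub("^>([^[:space:]]+).*$", "\\1", lines[is_header])
  seqs
}

sequences <- read_fasta_lines(fasta_lines)

# One position per requested name (NA if a name is not in the FASTA file).
idx <- match(ids, names(sequences))
result <- unname(sequences[idx])

writeLines(result)
```

With a real file, `fasta_lines` could come from `readLines()` on the FASTA path and `ids` from `readLines()` on the list file. Calling the ID vector `ids` rather than `list` also avoids masking R's built-in `list()` function.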
Thanks so much!!!! As someone who is new to coding, sometimes it's hard to figure out if I'm using the wrong tool/command or if I'm using the correct one the wrong way -_-'
I started that way and learnt on the way. Keep visiting biostars @ shelley.w.peterson